To clarify before continuing, this latency is not a problem for video playback. When working as intended, any modern platform compensates for it by starting the video a little later than the audio; in theory, the video and audio streams would stay perfectly synchronised even if the Bluetooth latency were catastrophically bad.
The sounds for which this is a problem are the unpredictable ones. User-initiated sounds, which the platform has no way to know are coming, are the most commonly encountered example. Keyboard clicks and other UI sounds, accessibility features like VoiceOver, and game sound effects should all be familiar as sounds that often get mangled by audio latency. These sounds are a delight in a low latency context (for example, via the device speaker or wired earphones) but start to get clumsy and obstructive when delays are added. Obviously, when these sounds are nice-to-haves they can just be disabled (as I've done on everything with which I use wireless earphones), but when they're vital accessibility features, wireless earphones can substantially degrade the user experience.
My previous examination of this topic investigated the latencies of a few Bluetooth audio devices I had at the time. AirPods hadn't yet been released, but a later investigation of them found disappointing results. Now, three years and a couple of device generations later, I thought it would be good to see if things have improved on this front.
I also thought it would be good to give a quick overview of my testing methodology, which is fundamentally pretty straightforward. The core measurement I'm after is clear: how much time passes between a user triggering a sound and hearing it? So, if I can record the exact times of these two events, I've got my figures. To do this, I place the Bluetooth device as close as possible to a shotgun microphone, and adjust its volume so it can be heard clearly by my audio recorder. I then pair the device being tested to a 2018 iPad Pro running iOS 13, and use an Apple Pencil to tap the screen and trigger sound events. The tap from the Pencil and the subsequent sound each produce a clear spike in the waveform of the recorded audio. By measuring the time between these spikes, I can establish a pretty accurate measurement of the audio latency.
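For anyone wanting to automate the spike-to-spike measurement, here is a minimal sketch in Python. It is not the procedure used for the figures in this post. It assumes the recording is already loaded as a list of samples normalised to the range -1 to 1, and that spikes strictly alternate between tap and sound. The function names, the threshold and the 50ms "refractory" window are all illustrative assumptions.

```python
def find_onsets(samples, sample_rate, threshold=0.5, refractory_s=0.05):
    """Return the times (in seconds) at which |sample| crosses the threshold.

    After each detection, further crossings are ignored for refractory_s
    seconds so that one spike is not counted several times.
    """
    onsets = []
    skip_until = 0
    refractory = int(refractory_s * sample_rate)
    for i, s in enumerate(samples):
        if i < skip_until:
            continue
        if abs(s) >= threshold:
            onsets.append(i / sample_rate)
            skip_until = i + refractory
    return onsets


def latencies_ms(onsets):
    """Pair onsets as (tap, sound), (tap, sound), ... and return each gap in ms."""
    taps, sounds = onsets[0::2], onsets[1::2]
    return [(sound - tap) * 1000.0 for tap, sound in zip(taps, sounds)]


# Synthetic one-second "recording" at 1 kHz with two tap/sound pairs.
# The numbers are made up purely to exercise the functions.
sample_rate = 1000
recording = [0.0] * sample_rate
for tap_index in (100, 600):
    recording[tap_index] = 1.0          # Pencil tap
    recording[tap_index + 144] = 0.8    # sound arriving 144 samples later

onsets = find_onsets(recording, sample_rate)
gaps = latencies_ms(onsets)  # one latency per tap, in milliseconds
```

A real recording would need a threshold tuned to the noise floor. The refractory window also has to stay shorter than the smallest latency being measured, otherwise the sound spike is swallowed along with the tap.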
I use two pieces of software to play the audio I'm measuring. First, the default iOS Keyboard, as it's probably the most common place people encounter this issue. Second, a game I developed called Tapt. This is a good benchmark because I've written the game with a specific need for low audio latency, and I'm familiar with its technical underpinnings. One of the key differences between it and the iOS Keyboard is that Tapt starts and keeps up an 'audio session' continuously, so the Bluetooth device is always primed and ready to play a sound as quickly as possible. The iOS Keyboard does not do this, which explains why the first sound it produces takes much longer to appear than the later ones; the audio engine needs to wake up first.
With each combination of device and software, I trigger ten sounds in a row, using a metronome set to 90bpm to keep things consistent. For the keyboard run, I discard the first tap for the reasons given above (in a context where latency really matters, developers can easily do what I've done with Tapt). Finally, averaging the remaining 19 measurements for each device (ten from Tapt and nine from the keyboard) gives the latency figures which I quote below. While I don't have my original AirPods to re-test, this methodology is pretty much unchanged from before, so I'm going to include those results too. You can take a look at the entirety of my data here.
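The averaging step for a single device can be sketched as below. The function name and the sample numbers are hypothetical and are not taken from my data; the only things carried over from the description above are dropping the first keyboard tap and averaging what remains.

```python
from statistics import mean


def average_latency_ms(tapt_ms, keyboard_ms):
    """Average one device's measurements, skipping the first keyboard tap.

    The first keyboard sound is dropped because the audio engine has to
    wake up before it can play. Returns (mean latency in ms, number of
    measurements kept).
    """
    kept = list(tapt_ms) + list(keyboard_ms[1:])
    return mean(kept), len(kept)


# Made-up measurements for one device: ten Tapt taps and ten keyboard taps,
# with a slow first keyboard tap standing in for the wake-up delay.
tapt_run = [150] * 10
keyboard_run = [400] + [140] * 9

average, count = average_latency_ms(tapt_run, keyboard_run)  # count is 19
```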
Of course, it should be noted that this measurement isn't just the audio latency; some of the delay comes from input latency, and more still from the software that runs in between that and the eventual output. It doesn't really matter as far as the end-user is concerned; latency is latency wherever it comes from. Still, to give a baseline, I've included a measurement of the on-device speakers in the chart, and the yellow line demarcating that threshold extends across the whole chart.
Looking to the AirPods first, there's a very encouraging trend occurring. They drop from 274ms to 178ms going from the first to second generation, and the AirPods Pro take it down even further, to 144ms. While a 130ms reduction may not seem like a lot, the perceptual difference from this makes the AirPods Pro tantalisingly close to seamless.
Keyboard clicks are near enough to their corresponding keypresses that they feel like they're actually related to them, not just the cacophony of blips they had seemed before. Tapt is playable, but only just; there's still additional cognitive load caused by the delay, which I'm sure affects other rhythm-based games equally, and risks upsetting the playability of games that rely heavily on audio cues. However, it's a lot better, and it looks like things are heading very much in the right direction.
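To make the arithmetic behind the AirPods figures explicit, here is a small Python sketch using only the three averages quoted above (274ms, 178ms and 144ms). The variable names are just for illustration.

```python
# Average latencies quoted in this post, in milliseconds.
latency_ms = {
    "AirPods (1st gen)": 274,
    "AirPods (2nd gen)": 178,
    "AirPods Pro": 144,
}

values = list(latency_ms.values())

# Drop between each consecutive generation: 96ms, then 34ms.
step_drops = [older - newer for older, newer in zip(values, values[1:])]

# Overall drop from the first generation to the Pro: 130ms.
total_drop = values[0] - values[-1]

# The same drop as a fraction of the original latency (a little under half).
relative_drop = total_drop / values[0]
```

Most of the improvement came in the first step, so a straight-line extrapolation from these three points would probably be too optimistic.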
The results for the Beats Studio 3 headphones aren't too surprising. They use the same W1 chip as the original AirPods, with similar results. The Sony WH-CH700N (catchy name...) is a little better, but those results are there more for context than anything else; they're essentially representative of the general state of Bluetooth headphones currently (I also tested a third-gen Amazon Echo, and a JBL Bluetooth speaker, which were broadly similar to the Sony and Beats).
If it's possible for the trend line to continue in the same direction, the next generation or two of AirPods will be very exciting. Not being a VoiceOver user, I'm unsure how much AirPods Pro improve its user experience in real terms, but I think this general trend can only be for the good. Similarly, for mobile gaming and general user experience, this trend means that what is, in my opinion, the primary downside of Bluetooth earphones may be gradually disappearing.
Another interesting area for which audio latency is a huge problem is that of content creation, particularly audio and video creation and editing. I hadn't mentioned this before because I suspect vanishingly few pros even consider Bluetooth earphones an option, but it's possible that a few more generations of continued latency reduction may start to make this feasible. Their status as the lowest latency Bluetooth earphones notwithstanding, the AirPods Pro make for a deeply unsettling experience when used as monitors to play piano in Logic Pro; there's still far too much delay for this to be comfortable (and I'm not alone in thinking so). They are, however, just about usable when editing music or video, and shaving a few dozen more milliseconds off this each generation would quickly make them a preferable option over wired earphones.
All in all, I was very pleasantly surprised to see the level of improvement in AirPods. While I'm aware Apple has touted the latency improvements of newer AirPods, I had retained a certain skepticism over how much better they would actually be. Knowing very little about the underlying technology, I was also worried whether significant improvements would actually be possible within the Bluetooth standard, but this seems to at least somewhat allay my fears (though I'd welcome some replication of my tests with non-iOS devices to see if any special Apple magic is happening).