New research into web-tracking techniques has found some websites using audio fingerprinting to identify and track web users.
During a scan of one million websites, researchers at Princeton University found that a number of them use the AudioContext API to process an audio signal whose output reveals a distinctive browser and device combination.
“Audio signals processed on different machines or browsers may have slight differences due to hardware or software differences between the machines, while the same combination of machine and browser will produce the same output,” the researchers explain.
The method doesn’t require access to a device’s microphone; it relies instead on the way the device processes a signal. The researchers, Arvind Narayanan and Steven Englehardt, have published a test page that shows what your browser’s audio fingerprint looks like.
“Using the AudioContext API to fingerprint does not collect sound played or recorded by your machine. An AudioContext fingerprint is a property of your machine’s audio stack itself,” they note on the test page.
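To make the idea concrete, here is a minimal sketch of how a page could derive such a fingerprint. It is not the script the researchers observed: the function names, the oscillator and compressor settings, and the choice of an FNV-1a hash are all assumptions made for illustration, and it presumes a browser that supports the promise-based form of OfflineAudioContext.startRendering.

```typescript
// Illustrative only: hash a block of rendered samples into an 8-digit hex string
// using 32-bit FNV-1a over the raw bytes of the Float32Array.
function hashSamples(samples: Float32Array): string {
  const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);
  let h = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i];
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return h.toString(16).padStart(8, "0");
}

// Browser-only part: render a generated tone offline (no microphone, no audible
// playback) and hash the result. Differences in the audio stack show up as
// small differences in the rendered samples.
async function audioFingerprint(): Promise<string> {
  const g = globalThis as any;
  const Ctx = g.OfflineAudioContext ?? g.webkitOfflineAudioContext;
  if (!Ctx) throw new Error("OfflineAudioContext is not available here");

  const ctx = new Ctx(1, 44100, 44100); // 1 channel, 1 second at 44.1 kHz
  const osc = ctx.createOscillator();
  osc.type = "triangle";
  osc.frequency.value = 10000;

  const compressor = ctx.createDynamicsCompressor();
  osc.connect(compressor);
  compressor.connect(ctx.destination);
  osc.start(0);

  const rendered = await ctx.startRendering();
  return hashSamples(rendered.getChannelData(0));
}
```

Calling audioFingerprint() twice in the same browser on the same machine should return the same string, which is what makes the value usable as an identifier.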
The technique isn’t widely adopted, but it joins a number of other approaches that can be combined to track users as they browse the web.
For example, one script that they found combined a device’s current charge level, a canvas-font fingerprint and a local IP address derived from WebRTC, the framework for real-time communications between two browsers.
The researchers found that 715 of the top one million websites use WebRTC to discover users’ local IP addresses. Most of these are third-party trackers.
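A rough sketch of how a script can read addresses out of WebRTC follows. The helper names are hypothetical and the code is an illustration rather than any tracker’s actual script; it assumes a browser that exposes RTCPeerConnection. Current browsers may report an obfuscated ".local" hostname instead of the raw local address, in which case the private-address check below simply returns false.

```typescript
// An ICE candidate line has the shape:
//   candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
// Return the address field, or null if the line does not look like a candidate.
function extractCandidateAddress(candidate: string): string | null {
  const parts = candidate.trim().split(/\s+/);
  if (parts.length < 8 || parts[0].indexOf("candidate:") !== 0 || parts[6] !== "typ") {
    return null;
  }
  return parts[4];
}

// True for addresses in the private IPv4 ranges (10/8, 172.16/12, 192.168/16).
function isPrivateIPv4(addr: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(addr);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

// Browser-only part: open a peer connection with no STUN/TURN servers, create a
// data channel so that ICE gathering starts, and collect candidate addresses.
function discoverCandidateAddresses(): Promise<string[]> {
  const PC = (globalThis as any).RTCPeerConnection;
  if (!PC) return Promise.reject(new Error("RTCPeerConnection is not available here"));

  const pc = new PC({ iceServers: [] });
  const found = new Set<string>();
  pc.createDataChannel("probe");

  return new Promise<string[]>((resolve, reject) => {
    pc.onicecandidate = (e: any) => {
      if (e.candidate === null) {
        // A null candidate signals that gathering has finished.
        pc.close();
        resolve(Array.from(found));
        return;
      }
      const addr = extractCandidateAddress(e.candidate.candidate);
      if (addr) found.add(addr);
    };
    pc.createOffer()
      .then((offer: any) => pc.setLocalDescription(offer))
      .catch(reject);
  });
}
```

No permission prompt is involved at any point, which is part of why the technique is attractive to trackers.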
Another, more widely used method is canvas-font fingerprinting, which uses the HTML Canvas API to deduce the fonts available to a browser. They found 3,250 first-party sites using this technique.
Meanwhile, canvas fingerprinting was found on 14,371 sites, with scripts loaded from 400 different domains. The researchers analysed canvas fingerprinting in 2014, and note three changes since then.
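One common way to infer fonts from a canvas is to measure how wide a test string renders. If asking for a candidate font produces a different width than the fallback font alone, the candidate is probably present. The sketch below assumes that approach; the names, the test string and the single "monospace" baseline are illustrative choices, not details taken from the study.

```typescript
// A function that reports the rendered width of `text` in the given font family.
type MeasureWidth = (fontFamily: string, text: string) => number;

// Return the candidate fonts whose rendering differs from the fallback alone.
// A font that happens to match the fallback's width is missed, which is why
// real scripts usually compare against more than one baseline.
function detectFonts(
  candidates: string[],
  measure: MeasureWidth,
  text: string = "mmmmmmmmmmlli"
): string[] {
  const baseline = measure("monospace", text);
  return candidates.filter(
    (font) => measure(`"${font}", monospace`, text) !== baseline
  );
}

// Browser-only part: build a MeasureWidth backed by an off-screen canvas.
function canvasMeasure(): MeasureWidth {
  const doc = (globalThis as any).document;
  if (!doc) throw new Error("document is not available here");
  const ctx = doc.createElement("canvas").getContext("2d");
  return (fontFamily, text) => {
    ctx.font = `72px ${fontFamily}`;
    return ctx.measureText(text).width;
  };
}
```

In a browser this would be called as detectFonts(["Arial", "Calibri"], canvasMeasure()). The resulting list of fonts varies from machine to machine and so adds to a fingerprint.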
“First, the most prominent trackers have by and large stopped using it, suggesting that the public backlash following that study was effective. Second, the overall number of domains employing it has increased considerably, indicating that knowledge of the technique has spread and that more obscure trackers are less concerned about public perception. Third, the use has shifted from behavioral tracking to fraud detection, in line with the ad industry’s self-regulatory norm regarding acceptable uses of fingerprinting.”
The other key finding, which may be good news depending on your attitude to Google, Facebook and Twitter, is that the number of third-party trackers that users will encounter on a daily basis is small.
“All of the top five third parties, as well as 12 of the top 20, are Google-owned domains. In fact, Google, Facebook, and Twitter are the only third-party entities present on more than 10 percent of sites,” the researchers note.
The researchers say their data suggests the market for third-party tracking has consolidated, which contrasts with the perception that there has been an explosion in third-party trackers. That could be good news when it comes to pressuring the industry to make privacy-enhancing improvements.
“For 100 or so third parties that are prevalent on one percent or more of sites, we might expect that they are large enough entities that their behavior can be regulated by public-relations pressure and the possibility of legal or enforcement actions,” they argue.