Trackers are Not Created Equal
By Julia Kieserman
Imagine you are someone who struggles with addiction, or the loved one of someone who does, and are seeking confidential help. You find a promising resource online and fill out an interest form, which includes your email address, so you can be contacted. In the subsequent days, you find yourself getting advertisements for addiction products and services on social media. Now imagine a few years pass and, after hard work in treatment and a successful recovery, you are turned down for a government job because of your history with addiction. How did this happen?
Trackers may be silently lurking in the background of websites, sending your personal information to tech companies, who then use this data for targeted advertising and to personalize experiences on their platforms. This type of data collection isn’t limited to addiction resource websites. Big tech companies are collecting increasingly detailed information about us across all kinds of websites, including those that handle especially sensitive and legally protected data, like financial aid applications, doctor appointments, and tax filings.
Here’s how it works. Companies that collect information for a legitimate purpose, like financial aid or health service providers, share that data with big tech companies through a technology known as a tracking pixel. Two of the most popular pixels are owned by Meta and Google. This data enables Meta and Google to track specific individuals across different browsers and devices, a practice known as cross-context tracking. Additional data points make advertising more personalized and therefore more lucrative, both to Meta and Google in their role as advertising platforms and to the companies that use their services to target new potential customers or track the behavior of existing ones. Tech companies can also use this data to create more personalized experiences on their own platforms, which may make them even more effective at keeping people active and engaged with their products.
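To make the mechanism concrete, here is a minimal sketch of what a tracking pixel does under the hood. The endpoint, parameter names, and function names below are invented for illustration; Meta’s and Google’s actual pixels use their own scripts and formats.

```typescript
// Illustrative sketch of a tracking pixel's core trick. The endpoint
// (tracker.example) and parameter names are hypothetical.

async function sha256Hex(value: string): Promise<string> {
  // Browsers expose SHA-256 through the Web Crypto API.
  const bytes = new TextEncoder().encode(value.trim().toLowerCase());
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function reportToTracker(email: string): Promise<void> {
  // Hash the identifier, then send it out as a 1x1 image request -
  // the "pixel" that gives the technique its name.
  const hashed = await sha256Hex(email);
  const img = new Image(1, 1);
  img.src =
    "https://tracker.example/collect" +
    `?em=${hashed}&page=${encodeURIComponent(location.href)}`;
}
```

The hash is what lets the receiving company match the visitor to an existing account, a point we return to below.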
To facilitate this data exchange, website administrators, who may be employees of the company that owns the website or an external consulting service, must install tracking pixels and explicitly configure them to collect personally identifiable information (PII). Website administrators are responsible for configuring tracking pixels in a way that upholds the legal protections detailed by HIPAA (for health data) and the Gramm-Leach-Bliley Act (for financial data), where applicable. However, the tech companies that build and maintain these pixels - in this case Meta and Google - also have a responsibility to ensure their tools are easy to understand and to configure in a legally compliant way.
In our paper, “Tracker Installations Are Not Created Equal,” we investigated the resources that Meta and Google provide to website administrators to help them understand and use tracking pixels appropriately. We also measured how many popular websites, with a particular focus on health and finance websites, have actually configured tracking pixels to collect personal data: we injected a form with fabricated data on each website and then monitored network traffic to detect whether our fabricated data was being sent to Meta or Google.
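The sketch below captures the idea of that measurement in miniature, using the Playwright browser-automation library. The fabricated email, tracker domains, and form handling here are simplified stand-ins, not a reproduction of the paper’s actual harness.

```typescript
import { createHash } from "node:crypto";
import { chromium } from "playwright";

// Fill a form with fabricated data, then watch outgoing traffic for
// that data (plain or SHA-256 hashed) headed to tracker domains.
const FAKE_EMAIL = "probe-12345@example.com";
const FAKE_HASH = createHash("sha256").update(FAKE_EMAIL).digest("hex");
const TRACKER_HOSTS = ["facebook.com", "google-analytics.com"];

async function probeSite(url: string): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  page.on("request", (request) => {
    const hostname = new URL(request.url()).hostname;
    if (TRACKER_HOSTS.some((host) => hostname.endsWith(host))) {
      const payload = request.url() + (request.postData() ?? "");
      if (payload.includes(FAKE_EMAIL) || payload.includes(FAKE_HASH)) {
        console.log(`Fabricated PII sent to ${hostname} from ${url}`);
      }
    }
  });

  await page.goto(url);
  // Fill the first email field we find and submit; real sites demand
  // far more careful form discovery than this.
  await page.fill('input[type="email"]', FAKE_EMAIL);
  await page.keyboard.press("Enter");
  await page.waitForTimeout(3000);
  await browser.close();
}
```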
Meta’s and Google’s documentation and interfaces use language and behavior that, like dark patterns, may coerce or trick website administrators into making less private choices. For example, both companies incorrectly state that hashing, a one-way cryptographic function that tracking pixels use to obfuscate data, is sufficient for privacy. This notion has been explicitly debunked twice by the U.S. Federal Trade Commission (FTC), first in 2012 and again in 2024, as well as by other privacy researchers. However, Meta and Google remain incentivized to collect data in this fashion, as it allows them to identify individuals by matching the hash of collected identifiers against hashes of their own existing account data. The FTC’s stance against hashing will likely need to be codified and enforced to have a meaningful impact on both the documentation and the actual behavior of Meta and Google tracking pixels.
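A few lines of code show why hashing fails as an anonymization technique: hashes are deterministic, so anyone who already holds a list of identifiers can re-hash that list and match.

```typescript
import { createHash } from "node:crypto";

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// A tracker receives this "anonymized" value from some website...
const received = sha256("jane.doe@example.com");

// ...but a company that already holds millions of account emails can
// simply re-hash its own list and compare. The hash hides nothing.
const knownAccounts = ["john.smith@example.com", "jane.doe@example.com"];
const match = knownAccounts.find((email) => sha256(email) === received);

console.log(match); // "jane.doe@example.com"
```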
As another example, consider Meta’s user interface, which website administrators use to install and configure the tracking pixel. When a website administrator toggles data collection on for a tracking pixel (referred to by Meta as “automatic advanced matching”), the interface automatically selects the maximum number of personal identifiers it can collect. If the website administrator wants to send only some identifiers or a single identifier, say a website visitor’s country, they must manually deselect every other data type. This is an example of the least private defaults pattern, which shares many characteristics with the bad defaults dark pattern and relies on people’s natural tendency to leave configurations unmodified.
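Expressed as configuration, the pattern looks roughly like the sketch below; the field names are hypothetical, not Meta’s actual settings schema.

```typescript
// Hypothetical settings object: toggling the feature on behaves as if
// every identifier defaults to enabled.
const advancedMatchingDefaults = {
  email: true,
  phone: true,
  firstName: true,
  lastName: true,
  city: true,
  state: true,
  zip: true,
  country: true,
};

// An administrator who wants only the visitor's country must flip off
// every other identifier by hand; left alone, the defaults stand.
const countryOnly = {
  ...advancedMatchingDefaults,
  email: false,
  phone: false,
  firstName: false,
  lastName: false,
  city: false,
  state: false,
  zip: false,
};
```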
Our research suggests that these design decisions may shape the configuration choices website administrators make. We found that 62% of websites using Meta’s tracking pixel collected personal data. By comparison, among websites using Google’s tracking pixel, which has no equivalent auto-selection in its interface, only 11% collected personal data.
Beyond the documentation and interface, we evaluated the technical behavior of each tracking pixel and observed something concerning in the way Meta’s tracking pixel collects data. Typically, a tracking pixel listens on a page for someone to submit a form - like a login, order checkout, or email subscription form - and collects the data entered in the form fields. It relies on the structure of the website to differentiate a form-submission click from other behaviors on a webpage, like clicking the back button or a “show more” button. However, the Meta Pixel is at times unable to make this distinction, so it may collect data entered into a form even if that form is never submitted.
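The sketch below contrasts the two behaviors (illustrative code, not Meta’s actual source): a pixel hooked to a form’s submit event fires only on real submissions, while one listening for clicks anywhere on the page can scoop up values from forms that are never sent.

```typescript
// Conservative approach: harvest fields only when a form actually submits.
document.querySelectorAll("form").forEach((form) => {
  form.addEventListener("submit", () => {
    const fields = Object.fromEntries(new FormData(form));
    console.log("send on real submission:", fields);
  });
});

// Sketch of the failure mode described above: treating any button click
// as a possible submission. A click on a "show more" or back button near
// a half-filled form then leaks its contents even though the visitor
// never submitted anything.
document.addEventListener("click", (event) => {
  const target = event.target as HTMLElement;
  if (target.closest("button")) {
    document.querySelectorAll("input").forEach((input) => {
      console.log("collected without submission:", input.name, input.value);
    });
  }
});
```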
While both Meta and Google deploy technical restrictions on websites that self-identify as belonging to the health or finance category, we found that many websites that clearly belong to those categories identified themselves otherwise and thus bypassed the restrictions. This issue was more significant for Google than for Meta: nearly the same percentage of websites using Google’s tracking pixel collected personal data in the health and finance categories as in other categories, suggesting that Google’s technical restrictions had little impact. There needs to be more accountability - both for websites and for the big tech companies that document the significance of category identification - to ensure websites categorize themselves correctly so that the technical restrictions work as intended.
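In practice, the restriction amounts to an honor system, roughly like this hypothetical sketch (our names and logic, not either company’s actual code):

```typescript
// The PII filter keys off a category the website declares about itself,
// so a health site that labels itself "general" sails past it.
const RESTRICTED_CATEGORIES = new Set(["health", "finance"]);

function filterIdentifiers(
  declaredCategory: string,
  identifiers: Record<string, string>,
): Record<string, string> {
  if (RESTRICTED_CATEGORIES.has(declaredCategory)) {
    return {}; // strip personal identifiers for sensitive categories
  }
  return identifiers; // misdeclared sites pass through untouched
}

// Declared honestly, identifiers are dropped; declared dishonestly,
// they flow to the tracker:
filterIdentifiers("health", { em: "hashed-email" }); // {}
filterIdentifiers("general", { em: "hashed-email" }); // { em: "hashed-email" }
```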
While health and finance data are protected by regulation in the United States, those regulations are insufficient to keep this data, as well as other potentially sensitive information, out of the hands of the tech companies behind these tracking pixels. More stringent enforcement of HIPAA and the Gramm-Leach-Bliley Act will be necessary to ensure that Meta and Google are not silently hovering in the background while we perform essential tasks, like booking appointments, seeking confidential services, or filing taxes online. State Attorneys General should consider using their consumer protection authorities to ensure that trackers do not continue to facilitate the erosion of our health and financial privacy.
Since the FTC has already published guidance that hashing is insufficient to protect user privacy, the agency could consider a clarifying order addressing tracker configurations that run contrary to that guidance. Further, the misleading configuration prompts we saw in our study highlight the importance of protecting the public from dark and less-private patterns. Given that Meta and Google trackers are used on a significant number of websites in the U.S., including health and financial sites, the FTC should consider using its authority to protect consumers against this behavior, whether as an unfair method of competition or as a deceptive act or practice.