Tracking on the Internet
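When the browser requests a third-party resource embedded on a web page, the HTTP headers it sends to the third party include any cookie that third party set earlier and the referrer, i.e. the URL of the page that embeds the resource.
The combination of the cookie and the referrer makes third-party tracking possible: the cookie identifies the browser, and the referrer reveals the page being visited. Incidentally, the HTTP protocol specification misspells ‘referrer’ as ‘Referer’.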
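To make the role of the two headers concrete, here is a minimal sketch of what a third party could do with such a request. The host names, the cookie name `id`, and the helper `record_visit` are all hypothetical and chosen only for illustration; they are not taken from any real tracker.

```python
from collections import defaultdict

# Hypothetical request for a resource hosted by a third party (tracker.example)
# and embedded on a first-party page. All values are illustrative.
RAW_REQUEST = (
    "GET /pixel.gif HTTP/1.1\r\n"
    "Host: tracker.example\r\n"
    "Referer: https://news.example/politics/article-123\r\n"
    "Cookie: id=abc123\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n"
)

def parse_headers(raw: str) -> dict[str, str]:
    """Return the request headers as a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def parse_cookie(cookie_header: str) -> dict[str, str]:
    """Split a Cookie header value into name/value pairs."""
    cookies = {}
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name:
            cookies[name] = value
    return cookies

# Browsing profile per cookie ID: the list of pages on which the resource loaded.
profiles: dict[str, list[str]] = defaultdict(list)

def record_visit(raw: str) -> None:
    """Link the cookie ID (who) to the referrer (which page) for one request."""
    headers = parse_headers(raw)
    user_id = parse_cookie(headers.get("cookie", "")).get("id")
    page = headers.get("referer")
    if user_id and page:
        profiles[user_id].append(page)

record_visit(RAW_REQUEST)
```

Each page that embeds the same third-party resource adds another entry to the profile stored under the same cookie ID.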
Third-party online tracking: sites other than the one you’re visiting (typically invisible to you) compile profiles of your browsing history.
Behavioral targeting: profiles based on a user’s past activity help ad exchanges serve targeted ads through real-time ad auctions.
Web Tracking Methods
Placing data in the browser
Options: HTTP cookies, HTTP auth, HTTP ETags, content cache, IE userData, HTML5 protocol & content handlers, HTML5 storage, Flash cookies, Silverlight storage, TLS session ID & resume, browsing history, window.name, HTTP STS, DNS cache, …
Numerous web APIs allow placing data in the browser (directly or indirectly) and all of these can be used for uniquely identifying the user and hence tracking their browsing.
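As one illustration of an indirect storage mechanism, the sketch below simulates how a server could misuse HTTP ETags as an identifier. The function `handle_request` and the choice of a random hex token are assumptions made for this example; it models only the header exchange and is not a real web server.

```python
import secrets

def handle_request(request_headers: dict[str, str]) -> tuple[int, dict[str, str], str]:
    """Simulate a tracker's response; returns (status, response headers, visitor ID)."""
    etag = request_headers.get("If-None-Match")
    if etag:
        # The browser is revalidating its cached copy and echoes the ETag back,
        # so the tag itself works as a persistent identifier.
        return 304, {"ETag": etag}, etag.strip('"')
    # First visit: mint a fresh ID and hand it out disguised as a cache validator.
    visitor_id = secrets.token_hex(8)
    response_headers = {
        "ETag": f'"{visitor_id}"',
        "Cache-Control": "no-cache",  # ask the browser to revalidate on every use
    }
    return 200, response_headers, visitor_id

# First request carries no validator; the second echoes the stored ETag.
status1, headers1, id1 = handle_request({})
status2, headers2, id2 = handle_request({"If-None-Match": headers1["ETag"]})
```

Clearing cookies does not remove the cached resource, so the same ID comes back on the next request unless the cache is cleared as well.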
Fingerprinting
This involves observing the browser’s behavior and configuration, e.g. user agent, browser plugins, clock skew, list of installed fonts, whether cookies are enabled, browser add-ons, screen resolution, …
When these attributes are combined, different devices/browsers are likely to have different fingerprints. Fingerprinting stores nothing on the device, so it leaves no trace that the user is being tracked. Unlike cookies, users can’t see or control fingerprinting.
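A minimal sketch of how observed attributes could be combined into a single identifier follows. The attribute names and values are invented for illustration, and hashing an exact canonical form is a simplifying assumption: a practical fingerprinter would also need to tolerate attributes that change over time.

```python
import hashlib
import json

def fingerprint(attributes: dict[str, object]) -> str:
    """Hash a canonical (key-sorted) JSON encoding of the observed attributes."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attribute sets for two browsers that differ only in screen size.
device_a = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "screen": "1920x1080",
    "fonts": ["Arial", "Courier New", "Times New Roman"],
    "cookies_enabled": True,
    "timezone_offset_minutes": -60,
}
device_b = dict(device_a, screen="2560x1440")

fp_a = fingerprint(device_a)
fp_b = fingerprint(device_b)
```

No single attribute is unique, but the combination can be, which is why changing one value (here the screen resolution) yields a different fingerprint.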
Cross-device tracking
Two devices can be linked to the same user if:
- User logs in with the same credentials from both devices
- User visits the same/similar set of websites on both devices
- User travels with two portable devices
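The second signal in the list above (similar sets of visited websites) can be sketched with a set-overlap measure. Jaccard similarity and the 0.5 threshold are assumptions chosen for this example, as are the site names; the source does not say which measure trackers actually use.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def likely_same_user(sites_a: set[str], sites_b: set[str], threshold: float = 0.5) -> bool:
    """Guess that two devices share a user when their browsing overlap is high."""
    return jaccard(sites_a, sites_b) >= threshold

# Hypothetical browsing histories observed by a third party.
phone = {"news.example", "mail.example", "bank.example", "forum.example"}
laptop = {"news.example", "mail.example", "bank.example", "shop.example"}
other = {"games.example", "video.example", "news.example"}

same = likely_same_user(phone, laptop)      # 3 shared sites out of 5 distinct
different = likely_same_user(phone, other)  # 1 shared site out of 6 distinct
```

A login with the same credentials gives a deterministic link; overlap measures like this one only give a probabilistic link.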
Beacon-based tracking
- A TV ad emits an inaudible ultrasound signal that encodes a unique ID in binary
- An app on the viewer’s smartphone listens in the background
- When the app detects the ultrasound ID, it reports that the ad has been watched
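The encode and detect steps above can be sketched as a simple frequency-shift-keying scheme. This is not SilverPush's actual protocol, which the source does not describe: the two tone frequencies, the bit duration, and the 16-bit ID width are all assumptions, and the signal here is noise-free.

```python
import math

SAMPLE_RATE = 44_100   # samples per second
SAMPLES_PER_BIT = 441  # 10 ms per bit
FREQ_ZERO = 18_000.0   # Hz, assumed tone for a 0 bit
FREQ_ONE = 19_000.0    # Hz, assumed tone for a 1 bit

def encode(ad_id: int, n_bits: int = 16) -> list[float]:
    """Emit one tone per bit, most significant bit first."""
    samples = []
    for i in range(n_bits - 1, -1, -1):
        freq = FREQ_ONE if (ad_id >> i) & 1 else FREQ_ZERO
        for n in range(SAMPLES_PER_BIT):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples

def goertzel_power(block: list[float], freq: float) -> float:
    """Signal power at one frequency, computed with the Goertzel algorithm."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def decode(samples: list[float], n_bits: int = 16) -> int:
    """Recover the ID by comparing the power of the two tones in each bit slot."""
    value = 0
    for b in range(n_bits):
        block = samples[b * SAMPLES_PER_BIT:(b + 1) * SAMPLES_PER_BIT]
        bit = goertzel_power(block, FREQ_ONE) > goertzel_power(block, FREQ_ZERO)
        value = (value << 1) | int(bit)
    return value

recovered = decode(encode(0xBEEF))
```

A listening app would run the decoding step on microphone input and report the recovered ID; handling noise, synchronization, and echo is omitted here.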
Beacon-based tracking is not as widespread as fingerprinting and cross-device tracking. It was implemented by SilverPush and incorporated in a small number of apps.
In recent versions of iOS/Android, apps can’t record audio in the background without user awareness/consent.
Merging online and offline databases
Scenario: retailer wants to target shoppers with ads when they browse online
- Consumer shops at retail store, provides email address
- Store uploads list of consumer IDs to online advertiser
- Consumer logs in to (say) news website using email address
- Third-party tracker links the user to retail DB
- Ads are served via a cookie that follows the user around
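The linking step in this scenario can be sketched as a join on the email address. Treating the "consumer IDs" as SHA-256 hashes of normalized email addresses is an assumption made for this example, as are the names `email_key`, `retail_db`, and `link_login` and the sample records.

```python
import hashlib

def email_key(email: str) -> str:
    """Normalize an email address and hash it so both sides derive the same key."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical list uploaded by the store: hashed email -> offline purchase segment.
retail_db = {
    email_key("Alice@Example.com"): {"segment": "outdoor gear"},
}

def link_login(login_email: str, cookie_id: str, cookie_segments: dict[str, str]) -> bool:
    """On a login seen by the tracker, attach the retail segment to the browser cookie."""
    record = retail_db.get(email_key(login_email))
    if record is None:
        return False
    cookie_segments[cookie_id] = record["segment"]
    return True

# The consumer logs in to a news site; the tracker's cookie for that browser is abc123.
segments: dict[str, str] = {}
matched = link_login(" alice@example.com ", "abc123", segments)
unmatched = link_login("bob@example.com", "zzz999", segments)
```

Once the segment is attached to the cookie, ads can be targeted on any site where the tracker sees that cookie, without the email address being sent again.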
Machine Learning and Inference
Using Facebook likes, one can meaningfully predict sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age and gender.
Third-Party Web Tracking: Policy and Technology. Mayer, Jonathan; Mitchell, John. http://cyberlaw.stanford.edu/files/publication/files/trackingsurvey12.pdf . 2012.
Display LUMAscape. LUMA Partners LLC. https://lumapartners.com/content/lumascapes/display-ad-tech-lumascape/ .
Pixel perfect: Fingerprinting canvas in HTML5. Mowery, Keaton; Shacham, Hovav. https://hovav.net/ucsd/dist/canvas.pdf . 2012.
Cross-Device Tracking: Measurement and Disclosures. Brookman, Justin; Rouge, Phoebe; Alva, Aaron; Yeung, Christina. https://petsymposium.org/2017/papers/issue2/paper29-2017-2-source.pdf . 2017.
How Companies Learn Your Secrets. Charles Duhigg. New York Times. https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html . https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did . Feb 16, 2012.
Private traits and attributes are predictable from digital records of human behavior. Michal Kosinski; David Stillwell; Thore Graepel. https://www.pnas.org/content/pnas/110/15/5802.full.pdf . Feb 13, 2013.
Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Yilun Wang; Michal Kosinski. https://www.semanticscholar.org/paper/Deep-Neural-Networks-Can-Detect-Sexual-Orientation-Wang-Kosinski/af30a2394c620132884bb98c78b6b9e46c791482 . 2017.
Tor users, beware: 'Scheme flooding' technique may be used to deanonymize you. Thomas Claburn. https://www.theregister.com/2021/05/14/browser_fingerprinting_flaw/ . https://it.slashdot.org/story/21/05/14/2044200/scheme-flooding-technique-may-be-used-to-deanonymize-you . May 14, 2021.
The case against behavioral advertising is stacking up. Natasha Lomas. https://techcrunch.com/2019/01/20/dont-be-creepy/ . Jan 20, 2019. Accessed Jun 20, 2021.