---
title: "Privacy fails through data aggregation"
date: 2016-10-13T10:56:00+06:00
draft: false
tags: ["data privacy","surveillance","three-letter-agencies"]
author: "Olivier Falcoz"
hidemeta: false
ShowReadingTime: true
ShowPostNavLinks: true
showtoc: false
cover:
    image: "/images/"
    alt: "<alt text>"
    caption: "<text>"
---
“Aggregating”, or combining data from multiple sources, can reveal surprisingly specific information. You might not work for the Pentagon, but your data can be aggregated in the same way to [de-anonymize](https://en.wikipedia.org/wiki/De-anonymization) you. Here's a small collection of these surprising privacy failures:
* The Classic: The paper [Simple Demographics Often Identify People Uniquely](http://dataprivacylab.org/projects/identifiability/paper1.pdf) shows that knowing just birth date, gender, and ZIP code is enough to uniquely identify most people (87% of the U.S. population, by the paper's estimate).
* Netflix Debacle: An *anonymous* Netflix dataset was [de-anonymized by correlating it with the IMDb](https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf) database.
* Social Exposure: [De-anonymizing social networks](https://www.cs.utexas.edu/~shmat/shmat_oak09.pdf) (by Arvind Narayanan and Vitaly Shmatikov) demonstrates how an *anonymous* Twitter graph can be re-identified using Flickr as auxiliary information.
* Your Words Betray You: Your choice of words in writing [can be analyzed](http://33bits.org/2012/02/20/is-writing-style-sufficient-to-deanonymize-material-posted-online/) to uniquely identify you, according to [On the Feasibility of Internet-Scale Author Identification](http://randomwalker.info/publications/author-identification-draft.pdf).
* Location, Location, Location: The traces left by your GPS location app, and even your approximate location, are highly distinctive. [Unique in the Crowd: The privacy bounds of human mobility](http://www.nature.com/articles/srep01376) finds that four spatio-temporal points are enough to uniquely identify 95% of the individuals in its dataset.
* Bitcoin: It is often thought of as an anonymous currency, but it's [surprisingly non-anonymous](https://coincenter.org/2015/01/anonymous-bitcoin/), considering its reputation. This is because a lot of information is contained in the *public ledger* that records all transactions. See also [An Analysis of Anonymity in the Bitcoin System](http://arxiv.org/pdf/1107.4524).

Source: [Tozny Blog](https://tozny.com/blog/10-unnerving-privacy-fails-thru-data-aggregation/)
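
To make the aggregation idea from the list above concrete, here is a minimal Python sketch of a linkage attack on the quasi-identifiers from the first item (birth date, gender, ZIP code). It is not taken from any of the cited papers. The records, the field names, and the helper functions `unique_fraction` and `link` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical "anonymized" release: names removed, quasi-identifiers kept.
medical = [
    {"birth_date": "1975-03-02", "gender": "F", "zip": "02138", "diagnosis": "asthma"},
    {"birth_date": "1975-03-02", "gender": "M", "zip": "02138", "diagnosis": "diabetes"},
    {"birth_date": "1982-11-19", "gender": "F", "zip": "02139", "diagnosis": "migraine"},
    {"birth_date": "1982-11-19", "gender": "F", "zip": "02139", "diagnosis": "flu"},
]

# Hypothetical public list (for example a voter roll) with names attached.
public = [
    {"name": "Alice Example", "birth_date": "1975-03-02", "gender": "F", "zip": "02138"},
    {"name": "Bob Example", "birth_date": "1975-03-02", "gender": "M", "zip": "02138"},
    {"name": "Carol Example", "birth_date": "1982-11-19", "gender": "F", "zip": "02139"},
]

QUASI_IDENTIFIERS = ("birth_date", "gender", "zip")


def key(record):
    """Return the quasi-identifier tuple for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def unique_fraction(records):
    """Fraction of records whose quasi-identifier tuple appears exactly once."""
    counts = Counter(key(r) for r in records)
    return sum(1 for r in records if counts[key(r)] == 1) / len(records)


def link(anonymous, named):
    """Join the two datasets on the quasi-identifiers.

    A record is re-identified only when its tuple is unique in the
    anonymous data and matches exactly one named person.
    """
    counts = Counter(key(r) for r in anonymous)
    names = {}
    for person in named:
        names.setdefault(key(person), []).append(person["name"])
    matches = []
    for r in anonymous:
        candidates = names.get(key(r), [])
        if counts[key(r)] == 1 and len(candidates) == 1:
            matches.append((candidates[0], r["diagnosis"]))
    return matches
```

With this toy data, `unique_fraction(medical)` is `0.5`, and `link(medical, public)` re-identifies the first two records (Alice and Bob). The last two records share the same tuple, so they stay ambiguous. That is the intuition behind grouping or generalizing quasi-identifiers before releasing data.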