---
title: "Privacy fails through data aggregation"
date: 2016-10-13T10:56:00+06:00
draft: false
tags: ["data privacy","surveillance","three-letter-agencies"]
author: "Olivier Falcoz"
hidemeta: false
ShowReadingTime: true
ShowPostNavLinks: true
showtoc: false
cover:
    image: "/images/"
    alt: "<alt text>"
    caption: "<text>"
---
“Aggregating”, or combining data from multiple sources, can reveal surprisingly specific information. You might not work for the Pentagon, but your data can be aggregated in the same way to [de-anonymize](https://en.wikipedia.org/wiki/De-anonymization) you. Here's a small collection of these surprising privacy failures:
* The Classic: the paper [Simple Demographics Often Identify People Uniquely](http://dataprivacylab.org/projects/identifiability/paper1.pdf) shows that knowing just birth date, gender, and ZIP code is enough to uniquely identify most people in the United States.
* Netflix Debacle: an *anonymous* Netflix dataset was [de-anonymized by correlating it with the IMDb](https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf) database.
* Social Exposure: [De-anonymizing social networks](https://www.cs.utexas.edu/~shmat/shmat_oak09.pdf) (by Arvind Narayanan and Vitaly Shmatikov) demonstrates how an *anonymous* Twitter graph can be re-identified using Flickr as auxiliary information.
* Your Words Betray You: your choice of words in writing [can be analyzed](http://33bits.org/2012/02/20/is-writing-style-sufficient-to-deanonymize-material-posted-online/) to uniquely identify you, according to [On the Feasibility of Internet-Scale Author Identification](http://randomwalker.info/publications/author-identification-draft.pdf).
* Location, Location, Location: the traces left by your GPS location app, and even your approximate location, are highly unique, as outlined in [Unique in the Crowd: The privacy bounds of human mobility](http://www.nature.com/articles/srep01376).
* Bitcoin is often thought of as an anonymous currency, but it's [surprisingly non-anonymous](https://coincenter.org/2015/01/anonymous-bitcoin/), considering its reputation. This is because a lot of information is contained in the *public ledger* that records all transactions. See also [An Analysis of Anonymity in the Bitcoin System](http://arxiv.org/pdf/1107.4524).
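
Most of the failures above follow the same pattern: a dataset with the names removed is joined to a second, named dataset on attributes both happen to share. The sketch below illustrates that idea with the birth date, gender, and ZIP code combination from the first item. All records, names, and the helper functions `key` and `link` are made up for illustration; this is not the method of any of the cited papers, only a minimal picture of a linkage by shared attributes.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
medical = [
    {"birth_date": "1975-03-02", "gender": "F", "zip": "02138", "diagnosis": "asthma"},
    {"birth_date": "1982-11-17", "gender": "M", "zip": "02139", "diagnosis": "diabetes"},
    {"birth_date": "1982-11-17", "gender": "M", "zip": "02139", "diagnosis": "migraine"},
]

# Hypothetical public list that carries names alongside the same attributes.
public_list = [
    {"name": "Alice Example", "birth_date": "1975-03-02", "gender": "F", "zip": "02138"},
    {"name": "Bob Example", "birth_date": "1982-11-17", "gender": "M", "zip": "02139"},
    {"name": "Carol Example", "birth_date": "1990-06-30", "gender": "F", "zip": "02138"},
]

QUASI_IDENTIFIERS = ("birth_date", "gender", "zip")


def key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymous, public):
    """Return (name, record) pairs whose key is unique in both datasets."""
    anon_counts = Counter(key(r) for r in anonymous)
    pub_counts = Counter(key(r) for r in public)
    # Only keep names whose attribute combination is unique in the public list.
    names = {key(r): r["name"] for r in public if pub_counts[key(r)] == 1}
    return [
        (names[key(r)], r)
        for r in anonymous
        if anon_counts[key(r)] == 1 and key(r) in names
    ]


matches = link(medical, public_list)
for name, record in matches:
    print(name, "->", record["diagnosis"])
```

Only records whose attribute combination is unique on both sides are linked, so the two records sharing one combination stay ambiguous in this toy data. The more attributes two datasets share, the fewer such ambiguous groups remain.
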

Source: [Tozny Blog](https://tozny.com/blog/10-unnerving-privacy-fails-thru-data-aggregation/)