Differential Privacy

Dated Jan 22, 2019; last modified on Sun, 14 Mar 2021

A data release mechanism is differentially private if any computation performed on the released data yields essentially the same result whether or not any single data point is included.

More formally, let \(Q\) be any (probabilistic) query function. \(Q\) is \(\epsilon\)-differentially private if, for all databases \(B\) and \(B'\) that differ in one item, for all (post-processing) functions \(F\), and for all values \(y\):

$$ \mathbb{P}\{F(Q(B)) = y\} \le e^{\epsilon} \cdot \mathbb{P}\{F(Q(B')) = y\} $$
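To see that this definition is satisfiable, consider the classic Laplace mechanism (a standard construction, sketched here as an addition to the original note): release \(Q(B) = f(B) + \mathrm{Lap}(\Delta/\epsilon)\), where \(\Delta = \max_{B, B'} |f(B) - f(B')|\) is the sensitivity of \(f\) over neighboring databases. The ratio of output densities at any value \(y\) is then bounded:

$$ \frac{p_{Q(B)}(y)}{p_{Q(B')}(y)} = \exp\left( \frac{\epsilon}{\Delta} \left( |y - f(B')| - |y - f(B)| \right) \right) \le \exp\left( \frac{\epsilon}{\Delta} \cdot |f(B) - f(B')| \right) \le e^{\epsilon} $$

The first inequality is the triangle inequality, and applying any post-processing function \(F\) cannot increase this ratio.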

\(\epsilon\) is the privacy parameter (often called the privacy budget). Increasing \(\epsilon\) permits more accurate answers (less noise), but weakens the privacy guarantee.
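As a minimal sketch of this trade-off (the dataset and query here are hypothetical, and `laplace_mechanism` is an illustrative helper, not a library function): a counting query has sensitivity 1, so adding noise drawn from \(\mathrm{Lap}(1/\epsilon)\) makes the released count \(\epsilon\)-differentially private.

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    """Release true_answer with Laplace noise of scale sensitivity / epsilon."""
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical database: each record is one person's age.
ages = [34, 45, 29, 61, 52, 38]

# Counting query: adding or removing one person changes the count
# by at most 1, so the sensitivity is 1.
true_count = sum(1 for a in ages if a >= 40)  # exact answer: 3

# Smaller epsilon => larger noise => stronger privacy, lower accuracy.
for eps in (0.1, 1.0, 10.0):
    noisy = laplace_mechanism(true_count, sensitivity=1.0, epsilon=eps)
    print(f"epsilon = {eps:>4}: noisy count = {noisy:.2f}")
```

With \(\epsilon = 0.1\) the noisy count can be off by tens, while with \(\epsilon = 10\) it is usually within a fraction of a unit. Real deployments also track a cumulative privacy budget across many queries; this sketch releases a single count.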

Cynthia Dwork, a pioneer of Differential Privacy, won the 2020 Knuth Prize.

See the Differential Privacy Blog Series | NIST for further discussion of Differential Privacy.
