Privacy Risk Against Composition Attack
Keywords:
Privacy, Composition attack, Anonymization
Abstract
Privacy in multiple independent data publishing has attracted considerable research interest in recent years. Although each published data set poses a small privacy risk to individuals, recent studies show that this risk increases when different organizations hold some records in common and publish their data sets independently, without coordinating with each other. If an individual can be identified across the data sets of disparate providers, that individual's privacy is compromised. This type of privacy breach is called a composition attack. A few studies have sought to mitigate this attack. However, none of them studies the risk of this attack from a single data set. Motivated by this gap, this paper uses a probabilistic model to estimate the risk of composition attack from a single data set, so that a publisher can predict the composition-attack risk of a data set prior to publication. To evaluate the effectiveness of our model, we also perform an empirical analysis, which shows that the estimated risk reflects the pattern of the real risk.
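The intersection step behind a composition attack can be illustrated with a minimal sketch. This is not the probabilistic model proposed in the paper. It only shows, under simple assumptions, why two independent releases that share a record can leak more than either release alone. The data, the function name `composition_attack`, and the uniform-guess risk measure are all hypothetical and chosen for illustration.

```python
# Illustrative sketch only; NOT the paper's probabilistic risk model.
# Assumption: an adversary knows the victim appears in both releases and
# can locate the victim's anonymized group in each one. Each group is
# represented by the set of sensitive values published for it.

def composition_attack(candidates_a, candidates_b):
    """Return the sensitive values consistent with both releases."""
    return set(candidates_a) & set(candidates_b)

# Hypothetical sensitive values in the victim's group at two providers.
hospital_a = {"flu", "diabetes", "hiv", "asthma"}
hospital_b = {"hiv", "cancer", "hepatitis", "ulcer"}

shared = composition_attack(hospital_a, hospital_b)

# Assumption: the adversary guesses uniformly among remaining candidates.
risk_alone = 1 / len(hospital_a)
risk_combined = 1 / len(shared) if shared else 0.0

print(f"candidates after composition: {sorted(shared)}")
print(f"risk from one release: {risk_alone:.2f}, combined: {risk_combined:.2f}")
```

In this toy case each release alone leaves four equally likely candidates, while the intersection leaves only one, so the victim's sensitive value is fully disclosed.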