TY - GEN
T1 - Distributed privacy preserving data collection
AU - Xue, Mingqiang
AU - Papadimitriou, Panagiotis D.
AU - Raïssi, Chedy
AU - Kalnis, Panos
AU - Pung, Hungkeng
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2011
Y1 - 2011
N2 - We study the distributed privacy preserving data collection problem: an untrusted data collector (e.g., a medical research institute) wishes to collect data (e.g., medical records) from a group of respondents (e.g., patients). Each respondent owns a multi-attributed record which contains both non-sensitive (e.g., quasi-identifiers) and sensitive information (e.g., a particular disease), and submits it to the data collector. Assuming T is the table formed by all the respondent data records, we say that the data collection process is privacy preserving if it allows the data collector to obtain a k-anonymized or l-diversified version of T without revealing the original records to the adversary. We propose a distributed data collection protocol that outputs an anonymized table by generalization of quasi-identifier attributes. The protocol employs cryptographic techniques such as homomorphic encryption, private information retrieval and secure multiparty computation to ensure the privacy goal in the process of data collection. Meanwhile, the protocol is designed to leak limited but non-critical information to achieve practicability and efficiency. Experiments show that the utility of the anonymized table derived by our protocol is on par with the utility achieved by traditional anonymization techniques. © 2011 Springer-Verlag.
AB - We study the distributed privacy preserving data collection problem: an untrusted data collector (e.g., a medical research institute) wishes to collect data (e.g., medical records) from a group of respondents (e.g., patients). Each respondent owns a multi-attributed record which contains both non-sensitive (e.g., quasi-identifiers) and sensitive information (e.g., a particular disease), and submits it to the data collector. Assuming T is the table formed by all the respondent data records, we say that the data collection process is privacy preserving if it allows the data collector to obtain a k-anonymized or l-diversified version of T without revealing the original records to the adversary. We propose a distributed data collection protocol that outputs an anonymized table by generalization of quasi-identifier attributes. The protocol employs cryptographic techniques such as homomorphic encryption, private information retrieval and secure multiparty computation to ensure the privacy goal in the process of data collection. Meanwhile, the protocol is designed to leak limited but non-critical information to achieve practicability and efficiency. Experiments show that the utility of the anonymized table derived by our protocol is on par with the utility achieved by traditional anonymization techniques. © 2011 Springer-Verlag.
UR - http://hdl.handle.net/10754/564335
UR - http://link.springer.com/10.1007/978-3-642-20149-3_9
UR - http://www.scopus.com/inward/record.url?scp=79955104712&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-20149-3_9
DO - 10.1007/978-3-642-20149-3_9
M3 - Conference contribution
SN - 9783642201486
SP - 93
EP - 107
BT - Database Systems for Advanced Applications
PB - Springer Nature
ER -