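Privacy Preserving Approaches for Multiple Sensitive Attributes in Data Publishing*
* This work was supported by the Program for New Century Excellent Talents (NCET-06-0290), the National Natural Science Foundation of China (60503036), and the Fok Ying Tung Education Foundation Preferred Research Grant for Young Teachers (104027).
Author biographies: female, born 1973, Ph.D., associate professor; female, born 1984, master's student; male, born 1972, Ph.D. candidate, main research areas include P2P; male, born 1962, professor and Ph.D. supervisor, main research areas include distributed databases, Web services, and data streams.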
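Abstract Existing privacy-preserving data publishing techniques usually focus on data with a single sensitive attribute. … Inheriting the idea of protecting private data through lossy join, this paper proposes a multi-dimensional bucket grouping technique for publishing data with multiple sensitive attributes, called Multi-Sensitive Bucketization (MSB). To avoid high-complexity exhaustive search, three linear-time greedy algorithms are first proposed: the maximal-bucket first algorithm (MBF), the maximal single-dimension-capacity first algorithm (MSDCF), and the maximal multi-dimension-capacity first algorithm (MMDCF). In addition, to account for differences in the importance of published data in practical applications, a weighted multi-dimensional bucket grouping technique is further proposed. … The weighted multi-dimensional bucket grouping technique achieves a publishability of more than 70% for the important information defined by the data owner.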
Keywords data publishing; data privacy; multiple sensitive attributes; lossy join; l-diversity
Chinese Library Classification number TP309
Privacy Preserving Approaches for Multiple Sensitive Attributes
in Data Publishing
YANG Xiao-Chun+, WANG Ya-Zhe, WANG Bin, YU Ge
(School of Information Science and Engineering, Northeastern University, Shenyang 110004)
Abstract Current privacy-preserving data publishing techniques concentrate on tables with only one sensitive attribute. However, most real-world applications contain multiple sensitive attributes, and directly applying existing single-sensitive-attribute techniques often causes unexpected disclosure of private information. In this paper, we first discuss the problem of securely publishing data whose sensitive information spans multiple attributes, and then propose a multi-dimensional bucket grouping approach based on the idea of lossy join, called Multi-Sensitive Bucketization (MSB). To avoid exhaustive search, three linear-time greedy MSB algorithms are proposed: the maximal-bucket first (MBF) algorithm, the maximal single-dimension-capacity first (MSDCF) algorithm, and the maximal multi-dimension-capacity first (MMDCF) algorithm. In addition, to account for differences in importance among published data, a weighted MSB approach is further proposed. Experimental results on real-world datasets show that the average grouping deviations of the proposed MSB methods were not more than 0