DP-PartFIM: Frequent Itemset Mining using Differential Privacy and Partition

Author: 刘忆宁 (Liu Yining) Source: [School of Data Science and Artificial Intelligence] Added: 2024-10-10

No.: WZUT-2024-20

Title: DP-PartFIM: Frequent Itemset Mining using Differential Privacy and Partition

Accession number: WOS: xxxx

Chinese Academy of Sciences (CAS) journal ranking: Tier 2

Authors from this school: 刘忆宁 (Liu Yining, corresponding author)

Source publication: IEEE Transactions on Emerging Topics in Computing Volume: xx Year: 2024

Keywords: Itemset mining; differential privacy; database partition; vertical database

Representative figure:


Figure. DP-PartFIM explanatory example.


Abstract:

Itemset mining is a popular data mining technique for extracting interesting and valuable information from large datasets. However, since datasets contain sensitive private data, it is not permitted to directly mine the data or share the mining results. Previous privacy-preserving frequent itemset mining research was not efficient because of the use of privacy budgets or long transaction truncation strategies, which are impractical for large datasets. In this paper, we propose a more efficient partition mining technology, DP-PartFIM, based on differential privacy, which protects privacy while mining data. DP-PartFIM uses partition mining to mine frequent itemsets and constructs vertical data storage formats for each partition, which makes the algorithm equally efficient for large datasets. To protect data privacy, DP-PartFIM adds Laplace noise to the support of candidate itemsets. The experimental results show that, compared with the classical privacy-preserving itemset mining methods, DP-PartFIM better guarantees data utility and privacy.
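The three ingredients named in the abstract (partitioning, a vertical storage format, and Laplace noise on supports) can be illustrated with a minimal sketch. This is not the authors' DP-PartFIM implementation: the function names, the contiguous partitioning scheme, and the assumption that one transaction changes a single itemset's support by at most 1 (sensitivity 1) are all illustrative choices made here.

```python
import random


def to_vertical(transactions, tid_offset=0):
    """Build a vertical layout: item -> set of ids of transactions containing it."""
    vertical = {}
    for tid, items in enumerate(transactions, start=tid_offset):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def support(vertical, itemset):
    """Support of an itemset = size of the intersection of its items' tid-sets."""
    tidsets = [vertical.get(item, set()) for item in itemset]
    return len(set.intersection(*tidsets)) if tidsets else 0


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def partitioned_support(transactions, itemset, n_partitions):
    """Sum the itemset's support over contiguous partitions (hypothetical scheme)."""
    size = max(1, -(-len(transactions) // n_partitions))  # ceiling division
    total = 0
    for start in range(0, len(transactions), size):
        part = to_vertical(transactions[start:start + size], tid_offset=start)
        total += support(part, itemset)
    return total


def noisy_support(transactions, itemset, n_partitions, epsilon, rng, sensitivity=1.0):
    """Exact partitioned support plus Laplace noise of scale sensitivity / epsilon."""
    exact = partitioned_support(transactions, itemset, n_partitions)
    return exact + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
    rng = random.Random(0)
    print(partitioned_support(db, ("a", "b"), n_partitions=2))
    print(noisy_support(db, ("a", "b"), n_partitions=2, epsilon=1.0, rng=rng))
```

The sketch perturbs one itemset's support with the whole budget `epsilon`. Releasing supports for many candidate itemsets would require dividing the privacy budget among them, and how the paper does that is not stated in the abstract.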


Link: https://doi.org/10.1109/TETC.2024.3443060