TY - GEN
T1 - Privacy-preserving association rule mining in large-scale distributed systems
AU - Schuster, Assaf
AU - Wolff, Ran
AU - Gilburd, Bobi
PY - 2004
Y1 - 2004
AB - Data privacy is a major concern that threatens the widespread deployment of Data Grids in domains such as health-care and finance. We propose a unique approach for obtaining knowledge - by way of a data mining model - from a Data Grid, while ensuring that the data is cryptographically safe. This is made possible by an innovative, yet natural, generalization of the accepted trusted third party model and a new privacy-preserving data mining algorithm that is suitable for Grid-scale systems. The algorithm is asynchronous, involves no global communication patterns, and dynamically adjusts to changes in the data or to the failure and recovery of resources. To the best of our knowledge, this is the first privacy-preserving mining algorithm to possess these features. Simulations of thousands of resources prove that our algorithm quickly converges to the correct result while using reasonable communication. The simulations also prove that the effect of the privacy parameter on both the convergence time and the number of messages is logarithmic.
UR - http://www.scopus.com/inward/record.url?scp=4544302774&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:4544302774
SN - 078038430X
SN - 9780780384309
T3 - 2004 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004
SP - 411
EP - 418
BT - 2004 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004
T2 - 2004 IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004
Y2 - 19 April 2004 through 22 April 2004
ER -