Local L2-thresholding based data mining in peer-to-peer systems

Ran Wolff, Kanishka Bhaduri, Hillol Kargupta

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In a large network of computers, wireless sensors, or mobile devices, each of the components (hence, peers) has some data about the global status of the system. Many of the functions of the system, such as routing decisions, search strategies, data cleansing, and the assignment of mutual trust, depend on the global status. Therefore, it is essential that the system be able to detect, and react to, changes in its global status. Computing global predicates in such systems is usually very costly, mainly because of their scale and, in some cases (e.g., sensor networks), also because of the high cost of communication. The cost further increases when the data changes rapidly (due to state changes, node failure, etc.) and computation has to follow these changes. In this paper we describe a two-step approach for dealing with these costs. First, we describe a highly efficient local algorithm which detects when the L2 norm of the average data surpasses a threshold. Then, we use this algorithm as a feedback loop for the monitoring of complex predicates on the data, such as the data's k-means clustering. The efficiency of the L2 algorithm guarantees that so long as the clustering results represent the data (i.e., the data is stationary) few resources are required. When the data undergoes an epoch change (a change in the underlying distribution) and the model no longer represents it, the feedback loop indicates this and the model is rebuilt. Furthermore, the existence of a feedback loop allows using approximate and "best-effort" methods for constructing the model; if an ill-fitting model is built, the feedback loop would indicate so, and the model would be rebuilt.
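
The test at the heart of the abstract can be illustrated with a minimal, centralized sketch. This is not the paper's local (distributed) algorithm; it only shows the predicate that algorithm decides, namely whether the L2 norm of the average data vector exceeds a threshold. The function names, the threshold `epsilon`, and the way model fit is reduced to that predicate (averaging each point's offset from its nearest centroid) are assumptions made for illustration only.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def average(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def l2_norm(v: Vector) -> float:
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def exceeds_threshold(vectors: List[Vector], epsilon: float) -> bool:
    """True when the L2 norm of the average vector is above epsilon."""
    return l2_norm(average(vectors)) > epsilon


def nearest(point: Vector, centroids: List[Vector]) -> Vector:
    """Centroid closest to the point in Euclidean distance."""
    return min(
        centroids,
        key=lambda c: l2_norm([p - q for p, q in zip(point, c)]),
    )


def model_is_stale(points: List[Vector], centroids: List[Vector],
                   epsilon: float) -> bool:
    """Hypothetical feedback check (an assumption, not the paper's method):
    average each point's offset from its nearest centroid and apply the
    L2 threshold test to that average."""
    offsets = [
        [p - q for p, q in zip(pt, nearest(pt, centroids))]
        for pt in points
    ]
    return exceeds_threshold(offsets, epsilon)
```

In this sketch, data that still matches the centroids yields an average offset near zero, so the check stays quiet; a shift in the underlying distribution pushes the average offset past `epsilon` and signals that the model should be rebuilt.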

Original language: English
Title of host publication: Proceedings of the Sixth SIAM International Conference on Data Mining
Publisher: Society for Industrial and Applied Mathematics
Pages: 430-441
Number of pages: 12
ISBN (Print): 089871611X, 9780898716115
State: Published - 2006
Externally published: Yes
Event: Sixth SIAM International Conference on Data Mining - Bethesda, MD, United States
Duration: 20 Apr 2006 to 22 Apr 2006

Publication series

Name: Proceedings of the Sixth SIAM International Conference on Data Mining
Volume: 2006

Conference

Conference: Sixth SIAM International Conference on Data Mining
Country/Territory: United States
City: Bethesda, MD
Period: 20/04/06 to 22/04/06

ASJC Scopus subject areas

  • General Engineering
