Distributed decision-tree induction in peer-to-peer systems

Kanishka Bhaduri, Ran Wolff, Chris Giannella, Hillol Kargupta

Research output: Contribution to journal › Article › peer-review

Abstract

This paper offers a scalable and robust distributed algorithm for decision-tree induction in large peer-to-peer (P2P) environments. Computing a decision tree in such large distributed systems using standard centralized algorithms can be very communication-expensive and impractical because of the synchronization requirements. The problem becomes even more challenging in the distributed stream monitoring scenario where the decision tree needs to be updated in response to changes in the data distribution. This paper presents an alternate solution that works in a completely asynchronous manner in distributed environments and offers low communication overhead, a necessity for scalability. It also seamlessly handles changes in data and peer failures. The paper presents extensive experimental results to corroborate the theoretical claims.

Original language: English
Pages (from-to): 85-103
Number of pages: 19
Journal: Statistical Analysis and Data Mining
Volume: 1
Issue number: 2
State: Published - Jun 2008

Keywords

  • Data mining
  • Decision trees
  • Peer-to-peer

ASJC Scopus subject areas

  • Analysis
  • Information Systems
  • Computer Science Applications
