Mining for misconfigured machines in grid systems

Noam Palatin, Arie Leizarowitz, Assaf Schuster, Ran Wolff

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Grid systems are proving increasingly useful for managing the batch computing jobs of organizations. One well-known example is Intel, whose internally developed NetBatch system manages tens of thousands of machines. However, the size, heterogeneity, and complexity of grid systems make them very difficult to configure. This often results in misconfigured machines, which may adversely affect the entire system. We investigate a distributed data mining approach for detecting misconfigured machines. Our Grid Monitoring System (GMS) non-intrusively collects data from all sources (log files, system services, etc.) available throughout the grid system. It converts the raw data to semantically meaningful data and stores it on the machine from which it was obtained, limiting the incurred overhead and allowing scalability. Later, when analysis is requested, a distributed outlier detection algorithm is employed to identify misconfigured machines. The algorithm itself is implemented as a recursive workflow of grid jobs. It is especially suited to grid systems, in which machines might be unavailable most of the time and often fail altogether.
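The abstract does not spell out the outlier detection algorithm, so the following is only a rough sketch of the general idea and not the authors' method. It assumes each machine keeps a single numeric reading locally, swaps in a plain z-score test, and imitates the recursive workflow by splitting the machine list in half and merging per-half summaries. Machines that are unavailable are simply skipped. All names here (`Summary`, `summarize`, `find_outliers`, the `store` mapping, the `threshold` parameter) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Summary:
    """Mergeable count / mean / sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def merge(self, other: "Summary") -> "Summary":
        # Pairwise combination of two partial summaries.
        if other.n == 0:
            return self
        if self.n == 0:
            return other
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Summary(n, mean, m2)


def summarize(machines: List[str], store: Dict[str, Optional[float]]) -> Summary:
    """Recursively split the machine list and merge the partial summaries.

    Assumption: `store` stands in for data kept on each machine; a value of
    None (or a missing key) models a machine that is currently unavailable.
    """
    if not machines:
        return Summary()
    if len(machines) == 1:
        value = store.get(machines[0])
        return Summary() if value is None else Summary(1, value, 0.0)
    mid = len(machines) // 2
    return summarize(machines[:mid], store).merge(summarize(machines[mid:], store))


def find_outliers(machines: List[str], store: Dict[str, Optional[float]],
                  threshold: float = 2.0) -> List[str]:
    """Flag available machines whose reading is far from the global mean."""
    s = summarize(machines, store)
    if s.n < 2:
        return []
    std = math.sqrt(s.m2 / (s.n - 1))  # sample standard deviation
    if std == 0:
        return []
    return [m for m in machines
            if store.get(m) is not None
            and abs(store[m] - s.mean) / std > threshold]
```

In this sketch only small summaries travel up the recursion, while the raw readings stay where they were collected, which loosely mirrors the storage design described in the abstract. A real system would use multi-dimensional data and a more robust outlier score.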

Original language: English
Title of host publication: KDD 2006
Subtitle of host publication: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery (ACM)
Pages: 687-692
Number of pages: 6
ISBN (Print): 1595933395, 9781595933393
State: Published - 2006
Externally published: Yes
Event: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Philadelphia, PA, United States
Duration: 20 Aug 2006 to 23 Aug 2006

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2006

Conference

Conference: KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/Territory: United States
City: Philadelphia, PA
Period: 20/08/06 to 23/08/06

Keywords

  • Distributed Data Mining
  • Grid Information System
  • Grid Systems
  • Outliers Detection
  • System Monitoring

ASJC Scopus subject areas

  • Software
  • Information Systems
