In many emerging applications, the data to be monitored is high-volume, dynamic, and distributed, making it infeasible to collect the distinct data streams at a central node and process them there. Often, the monitoring problem consists of determining whether the value of a global function, which depends on the union of all streams, has crossed a certain threshold. A great deal of effort is directed at reducing communication overhead by transforming the monitoring of the global function into the testing of local constraints, checked independently at the nodes. Recently, geometric monitoring (GM) has proved very useful for constructing such local constraints for general (non-linear, non-monotonic) functions. However, in all current variants of geometric monitoring, the constraints at all nodes share an identical structure and are thus unsuitable for handling heterogeneous streams, which obey different distributions at the distinct nodes. To remedy this, we propose a general approach for geometric monitoring of heterogeneous streams (HGM), which defines constraints tailored to fit the distinct data distributions at the nodes. While optimally selecting the constraints is an NP-hard problem, we provide a practical solution, which seeks to reduce running time by hierarchically clustering nodes with similar data distributions and then solving more, but simpler, optimization problems. Experiments are provided to support the validity of the proposed approach.
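For background, the following is a minimal sketch of the kind of local constraint used in the earlier, homogeneous GM literature that the abstract contrasts with HGM. It is not the HGM method proposed here. It assumes the sphere-based test from that literature: each node builds a drift vector from the last synchronized estimate and checks that the ball whose diameter runs from the estimate to the drift vector lies entirely on one side of the threshold. To keep the check in closed form, the monitored function is assumed to be the Euclidean norm; the function names and sample values are hypothetical.

```python
import math


def drift_vector(estimate, last_synced_local, current_local):
    """Estimate shifted by the node's local change since the last sync."""
    return [e + (v - v0)
            for e, v0, v in zip(estimate, last_synced_local, current_local)]


def ball_norm_range(estimate, drift):
    """Min and max of the Euclidean norm over the ball whose diameter
    is the segment from the estimate to the drift vector."""
    center = [(e + u) / 2 for e, u in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    center_norm = math.hypot(*center)
    return max(0.0, center_norm - radius), center_norm + radius


def local_constraint_holds(estimate, drift, threshold):
    """True if the node's ball is on the same side of the threshold
    as the estimate, so the node does not need to communicate."""
    lowest, highest = ball_norm_range(estimate, drift)
    if math.hypot(*estimate) > threshold:
        return lowest > threshold
    return highest <= threshold


# Hypothetical values: estimate norm is 1.0, threshold is 2.0.
estimate = [1.0, 0.0]
small_change = drift_vector(estimate, [1.0, 0.0], [1.5, 0.0])
large_change = drift_vector(estimate, [1.0, 0.0], [3.0, 0.0])
print(local_constraint_holds(estimate, small_change, 2.0))
print(local_constraint_holds(estimate, large_change, 2.0))
```

Every node in this sketch applies the same ball-shaped test, which is the identical-structure limitation the abstract describes. Under the same assumptions, a heterogeneous scheme would replace the single shared test with per-node regions fitted to each node's data distribution.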
Number of pages: 6
Journal: CEUR Workshop Proceedings
State: Published - 2013
Event: 1st International Workshop on Big Dynamic Distributed Data, BD3 2013 - Co-located with International Conference on Very Large Databases, VLDB 2013 - Riva del Garda, Italy
Duration: 30 Aug 2013 → 30 Aug 2013
ASJC Scopus subject areas
- Computer Science (all)