Communication Efficient Algorithms for Bounding and Approximating the Empirical Entropy in Distributed Systems

Amit Shahar, Yuval Alfassi, Daniel Keren

Research output: Contribution to journal › Article › peer-review

Abstract

The empirical entropy is a key statistical measure of data frequency vectors, enabling one to estimate how diverse the data are. From the computational point of view, it is important to quickly compute, approximate, or bound the entropy. In a distributed system, the representative (“global”) frequency vector is the average of the “local” frequency vectors, each residing in a distinct node. Typically, the trivial solution of aggregating the local vectors and computing their average incurs a huge communication overhead. Hence, the challenge is to approximate, or bound, the entropy of the global vector, while reducing communication overhead. In this paper, we develop algorithms which achieve this goal.
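To make the quantities in the abstract concrete, here is a minimal NumPy sketch (with illustrative data, not from the paper): it computes the empirical entropy of each node's normalized local frequency vector and of their average (the "global" vector), and checks the communication-cheap lower bound that follows from the concavity of entropy, namely that the entropy of the average is at least the average of the local entropies. This illustrates the setting only; it is not the algorithm developed in the paper.

```python
import numpy as np

def empirical_entropy(freq):
    """Shannon entropy (in nats) of a frequency vector, normalized to sum to 1."""
    p = np.asarray(freq, dtype=float)
    p = p / p.sum()        # normalize to a probability vector
    nz = p[p > 0]          # convention: 0 * log(0) = 0
    return float(-np.sum(nz * np.log(nz)))

# Hypothetical local frequency vectors held at three distinct nodes.
local_freqs = [
    np.array([10.0, 2.0, 1.0, 1.0]),
    np.array([3.0, 8.0, 2.0, 1.0]),
    np.array([1.0, 1.0, 9.0, 3.0]),
]

# Normalize each local vector, then average to obtain the global vector.
local_ps = [v / v.sum() for v in local_freqs]
global_p = np.mean(local_ps, axis=0)

h_global = empirical_entropy(global_p)
h_locals = [empirical_entropy(p) for p in local_ps]

# By concavity of entropy (Jensen's inequality), averaging the local
# entropies gives a lower bound on the global entropy without shipping
# the full local vectors to a coordinator.
lower_bound = float(np.mean(h_locals))
print(f"H(global) = {h_global:.4f} >= mean of local entropies = {lower_bound:.4f}")
```

Computing the full global vector exactly would require each node to transmit its entire (possibly huge) frequency vector; bounds of this kind need only a scalar per node.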

Original language: English
Article number: 1611
Journal: Entropy
Volume: 24
Issue number: 11
State: Published - 5 Nov 2022

Bibliographical note

Publisher Copyright:
© 2022 by the authors.

Keywords

  • distributed systems
  • entropy
  • entropy approximation
  • entropy bounds
  • sketches

ASJC Scopus subject areas

  • Information Systems
  • Mathematical Physics
  • Physics and Astronomy (miscellaneous)
  • Electrical and Electronic Engineering
