Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

George Kour, Samuel Ackerman, Orna Raz, Eitan Farchi, Boaz Carmeli, Ateret Anaby-Tavor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The ability to compare the semantic similarity between text corpora is important in a variety of natural language processing applications. However, standard methods for evaluating corpus-level similarity metrics have yet to be established. We propose a set of automatic and interpretable measures for assessing the characteristics of corpus-level semantic similarity metrics, allowing sensible comparison of their behavior. We demonstrate the effectiveness of our evaluation measures in capturing fundamental characteristics by evaluating them on a collection of classical and state-of-the-art metrics. Our measures revealed that recently developed metrics are becoming better at identifying semantic distributional mismatch, while classical metrics are more sensitive to perturbations at the surface level of the text.

Original language: English
Title of host publication: GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 405-416
Number of pages: 12
ISBN (Electronic): 9781959429128
State: Published - 2022
Externally published: Yes
Event: 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, GEM 2022, as part of EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 → …

Publication series

Name: GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop

Conference

Conference: 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, GEM 2022, as part of EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 7/12/22 → …

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
