Ontologies conceptualize domains and are a crucial part of web semantics and information systems. However, re-using an existing ontology for a new task requires a detailed evaluation of the candidate ontology, as it may cover only a subset of the domain concepts, contain redundant or misleading information, and have inaccurate relations and hierarchies between concepts. Manual evaluation of large and complex ontologies is tedious. Thus, a few approaches have been proposed for automated evaluation, ranging from concept coverage to ontology generation from a corpus. Existing approaches, however, are limited by their dependence on external structured knowledge sources, such as a thesaurus, as well as by their inability to evaluate semantic relationships. In this paper, we propose a novel framework to automatically evaluate the domain coverage and semantic correctness of existing ontologies based on domain information derived from text. The approach uses a domain-tuned named-entity-recognition model to extract phrasal concepts. The extracted concepts are then used as a representation of the domain against which we evaluate the candidate ontology's concepts. We further employ a domain-tuned language model to determine the semantic correctness of the candidate ontology's relations. We demonstrate our automated approach on several large ontologies from the oceanographic domain and show its agreement with a manual evaluation by domain experts and its superiority over the state of the art.
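To make the idea of domain coverage concrete, the following is a minimal sketch of how concepts extracted from text could be compared against an ontology's concept labels. It is not the paper's implementation: the function names `normalize` and `concept_coverage`, the exact-match rule after simple normalization, and the sample data are all assumptions chosen for illustration, and the abstract does not state how matching is actually performed.

```python
def normalize(label: str) -> str:
    """Lowercase and unify separators so labels can be compared loosely.

    Assumption: underscores and hyphens are treated like spaces.
    """
    return " ".join(label.lower().replace("_", " ").replace("-", " ").split())


def concept_coverage(domain_concepts, ontology_concepts):
    """Return (coverage, missing).

    coverage: share of distinct domain concepts that also occur among the
              ontology's concept labels (exact match after normalization).
    missing:  sorted list of domain concepts the ontology does not contain.
    """
    domain = {normalize(c) for c in domain_concepts}
    ontology = {normalize(c) for c in ontology_concepts}
    if not domain:
        return 0.0, []
    covered = domain & ontology
    missing = sorted(domain - ontology)
    return len(covered) / len(domain), missing


# Hypothetical data for illustration only (not taken from the paper).
extracted = ["sea surface temperature", "Salinity", "ocean current", "thermocline"]
ontology_labels = ["Sea_Surface_Temperature", "salinity", "Wave-Height"]

coverage, missing = concept_coverage(extracted, ontology_labels)
print(f"coverage={coverage:.2f}, missing={missing}")
```

Exact matching after normalization is the simplest possible choice; a realistic evaluation would likely need fuzzy or embedding-based matching to handle synonyms and phrasal variants.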
|Title of host publication||ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023|
|Publisher||Association for Computing Machinery, Inc|
|Number of pages||11|
|State||Published - 30 Apr 2023|
|Event||2023 World Wide Web Conference, WWW 2023 - Austin, United States. Duration: 30 Apr 2023 → 4 May 2023|
|Name||Companion Proceedings of the ACM Web Conference 2023|
|Conference||2023 World Wide Web Conference, WWW 2023|
|Period||30/04/23 → 4/05/23|
Bibliographical note
Funding Information:
This work was partially supported by the Data Science Research Center at the University of Haifa through the Israel PBC grant Advancing Data Science to Serve Humanity and Protect the Global Environment (grant no. 100009443), the Danish Council for Independent Research (DFF) under grant agreement no. DFF-8048-00051B, and the Poul Due Jensen Foundation.
© 2023 Owner/Author.
Keywords
- knowledge engineering
- natural language processing
ASJC Scopus subject areas
- Computer Networks and Communications