Supervised Learning for Automatic Classification of Documents using Self-Organizing Maps

Dina Goren-Bar, Tsvi Kuflik, Dror Lev

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Automatic document classification that corresponds to user-predefined classes is a challenging and widely
researched area. Self-Organizing Maps (SOM) are unsupervised Artificial Neural Networks (ANN) that
transform high-dimensional data into a two-dimensional representation, enabling automatic clustering of
the input while preserving higher-order topology. A closely related algorithm is Learning Vector
Quantization (LVQ), which uses supervised learning to maximize correct data classification. This study
presents the application of SOM and LVQ to automatic document classification based on a predefined set
of clusters. A set of documents manually clustered by a domain expert was used. Experimental results show
considerable success of automatic document clustering that matches the manual clustering, with a slight
preference for LVQ.
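
The supervised step that distinguishes LVQ from SOM can be illustrated with a minimal sketch of the basic LVQ1 update rule. This is not the authors' implementation: the abstract gives no features, map sizes, or parameters, so the toy term-count vectors, the class names, the learning rate, and the helper names (`nearest`, `train_lvq1`, `classify`) are all assumptions made for illustration.

```python
import random

def nearest(protos, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(protos)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(protos[i][0], x)),
    )

def train_lvq1(data, labels, prototypes, lr=0.3, epochs=20, seed=0):
    """LVQ1: move the winning prototype toward a sample when their labels
    agree, and away from it when they disagree."""
    rng = random.Random(seed)
    protos = [(list(w), c) for w, c in prototypes]  # copy the initial prototypes
    order = list(range(len(data)))
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = data[i], labels[i]
            w, c = protos[nearest(protos, x)]
            sign = 1.0 if c == y else -1.0
            for d in range(len(w)):
                w[d] += sign * rate * (x[d] - w[d])
    return protos

def classify(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]

# Hypothetical term-count vectors for four documents in two expert-defined classes.
docs = [[3, 0, 1], [4, 1, 0], [0, 3, 2], [1, 4, 3]]
labels = ["sports", "sports", "finance", "finance"]
# Assumption: one prototype per class, initialised from a labelled document.
initial = [([3, 0, 1], "sports"), ([0, 3, 2], "finance")]

protos = train_lvq1(docs, labels, initial)
print(classify(protos, [5, 0, 0]))  # sports
print(classify(protos, [0, 5, 2]))  # finance
```

An unsupervised SOM would apply only the attracting update (also to the winner's map neighbours) and ignore the labels; the label-dependent sign is what makes LVQ supervised.
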
Original language: English
Title of host publication: Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, Searching and Querying in Digital Libraries (DELOS 2000)
Publisher: ERCIM
Pages: 143-146
State: Published - 2000
Externally published: Yes

Publication series

Name: ERCIM Workshop Proceedings
Publisher: ERCIM
