Word-streams for representing context in word maps
College
College of Computer Studies
Department/Unit
Information Technology
Document Type
Archival Material/Manuscript
Publication Date
2007
Abstract
The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term-frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e., histograms of words), which form the basis for training a document map. This document map is the final text archive. WEBSOM has been shown to be a powerful and versatile text archiving system. However, it spends, and arguably wastes, enormous computing resources on computing the left and right context of each and every word that appears in any of the documents in the text corpus. This paper presents an alternative scheme for incorporating context in the encoding of words, one that takes full advantage of the computation of the probabilistic centroid inherent in the SOM training algorithm. Several experiments are conducted to compare this new scheme with WEBSOM’s context averaging scheme.
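To illustrate the kind of per-word context computation the abstract refers to, the following is a minimal sketch of context averaging. It is not the authors' code, and it rests on assumptions not stated in this record: each word is given a fixed random code vector, a word's context is the mean of the codes of its immediate left and right neighbours over all of its occurrences, and a zero vector stands in for a missing neighbour at the corpus boundary. The names `random_codes` and `averaged_context` are hypothetical.

```python
import random
from collections import defaultdict


def random_codes(vocab, dim, seed=0):
    """Assign each word a fixed random code vector (hypothetical helper)."""
    rng = random.Random(seed)
    return {w: [rng.gauss(0.0, 1.0) for _ in range(dim)] for w in vocab}


def averaged_context(tokens, codes, dim):
    """Map each word to [mean left-neighbour code | mean right-neighbour code].

    Assumption: a zero vector is used where a neighbour does not exist.
    """
    sums = defaultdict(lambda: [0.0] * (2 * dim))
    counts = defaultdict(int)
    zero = [0.0] * dim
    for i, word in enumerate(tokens):
        left = codes[tokens[i - 1]] if i > 0 else zero
        right = codes[tokens[i + 1]] if i + 1 < len(tokens) else zero
        acc = sums[word]
        for k in range(dim):
            acc[k] += left[k]          # first half: left context
            acc[dim + k] += right[k]   # second half: right context
        counts[word] += 1
    # Divide the accumulated sums by the number of occurrences of each word.
    return {w: [v / counts[w] for v in sums[w]] for w in sums}


tokens = "the cat sat on the mat".split()
codes = random_codes(sorted(set(tokens)), dim=4)
context = averaged_context(tokens, codes, dim=4)
```

The pass touches every token occurrence in the corpus before any map training begins, which is the cost the abstract objects to. The resulting vectors would then serve as training inputs for a word-category map.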
Recommended Citation
Azcarraga, A. P., Gopez, A. S., & Yap, T. (2007). Word-streams for representing context in word maps. Retrieved from https://animorepository.dlsu.edu.ph/faculty_research/11960
Disciplines
Computer Sciences
Keywords
Context (Linguistics); Self-organizing maps
Note
Undated; Publication/creation date supplied