Date of Publication
12-2005
Document Type
Master's Thesis
Degree Name
Master of Science in Computer Science
Subject Categories
Computer Sciences
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Rachel Editha O. Roxas
Defense Panel Chair
Allan B. Borra
Defense Panel Member
Charibeth K. Cheng
Ethel C. Ong
Abstract/Summary
Selecting the right word translation among several options in the lexicon is a core problem for machine translation. It is not enough to translate a word in context; an appropriate translation must be chosen. An automated approach is presented here for resolving target word selection, based on word-to-sense and sense-to-word relationships between source words and their translations, utilizing syntactic relationships (subject-verb, verb-object, adjective-noun). Translation selection proceeds from sense disambiguation of source words, based on knowledge from a bilingual dictionary and word similarity measures from WordNet, to selection of a target word using statistics from a target language corpus. The system was tested on 145,746 word pairs in syntactic relationships that were extracted from target corpora gathered from various online editorials, Tagalog readings and the Tagalog New Testament, with a total of 317,113 words. A sense profile with 2,681 entries for source words was built from an existing bilingual dictionary that includes clues for disambiguation and target translations. A test on 200 sentences with ambiguous words (average of 4 senses) in three categories (nouns, verbs and adjectives) produced an overall accuracy of 63.89% for selecting word translations, with a standardized precision of at least 80% for generating expected translations in each category. Adding reliable clues for sense disambiguation, as well as applying smoothing techniques, can further improve the overall performance of the method. The words produced by the system are root words. The system can be improved further by integrating morphological generation into a machine translation system to produce more fluent translations. In addition, the method developed here can be extended to accommodate translation of other content words as well as other syntactic categories.
Furthermore, the method presented here can be improved to support bidirectional translation (Tagalog to English).
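The two-stage selection described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the sense profile entries, the similarity table (a stand-in for a WordNet-based measure), the Tagalog co-occurrence counts, and all function names below are hypothetical toy values chosen only to show the flow from sense disambiguation to target word selection.

```python
from collections import Counter

# Hypothetical sense profile: each source word has senses, and each sense
# carries disambiguation clues and candidate target translations.
SENSE_PROFILE = {
    "bank": [
        {"clues": ["money", "deposit"], "translations": ["bangko"]},
        {"clues": ["river", "shore"], "translations": ["pampang", "baybay"]},
    ],
}

# Toy stand-in for a WordNet-based word similarity measure (assumed values).
TOY_SIMILARITY = {
    frozenset(["steep", "shore"]): 0.6,
    frozenset(["steep", "river"]): 0.4,
    frozenset(["steep", "money"]): 0.1,
}

# Toy counts of (relation, context word, candidate) triples that would be
# extracted from a target language corpus (assumed values).
TARGET_COUNTS = Counter({
    ("adjective-noun", "matarik", "pampang"): 5,
    ("adjective-noun", "matarik", "baybay"): 1,
})


def similarity(a, b):
    """Return a similarity score in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset([a, b]), 0.0)


def pick_sense(word, context_word):
    """Stage 1: choose the sense whose best clue is most similar to the context word."""
    return max(
        SENSE_PROFILE[word],
        key=lambda sense: max(similarity(context_word, c) for c in sense["clues"]),
    )


def pick_translation(sense, relation, context_translation):
    """Stage 2: choose the candidate seen most often with the translated context word."""
    return max(
        sense["translations"],
        key=lambda t: TARGET_COUNTS[(relation, context_translation, t)],
    )


def translate(word, context_word, relation, context_translation):
    """Run both stages for one word in one syntactic relationship."""
    sense = pick_sense(word, context_word)
    return pick_translation(sense, relation, context_translation)
```

With these toy values, `translate("bank", "steep", "adjective-noun", "matarik")` first selects the river-related sense and then prefers the candidate with the higher corpus count. Unseen triples count as zero here, which is the kind of sparsity the smoothing mentioned in the abstract would address.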
Abstract Format
html
Language
English
Format
Electronic
Accession Number
CDTG003938
Shelf Location
Archives, The Learning Commons, 12F Henry Sy Sr. Hall
Physical Description
xi, 268 leaves, 1 computer optical disc ; 4 3/4 in.
Keywords
Machine translating; Information theory
Recommended Citation
Domingo, E. C. (2005). Automatic resolution of target word ambiguity. Retrieved from https://animorepository.dlsu.edu.ph/etd_masteral/3306