Text translation: Template extraction for a bidirectional English-Filipino example-based machine translation
Date of Publication
2006
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
Subject Categories
Computer Sciences
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Ethel C. Ong
Defense Panel Member
Ethel Ong
Allan Borra
Rachel Roxas
Abstract/Summary
A bidirectional English-Filipino example-based machine translation system that learns and uses templates is presented. The system uses machine learning techniques to extract templates from a given bilingual corpus. These templates are then used to translate English input text into Filipino and vice versa. The system implements the similarity template learning algorithm of Cicekli et al. (2001) but goes further by introducing template refinement and the derivation of templates from learned chunks. To improve translation quality, new chunk alignment and splitting algorithms are introduced into the training process, while a flexible template and chunk matching scheme is established for translation. Test results verify that a strict chunk alignment scheme is needed in training and that certain words, such as commonly occurring words, need to be filtered out to produce better templates. These measures improve overall quality by ensuring complete template and chunk correctness in training and by reducing word and sentence error rates by as much as half in translation. Tests also show that the translation with the highest score among the candidates is consistently the best choice when checked against automatic evaluation methods. Still, much of the system implementation is limited by the quality and coverage of the lexicon and morphological references, which are patterned after those of TWiRL, a rule-based machine translator. This research is part of a three-year project on hybrid machine translation funded by the Philippine Council for Advanced Science and Technology Research and Development of the Department of Science and Technology (DOST-PCASTRD).
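The general idea behind similarity-based template learning can be illustrated with a small sketch. This is not the thesis implementation: it is a simplified illustration that assumes whitespace tokenization, handles only the case of exactly one differing span per language, and omits the refinement, chunk alignment, and splitting steps described above. The function names `match_sequence` and `learn_template`, the variable marker `X1`, and the example sentence pairs are all hypothetical.

```python
from difflib import SequenceMatcher


def match_sequence(a, b):
    """Split two token lists into shared segments and differing segment pairs."""
    segments = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            segments.append(("same", a[i1:i2]))
        else:
            # Tokens that differ between the two sentences at this position.
            segments.append(("diff", (a[i1:i2], b[j1:j2])))
    return segments


def learn_template(pair1, pair2):
    """Generalize two (source, target) examples into one template.

    Returns ((source_template, target_template), chunk_pairs), or None
    when the examples do not differ in exactly one non-empty span per side.
    """
    src = match_sequence(pair1[0].split(), pair2[0].split())
    tgt = match_sequence(pair1[1].split(), pair2[1].split())
    src_diffs = [value for kind, value in src if kind == "diff"]
    tgt_diffs = [value for kind, value in tgt if kind == "diff"]

    # Simplifying assumption: one difference on each side, so the
    # correspondence between the differing spans is unambiguous.
    if len(src_diffs) != 1 or len(tgt_diffs) != 1:
        return None
    (src1, src2), (tgt1, tgt2) = src_diffs[0], tgt_diffs[0]
    if not (src1 and src2 and tgt1 and tgt2):
        return None

    def render(segments):
        # Replace the differing span with a variable, keep shared tokens.
        return " ".join(
            "X1" if kind == "diff" else " ".join(value)
            for kind, value in segments
        )

    chunks = [
        (" ".join(src1), " ".join(tgt1)),
        (" ".join(src2), " ".join(tgt2)),
    ]
    return (render(src), render(tgt)), chunks


result = learn_template(
    ("I bought a book", "Bumili ako ng libro"),
    ("I bought a pencil", "Bumili ako ng lapis"),
)
print(result)
```

In this sketch, the shared words form the template and the differing words become chunk pairs that can later fill the variable slot in either translation direction.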
Abstract Format
html
Language
English
Accession Number
TU14567
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Recommended Citation
Go, K. L., Morga, M. R., Nunez, V. D., & Veto, F. S. (2006). Text translation: Template extraction for a bidirectional English-Filipino example-based machine translation. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/14396