Sentence-level morphological and phonological analyzer for Filipino (filSPAM)
Date of Publication
2011
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
Subject Categories
Computer Sciences
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Shirley Chu
Defense Panel Member
Allan Borra
Nathalie Rose Lim-Cheng
Abstract/Summary
Morphological analysis is an important process in natural language processing. It deals with the identification of a root word and its affixes (morphemes) from a morphed word. Phonology is another facet of morphology that has to do with how a word is voiced or sounded out. Various approaches and systems, such as MACTag, are used in morphological analysis to generate rules for different languages. These differ in their methods for identifying and classifying morphemes, as well as in how they handle ambiguity. Although there are systems that handle morphology for Filipino, most of these are limited in that they work only at the word level and do not cover rules for phonology. Part-of-speech tagging is an integral part of sentence analysis that is concerned with annotating the part of speech of a particular word in a sentence. There are existing tools for part-of-speech tagging, such as HATPOST. These components, namely the morphological analyzer and the part-of-speech tagger, function independently of one another, and each has its own limitations that need to be addressed. The research constructs a sentence-level morphological and phonological analyzer for the Filipino language that integrates the aforementioned components in order to identify the part of speech of each Filipino word in a sentence and to generate the root word and phonology of the identified words. filSPAM (Sentence-level Phonological and Morphological Analyzer for Filipino) analyzes a given Filipino sentence and generates the corresponding part of speech, root word, and phonology for the words in that sentence. The system has four modules: the POS tagger, which has 54% accuracy; the morphological analyzer, which has 73.02% accuracy; the phonological analyzer, which is corpus-based; and the unknown handler, which has two functions, the automaton and the generalized tree, with 67% and 64% accuracy respectively.
Abstract Format
html
Language
English
Format
Accession Number
TU18447
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Physical Description
ix, 40, 15 leaves : illustrations ; 28 cm.
Keywords
Grammar, Comparative and general--Morphology; Natural language processing (Computer science)
Recommended Citation
Alina, A. C., Cambaliza, C. R., Sosa, J. F., & Sta. Ana, X. J. (2011). Sentence-level morphological and phonological analyzer for Filipino (filSPAM). Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/11167