Date of Publication

2021

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

Subject Categories

Computer Sciences | Morphology

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Advisor

Charibeth K. Cheng

Defense Panel Chair

Nathalie Rose Lim-Cheng

Defense Panel Member

Edward Tighe
Charibeth K. Cheng

Abstract/Summary

This paper presents a hybrid approach to Filipino morphological analysis that combines root word extraction with rule-based grammatical information extraction. So that the machine can learn the different Filipino morphological phenomena, this work introduces and compares multiple neural network models and their variants for extracting the root word of any given Tagalog word: Feedforward Neural Networks (FFNN), Recurrent Neural Networks (RNN), and BERT. The performance of each model was measured with accuracy and Levenshtein distance metrics. The BERT model performed best on both the default test dataset (87.45%) and the UD-Treebanks dataset (79.11%). It was further compared with the MAGTagalog morphological analyser, which achieved only 51.26% on the UD-Treebanks dataset. However, the BERT model had the largest memory requirement and the longest training time, making it slightly inefficient for rapid development. Further problems with the models' performance involve suffixation, particularly suffixes ending in 'g' and 'ng'. These suffixes are also not part of the official Tagalog morphological rules. This issue can be addressed in future iterations of this work. All the proposed solutions performed very well in identifying the different Filipino morphological phenomena, and this work can be used in other NLP tasks through its API design.

Keywords: Morphological Analysis

Language

English

Format

Electronic

Physical Description

97 leaves

Keywords

Morphology; Extraction (Linguistics)

Embargo Period

6-8-2022
