A spell checker for a low-resourced and morphologically rich language
College
College of Computer Studies
Department/Unit
Software Technology
Document Type
Conference Proceeding
Source Title
IEEE Region 10 Annual International Conference, Proceedings/TENCON
Volume
2017-December
First Page
1853
Last Page
1856
Publication Date
12-19-2017
Abstract
Spell checking plays an important role in improving the quality of documents by identifying the misspelled words in them. Considerable effort has gone into advancing spell checkers for other languages such as English, for which spell checking systems (e.g. Microsoft Word) are almost perfected. However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge for existing Filipino spell checkers, which are dictionary-based, is the lack of a complete dictionary that captures all inflected forms (e.g. isinasama 'including', isasama 'will be included', and isinama 'included' with the base form sama 'include'), borrowing (e.g. magtex 'to text' and nagtex 'texted'), and code-switching (e.g. magtext 'to text', and nag-text 'texted' with the base form 'text') of a word. In addition, existing systems cannot handle code-switching, so valid words are marked as erroneous. In this research, a spell checker is designed for Filipino, a low-resourced and morphologically rich language. It detects and corrects typographical errors in the language and introduces a modified version of the Metaphone algorithm for ranking the candidate suggestions. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students. © 2017 IEEE.
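The abstract does not specify the modified Metaphone algorithm, so the following is only a minimal Python sketch of the general idea of ranking candidate suggestions by phonetic similarity. Everything in it is an assumption made for illustration: `phonetic_key` is a toy stand-in (neither Metaphone nor the authors' modification), `LEXICON` is a hypothetical four-word sample built from the abstract's examples, and the tie-breaking by plain edit distance is not taken from the paper.

```python
LEXICON = ["isinasama", "isasama", "isinama", "sama"]  # hypothetical sample lexicon


def phonetic_key(word: str) -> str:
    """Toy phonetic key (assumption): NOT Metaphone, NOT the paper's variant."""
    w = word.lower().replace("-", "")
    # Collapse a few letter variants seen in borrowed words (illustrative only).
    for src, dst in (("x", "ks"), ("c", "k"), ("f", "p"), ("v", "b"), ("z", "s")):
        w = w.replace(src, dst)
    # Keep the first letter, then drop every later vowel.
    return w[:1] + "".join(ch for ch in w[1:] if ch not in "aeiou")


def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def rank_suggestions(word: str, lexicon: list[str], limit: int = 3) -> list[str]:
    """Order candidates by phonetic-key distance, then by spelling distance."""
    target = phonetic_key(word)
    ranked = sorted(
        lexicon,
        key=lambda cand: (
            edit_distance(target, phonetic_key(cand)),
            edit_distance(word.lower(), cand.lower()),
            cand,
        ),
    )
    return ranked[:limit]


print(rank_suggestions("isinasma", LEXICON))
```

For the transposition error "isinasma", the toy key matches that of "isinasama" exactly, so that form is ranked first. A real system would also need a morphological component to cover inflected, borrowed, and code-switched forms that are missing from the dictionary.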
Digital Object Identifier (DOI)
10.1109/TENCON.2017.8228160
Recommended Citation
Octaviano, M., & Borra, A. (2017). A spell checker for a low-resourced and morphologically rich language. IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2017-December, 1853-1856. https://doi.org/10.1109/TENCON.2017.8228160
Disciplines
Computer Sciences
Keywords
Filipino language--Orthography and spelling--Data processing; Spelling errors