An experimental Tagalog Finite State Automata spellchecker with Levenshtein edit-distance feature
College
College of Computer Studies
Department/Unit
Software Technology
Document Type
Conference Proceeding
Source Title
Proceedings of the 2019 International Conference on Asian Language Processing, IALP 2019
First Page
240
Last Page
243
Publication Date
11-1-2019
Abstract
© 2019 IEEE. In this paper, we present the experimental development of a spell checker for the Tagalog language. It uses a word list of 300 random root words and three inflected forms as training data, and a two-layered architecture that combines a Deterministic Finite Automaton (DFA) with Levenshtein edit-distance. A DFA processes a string and determines whether it belongs to a given language, returning a binary result of accept or reject. The Levenshtein edit-distance of two strings is the minimum number (k) of deletions, substitutions, and insertions needed to turn one sequence of characters into the other. Results on the sample word list show that an edit-distance value of k = 1 can be effective for spell-checking Tagalog sentences. Any value greater than 1 can cause words to be suggested even when the spelling is correct, because certain characters such as a, n, g, t, s, and l are used selectively and prominently in Tagalog.
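The two-layer idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the first layer is a trie-shaped DFA that accepts or rejects a word, and the second layer proposes dictionary words within edit-distance k of a rejected word. The function names (`build_dfa`, `accepts`, `levenshtein`, `check`) and the six-word list are hypothetical and are not taken from the paper's data.

```python
from typing import Dict, List, Set, Tuple

DFA = Tuple[List[Dict[str, int]], Set[int]]


def build_dfa(words: List[str]) -> DFA:
    """Build a trie-shaped DFA. State 0 is the start state."""
    transitions: List[Dict[str, int]] = [{}]
    accepting: Set[int] = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting.add(state)  # the state reached by a full word accepts
    return transitions, accepting


def accepts(dfa: DFA, word: str) -> bool:
    """Layer 1 (assumed): binary accept/reject decision."""
    transitions, accepting = dfa
    state = 0
    for ch in word:
        nxt = transitions[state].get(ch)
        if nxt is None:  # no transition defined: reject
            return False
        state = nxt
    return state in accepting


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def check(word: str, words: List[str], dfa: DFA, k: int = 1) -> List[str]:
    """Layer 2 (assumed): suggest words within distance k if rejected."""
    if accepts(dfa, word):
        return []  # treated as correctly spelled in this sketch
    return sorted(w for w in words if levenshtein(word, w) <= k)


# Hypothetical word list for illustration only.
WORDS = ["bahay", "bata", "basa", "bato", "kain", "kumain"]
WORD_DFA = build_dfa(WORDS)
```

In this toy setup, `check("bota", WORDS, WORD_DFA, k=1)` suggests only `bata`, while `k=2` also brings in `basa` and `bato`. Short words built from a small set of frequent letters sit close together in edit-distance, so the candidate set grows quickly as k increases.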
Digital Object Identifier (DOI)
10.1109/IALP48816.2019.9037687
Recommended Citation
Imperial, J. R., Ya-On, C. V., & Ureta, J. C. (2019). An experimental Tagalog Finite State Automata spellchecker with Levenshtein edit-distance feature. Proceedings of the 2019 International Conference on Asian Language Processing, IALP 2019, 240-243. https://doi.org/10.1109/IALP48816.2019.9037687