Online corpora of Philippine languages
Added Title
DLSU Arts Congress (2009)
College
College of Computer Studies
Department/Unit
Software Technology
Document Type
Conference Proceeding
Source Title
Proceedings of the 2009 DLSU Arts Congress
First Page
153
Last Page
159
Publication Date
2-11-2009
Abstract
Corpora of Philippine languages have been built and made available through an online application. They contain Tagalog, Cebuano, Ilocano, and Hiligaynon texts of 250,000 words each, and seven thousand signs in videos based on Filipino Sign Language. Categories of the written texts include creative writing (such as novels and stories) and hortatory or religious texts (such as the Bible). Automated tools are provided for language analysis, such as word counts and co-occurrences, among others. This work is part of a larger corpus-building project for Philippine languages that would cover text, speech, and video forms, along with the corresponding development of automated tools for analyzing these forms.
Recommended Citation
Roxas, R. O., Asenjo, G. L., Dita, S. N., Inventado, P. B., Buban, R. S., & Taylan, D. R. (2009). Online corpora of Philippine languages. Proceedings of the 2009 DLSU Arts Congress, 153-159. Retrieved from https://animorepository.dlsu.edu.ph/faculty_research/4109
Disciplines
Databases and Information Systems | South and Southeast Asian Languages and Societies
Keywords
Philippine languages—Databases; Corpora (Linguistics)