Online corpora of Philippine languages

Added Title

DLSU Arts Congress (2009)

College

College of Computer Studies

Department/Unit

Software Technology

Document Type

Conference Proceeding

Source Title

Proceedings of the 2009 DLSU Arts Congress

First Page

153

Last Page

159

Publication Date

2-11-2009

Abstract

Corpora of Philippine languages have been built and made available through an online application. They contain Tagalog, Cebuano, Ilocano, and Hiligaynon texts of 250,000 words each, and seven thousand signs in videos based on Filipino Sign Language. Categories of the written texts include creative writing (such as novels and stories) and hortatory or religious texts (such as the Bible). Automated tools are provided for language analysis, such as word counts and co-occurrences, among others. This is part of a larger corpus-building project for Philippine languages that is intended to cover text, speech, and video forms, along with the corresponding development of automated tools for language analysis of these forms.

Disciplines

Databases and Information Systems | South and Southeast Asian Languages and Societies

Keywords

Philippine languages—Databases; Corpora (Linguistics)

This document is currently not available here.
