Building a subjectivity lexicon for Filipino

Date of Publication

5-3-2013

Document Type

Master's Thesis

Degree Name

Master in Computer Science

Subject Categories

Computer Sciences

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Charibeth K. Cheng

Defense Panel Chair

Allan B. Borra

Defense Panel Member

Ethel C. Ong
Charibeth K. Cheng

Abstract/Summary

Textual information in any domain can be categorized into two types: facts (objective information) and opinions (subjective information). Facts contain objective information about an entity and its attributes and properties. Opinions, on the other hand, are subjective information describing people's emotions toward an entity and its attributes and properties. Although existing lexicons can assist sentiment analysis, these lexicons were not built specifically to support analysis of the Filipino language. This research therefore aimed to address the problem of limited resources for sentiment analysis in Filipino by building a subjectivity lexicon extracted from Filipino opinion articles. A machine-learning-based approach to lexicon expansion was adapted in this research. Classifiers were built in Weka using three machine learning algorithms, namely C4.5 (decision tree), Naïve Bayes, and k-Nearest Neighbor, and were evaluated using 10-fold cross-validation. Results show that k-Nearest Neighbor gave the best result, with 92.20% and 99.42% for window 5 and window 3, respectively.
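
The evaluation procedure named in the abstract (a k-Nearest Neighbor classifier scored by 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the thesis's implementation, which used Weka: the function names, the toy two-dimensional feature vectors, and the choice of k = 3 are all assumptions made for illustration, and the abstract does not state which features the windows of size 3 and 5 produce.

```python
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training items nearest to `query`."""
    # Rank training items by squared Euclidean distance to the query vector.
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(data, folds=10, k=3, seed=0):
    """Fraction of items classified correctly when each is held out once."""
    items = data[:]
    random.Random(seed).shuffle(items)
    correct = 0
    for i in range(folds):
        # Every folds-th item, starting at offset i, forms the held-out fold.
        held_out = items[i::folds]
        train = [x for j, x in enumerate(items) if j % folds != i]
        correct += sum(
            knn_predict(train, feats, k) == label for feats, label in held_out
        )
    return correct / len(items)


# Hypothetical toy data: (feature vector, label). Real features would be
# derived from the context window around each candidate word.
data = [((i * 0.01, 0.0), "objective") for i in range(20)]
data += [((5 + i * 0.01, 1.0), "subjective") for i in range(20)]

print(cross_validate(data))
```

Because every item is held out exactly once, the returned score pools predictions across all ten folds, so no item is ever scored by a model that saw it during training.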

Abstract Format

html

Language

English

Format

Electronic

Accession Number

CDTG005357

Shelf Location

Archives, The Learning Commons, 12F Henry Sy, Sr. Hall

Keywords

Sentiment analysis; Subjectivity (Linguistics); Filipino language—Semantics

Embargo Period

7-3-2023

This document is currently not available here.
