Vox pop: Automated opinion detection and classification with data clustering

Date of Publication

2010

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Computer Science Major in Instructional Systems Technology

College

College of Computer Studies

Department/Unit

Information Technology

Thesis Adviser

Charibeth Cheng

Defense Panel Chair

Ethel Ong

Defense Panel Member

Rachel Edita Roxas
Allan Borra

Abstract/Summary

A large volume of opinions, such as those found in blogs, forums and product reviews, is uploaded daily as internet technology progresses. However, these data bring more inconvenience than benefit because they lack organization; they are also difficult to find and underutilized. With Natural Language Processing, it is possible to organize these data so that they can aid decision or policy making. This paper focuses on the development of a system that uses text processing techniques to organize the sentiments of public commentaries.

Current systems are able to differentiate facts from opinions, as well as classify these opinions based on their polarity. Clustering has also been done based on the words used. The system Vox Pop performs these three functions, namely opinion detection, polarity classification and clustering, using a rule-based approach. Opinions are classified by computing polarity from scores produced by SentiWordNet. Commentaries are clustered by computing the Euclidean distance of each word. SentiWordNet, MontyTagger and the K-Means clustering algorithm are some of the resources and tools used by the system. Expert and non-expert evaluations were done in order to test the system. The detection, classification and clustering modules have accuracy rates of 50.5% and 53.85% respectively.
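
The polarity and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the `TOY_LEXICON` scores are hypothetical stand-ins for SentiWordNet entries, the word-count vectors are an assumed representation, and starting the centroids at the first k commentaries is an arbitrary choice made only to keep the example deterministic.

```python
import math
from collections import Counter

# Hypothetical (positive, negative) scores standing in for SentiWordNet entries.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "helpful": (0.5, 0.0),
    "bad": (0.0, 0.625),
    "unfair": (0.0, 0.5),
}


def polarity(text):
    """Sum positive minus negative scores over known words; the sign gives the label."""
    score = 0.0
    for word in text.lower().split():
        pos, neg = TOY_LEXICON.get(word, (0.0, 0.0))
        score += pos - neg
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def vectorize(texts):
    """Turn each text into a word-count vector over the shared vocabulary (assumed representation)."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=10):
    """Plain K-means; centroids start at the first k vectors (an arbitrary choice)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


comments = [
    "the tax policy is good and helpful",
    "the new road is bad",
    "the tax policy is unfair",
    "the new road is good",
]
labels = kmeans(vectorize(comments), k=2)
polarities = [polarity(c) for c in comments]
```

On these four invented commentaries, the sketch groups the two tax comments together and the two road comments together, while the polarity labels depend only on the toy lexicon scores.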

Abstract Format

html

Language

English

Format

Print

Accession Number

TU18472

Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

vii, 109, 13, 31 leaves : illustrations (some colored) ; 28 cm.

Keywords

Cluster analysis; Cluster analysis--Data processing.

This document is currently not available here.
