Modeling personality traits of Filipino Twitter users based on linguistic markers

Date of Publication

2017

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Charibeth K. Cheng

Defense Panel Chair

Remedios De dios Bulos
Rafael A. Cabredo

Defense Panel Member

Courtney Anne M. Ngo
Merlin Teodosia C. Suarez

Abstract/Summary

Multiple studies have correlated a person's writing style with their personality traits. With the power of machine learning, this eventually led to the rise of computational text-based personality trait recognition. The field is constantly growing: it started by analyzing personal essays and is currently exploring the enormous amount of data available from social networking sites such as Facebook and Twitter. Current studies have shifted from analyzing English to analyzing non-English languages; however, the field still lacks in three areas: (1) analysis of the Filipino language, (2) analysis of the word choice of Filipinos, or of a group of individuals, and (3) analysis of the output of feature reduction techniques. This research addressed each of these concerns by collecting and processing the Tweets of 288 Filipino Twitter users. A language-independent approach was implemented to handle the multiple languages that individuals could speak. Computational models were then created for each of the personality traits of the Five Factor Model. Findings show that Conscientiousness is the easiest trait to model (F1 = 0.8251; = 0.6499), while Openness is the hardest (F1 = 0.6194; = 0.2414). Analysis also showed that 1-grams are sufficient to model all of the Big Five traits except Extraversion, whose model used 1-, 2-, and 3-grams. This research also analyzed the feature-reduced datasets used by each trait's top-performing models to identify the composition of the feature set. Findings show that 11 LIWC2015 categories are common to all of the Big Five, such as Active Processes, Positive Emotion, and Informal Language.

Abstract Format

html

Language

English

Format

Electronic

Accession Number

CDTG007241

Shelf Location

Archives, The Learning Commons, 12F Henry Sy Sr. Hall

Physical Description

1 computer disc ; 4 3/4 inches

Keywords

Natural language processing (Computer science); Information filtering systems; Machine learning; Personality; Online social networks

This document is currently not available here.
