Monitoring dengue using Twitter and deep learning techniques: Its correlation with Department of Health data using infoveillance supply-based methods

Date of Publication

2017

Document Type

Master's Thesis

Degree Name

Master of Science in Computer Science

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Charibeth K. Cheng

Defense Panel Chair

Ethel C. Ong

Defense Panel Member

Francis Joseph Campena
Rafael A. Cabredo

Abstract/Summary

According to the World Health Organization, dengue has become a major concern in tropical countries such as the Philippines. However, the core health surveillance system in the Philippines relies mostly on weekly reports from traditional sources such as health stations and clinics, which may delay emergency response. The goal of this research is therefore to develop a system that extracts and monitors dengue-related activity using data from microblogs, specifically Twitter. Sharing information through social media and microblogs has become extremely common, producing a large volume of unstructured data that can be analyzed for infoveillance, the use of electronic information from media such as the internet to track diseases. Recent studies, however, use only traditional classification methods and shallow features from tweets. This study used both semantic and shallow features found in tweets and incorporated them in classifying and analyzing large groups of dengue-related messages through rule-based and deep learning techniques. It used an annotated corpus of over 5,000 tweets to train the model and over 30 million tweets for the data correlation tests. The final classification model is an artificial neural network with Gated Recurrent Units, which achieved an accuracy of 94.2772% and a Hamming loss of 5.7228% against an annotated corpus of tweets. The classification results were then processed to compute a dengue tweet index, calculated as the frequency of the union of tweets about absence and tweets about mosquitoes. This dengue tweet index achieved a Pearson correlation of 96.0994% (r ≈ 0.961) with the Department of Health's (DOH) weekly total dengue morbidity case count for the Philippines.
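
The index and correlation steps described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the label strings ("absence", "mosquito", "other"), the per-week grouping, and all numbers are assumptions made up for the example, and it assumes each tweet has already been classified with a set of labels.

```python
from collections import Counter
from math import sqrt


def weekly_dengue_index(tweets, index_labels=("absence", "mosquito")):
    """Count, per week, the tweets carrying at least one index label.

    `tweets` is an iterable of (week, labels) pairs. A tweet tagged with
    both labels is counted once, which is what a union implies.
    """
    wanted = set(index_labels)
    counts = Counter()
    for week, labels in tweets:
        if wanted & set(labels):
            counts[week] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up classified tweets: (week number, labels assigned by a classifier).
tweets = [
    (1, {"absence"}),
    (1, {"absence", "mosquito"}),  # counted once despite two labels
    (1, {"other"}),
    (2, {"mosquito"}),
    (2, {"absence"}),
    (2, {"mosquito"}),
    (3, {"other"}),
    (3, {"mosquito"}),
]

# Made-up weekly case counts standing in for official morbidity data.
cases = {1: 20, 2: 30, 3: 10}

index = weekly_dengue_index(tweets)
weeks = sorted(cases)
r = pearson([index[w] for w in weeks], [cases[w] for w in weeks])
```

The toy data is deliberately proportional, so `r` comes out as a perfect correlation; real weekly counts would not behave this way.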

Abstract Format

html

Language

English

Format

Electronic

Accession Number

CDTG007729

Shelf Location

Archives, The Learning Commons, 12F Henry Sy Sr. Hall

Physical Description

1 computer disc ; 4 3/4 in.

Keywords

Health surveys; Health surveys--Statistical methods; Dengue; Microblogs

