WordFlag: Flagging inappropriate words in recorded speech through word spotting
Date of Publication
2009
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
Subject Categories
Computer Sciences
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Jocelyn Cu
Defense Panel Member
Rafael Cabredo
Karlo Campos
Abstract/Summary
Speech analytics is one of the most important methods that the call center and telecommunications industries use to analyze call content and improve customer satisfaction and overall business performance. One of the technologies used in speech analytics is word spotting, the identification of specific words in speech.
Current speech analytics systems usually focus on solutions for business analysis and product-related issues based on customer speech or feedback. Most Automatic Speech Recognition (ASR) systems that use word spotting have specific methods for disregarding the speaker's other words, which are considered insignificant in analyzing calls. However, given the large number of agents that companies need to hire and the tight competition between the Philippines and other countries in the worldwide contact or call center industry, there is a need to support agent training, including understanding how unnecessary and inappropriate words affect customer-agent interaction.
This research focuses on the design and development of an isolated word spotting system that automatically flags inappropriate words in a speech recording to aid the call center agent training process. WordFlag is trained to flag 65 words from a predefined list and, because it is designed to be speaker-independent, uses recordings from different speakers. The system incorporates preprocessing through noise reduction, modified isolated word endpoint detection and segmentation, MFCC feature extraction, modified Hidden Markov Models, and word-based recognition. System tests of WordFlag show an overall average recognition rate of 41.25%. Words trained with additional semantic variations showed a higher recognition rate (48.3%) than words trained without variations (31.2%). Future projects may improve the corpus data and apply phoneme-based recognition to compare the performance of similar systems.
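The abstract does not specify how the thesis modifies isolated word endpoint detection, so the following is only a generic illustration of the idea: a minimal short-time-energy sketch in Python that marks where a word starts and ends in a recording. The function names, frame sizes, and threshold ratio are all assumptions chosen for illustration and are not taken from WordFlag.

```python
import math


def frame_energies(samples, frame_len, hop):
    """Short-time energy (mean squared amplitude) of each frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def find_word_segments(samples, frame_len=160, hop=80,
                       threshold_ratio=0.1, min_frames=3):
    """Return (start_sample, end_sample) pairs for high-energy regions.

    All defaults are illustrative assumptions: 160-sample frames are
    20 ms at an assumed 8 kHz sample rate, and the threshold is a fixed
    fraction of the loudest frame rather than a tuned noise estimate.
    """
    energies = frame_energies(samples, frame_len, hop)
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    segments = []
    run_start = None
    # A trailing 0.0 sentinel closes a run that reaches the last frame.
    for i, energy in enumerate(energies + [0.0]):
        if energy > threshold and run_start is None:
            run_start = i
        elif energy <= threshold and run_start is not None:
            if i - run_start >= min_frames:  # ignore very short bursts
                end_sample = (i - 1) * hop + frame_len
                segments.append((run_start * hop, end_sample))
            run_start = None
    return segments


def make_test_signal(silence=800, voiced=1600, freq=440.0, rate=8000):
    """Hypothetical stand-in for a recording: silence, tone, silence."""
    tone = [math.sin(2 * math.pi * freq * n / rate) for n in range(voiced)]
    return [0.0] * silence + tone + [0.0] * silence


if __name__ == "__main__":
    print(find_word_segments(make_test_signal()))
```

In a pipeline like the one described, each detected segment would then be passed on to feature extraction (MFCC) and scored against the word models; those stages are not sketched here.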
Abstract Format
html
Language
English
Accession Number
TU19880
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Physical Description
1 v. (various foliations) ; 28 cm. + one computer optical disc.
Keywords
Speech processing systems
Recommended Citation
Otic, A. N., Ramos, F. S., Torres, R. O., & Yson, K. R. (2009). WordFlag: Flagging inappropriate words in recorded speech through word spotting. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/11379