Filipino text-to-speech system: Tagapagsalita
Date of Publication
2006
Document Type
Bachelor's Thesis
Degree Name
Bachelor of Science in Computer Science
College
College of Computer Studies
Department/Unit
Computer Science
Thesis Adviser
Joel P. Ilao
Defense Panel Member
Clement Y. Ong
Jocelynn O. Wong-Cu
Russel Lloyd C. Lim
Abstract/Summary
Although computers can be made to speak like humans, their output is likely to sound artificial or synthetic. Such a task is normally performed by a Text-to-Speech (TTS) system. Few studies have been conducted to implement TTS systems in Tagalog. In this research, a TTS system specifically designed for the Tagalog number words Isa to Isandaan was developed. This TTS system works in three major stages. Diphones present in the words Isa to Isandaan were first recorded, cut and denoised using a third-party program specialising in audio processing. The pre-processed signals were then compressed using Linear Predictive Coding (LPC): the signals were passed to a reversible filter which extracts the LPC coefficients, per-frame gains and excitation. Finally, these parameters were used to reverse the process and produce a synthetic version of the original diphones. Through the use of the Synchronous Overlap-Add (SOLA) technique, the reconstructed diphones were concatenated into whole words. In line with its purpose, the system was evaluated on intelligibility. Thirty-one persons were requested to rate listening effort, syllabication, stress, articulation and speed, with a score of 1 being the lowest and 5 being the highest. The mean opinion scores (MOS) of 30 persons averaged 4.30 for listening effort, 4.27 for syllabication, 4.16 for stress, 4.18 for articulation and 4.07 for speed in all significant words for male, and 4.25 for listening effort, 4.29 for syllabication, 4.16 for stress, 4.18 for articulation and 4.14 for speed in all significant words for female. Discrepancies in speech intelligibility and quality are attributed largely to the preprocessing phase of the speech signal and to the subjective perception of the respondent listener, which depends on prosodic parameters such as pitch, duration and amplitude, as seen in the MOS results for the synthesized Tagalog word Isandaan.
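The concatenation step described in the abstract can be illustrated with a simplified SOLA-style join. This is only a sketch under stated assumptions, not the thesis implementation: the function names `normalized_xcorr` and `sola_join`, the linear cross-fade, and the `overlap` and `search` parameters are all hypothetical choices made for illustration. The idea is to slide the start of the second segment against the tail of the first, pick the shift with the highest normalized cross-correlation, and cross-fade over the overlap.

```python
import math


def normalized_xcorr(x, y):
    """Normalized cross-correlation of two equal-length sample lists."""
    num = sum(p * q for p, q in zip(x, y))
    den = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    return num / den if den > 0.0 else 0.0


def sola_join(left, right, overlap, search):
    """Join two segments SOLA-style (hypothetical helper).

    The last `overlap` samples of `left` are compared with `right`
    shifted by 0..search samples. The best-matching shift is used,
    and the overlapping region is linearly cross-faded.
    Returns (joined_samples, chosen_shift).
    """
    if overlap <= 0 or search < 0:
        raise ValueError("overlap must be positive and search non-negative")
    if len(left) < overlap or len(right) < overlap + search:
        raise ValueError("segments too short for the requested overlap/search")

    tail = left[-overlap:]
    best_k, best_score = 0, -2.0
    for k in range(search + 1):
        score = normalized_xcorr(tail, right[k:k + overlap])
        if score > best_score:
            best_k, best_score = k, score

    head = right[best_k:best_k + overlap]
    # Linear cross-fade: weight moves from the left segment to the right one.
    blended = [(1.0 - i / overlap) * t + (i / overlap) * h
               for i, (t, h) in enumerate(zip(tail, head))]
    return left[:-overlap] + blended + right[best_k + overlap:], best_k
```

In this sketch the samples of the second segment that precede the chosen shift are discarded; a full SOLA implementation would typically work frame by frame and could also be used to change duration.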
The Linear Predictive Coding technique is a useful tool for compression, since it can extract the information needed for speech synthesis without affecting the intelligibility of the speech.
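The LPC analysis and resynthesis described in the abstract can be sketched for a single frame as follows. This is a minimal illustration, assuming the autocorrelation method with the Levinson-Durbin recursion, a hypothetical prediction order, and a gain defined as the RMS of the residual; the abstract does not state which variant the thesis used, and the function names are invented for this example. The analysis filter yields coefficients, a per-frame gain and an excitation signal, and the synthesis filter reverses the process.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_analyze(frame, order):
    """Return (coefficients, gain, unit-RMS excitation) for one frame."""
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    # Inverse (analysis) filter: e[n] = sum_k a[k] * s[n - k]
    residual = [sum(a[k] * frame[n - k] for k in range(len(a)) if n - k >= 0)
                for n in range(len(frame))]
    gain = math.sqrt(sum(e * e for e in residual) / len(residual))
    excitation = [e / gain for e in residual] if gain > 0.0 else residual
    return a, gain, excitation


def lpc_synthesize(a, gain, excitation):
    """Synthesis filter: s[n] = gain * e[n] - sum_{k>=1} a[k] * s[n - k]."""
    out = []
    for n, e in enumerate(excitation):
        past = sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(gain * e - past)
    return out


if __name__ == "__main__":
    frame = [math.sin(2 * math.pi * 0.05 * n) for n in range(160)]
    a, gain, exc = lpc_analyze(frame, 4)
    rebuilt = lpc_synthesize(a, gain, exc)
    print(max(abs(x - y) for x, y in zip(frame, rebuilt)))
```

With the excitation kept unchanged, the synthesis filter reproduces the frame up to floating-point error. Compression comes from storing the excitation in a coarser or parametric form, which this sketch does not attempt.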
Abstract Format
html
Language
English
Format
Accession Number
TU13496
Shelf Location
Archives, The Learning Commons, 12F, Henry Sy Sr. Hall
Physical Description
1 v. (various foliations) : ill. ; 28 cm.
Keywords
Filipino language; Technological innovations
Recommended Citation
Aralar, K. A., Coloso, P. H., & Moneda, J. R. (2006). Filipino text-to-speech system: Tagapagsalita. Retrieved from https://animorepository.dlsu.edu.ph/etd_bachelors/7656