Filipino text-to-speech system: Tagapagsalita

Date of Publication

2006

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Computer Science

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Joel P. Ilao

Defense Panel Member

Clement Y. Ong

Jocelynn O. Wong-Cu

Russel Lloyd C. Lim

Abstract/Summary

Although computers can be made to speak like humans, the resulting speech is likely to sound artificial or synthetic. Such a task is normally performed by a Text-to-Speech (TTS) system. Few studies have been conducted to implement TTS systems in Tagalog. In this research, a TTS system specifically designed for the Tagalog number words Isa to Isandaan was developed. This TTS system works in three major stages. First, the diphones present in the words Isa to Isandaan were recorded, cut and denoised using a third-party program specialising in audio processing. Second, the pre-processed signals were compressed using Linear Predictive Coding (LPC): the signals were passed to a reversible filter which extracts the LPC coefficients, per-frame gains and excitation. Finally, these parameters were taken and the filter reversed to produce a synthetic version of the original diphones. Through the use of the Synchronous Overlap-Add (SOLA) technique, the reconstructed diphones were concatenated into whole words. In keeping with its purpose, the system was tested for intelligibility. Thirty-one persons were requested to rate listening effort, syllabication, stress, articulation and speed, with a score of 1 being the lowest and 5 being the highest. The mean opinion score (MOS) of 30 persons averaged 4.30 for listening effort, 4.27 for syllabication, 4.16 for stress, 4.18 for articulation and 4.07 for speed in all significant words for male, and 4.25 for listening effort, 4.29 for syllabication, 4.16 for stress, 4.18 for articulation and 4.14 for speed in all significant words for female. Discrepancies in speech intelligibility and quality are largely attributed to the preprocessing phase of the speech signal and to the respondent listener's subjective perception of prosodic parameters such as pitch, duration and amplitude, as seen from the MOS results for the synthetically uttered Tagalog word Isandaan.
The Linear Predictive Coding technique is a useful tool for compression, since it can extract the information needed to synthesize speech without affecting the intelligibility of the speech.
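
The LPC analysis and synthesis stages described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the autocorrelation method with the Levinson-Durbin recursion, the predictor order of 10, the 240-sample test frame and all function names are assumptions chosen for illustration. The excitation is kept here as the full residual so that the round trip is exact; how the thesis encoded the excitation is not stated in this record.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, error) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this step
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_analyze(frame, order):
    """Inverse-filter one frame: returns (coefficients, gain, excitation)."""
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    residual = []
    for n in range(len(frame)):
        past = sum(a[k] * frame[n - k]
                   for k in range(1, order + 1) if n - k >= 0)
        residual.append(frame[n] + past)
    # Per-frame gain: RMS of the residual (falls back to 1.0 for silence).
    gain = math.sqrt(sum(e * e for e in residual) / len(residual)) or 1.0
    return a, gain, [e / gain for e in residual]


def lpc_synthesize(a, gain, excitation):
    """Run the filter in reverse: rebuild the frame from its parameters."""
    order = len(a) - 1
    out = []
    for n, e in enumerate(excitation):
        feedback = sum(a[k] * out[n - k]
                       for k in range(1, order + 1) if n - k >= 0)
        out.append(gain * e - feedback)
    return out


# Hypothetical test frame: two sinusoids plus a little seeded noise.
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 0.05 * n)
         + 0.5 * math.sin(2 * math.pi * 0.12 * n + 0.3)
         + rng.uniform(-0.05, 0.05)
         for n in range(240)]

coeffs, gain, excitation = lpc_analyze(frame, order=10)
rebuilt = lpc_synthesize(coeffs, gain, excitation)
max_error = max(abs(x - y) for x, y in zip(frame, rebuilt))
print(f"max reconstruction error: {max_error:.2e}")
```

Because the synthesis filter is the exact inverse of the analysis filter, the frame is recovered up to floating-point rounding. Any actual compression would come from storing the excitation in a more compact form than the raw residual used in this sketch.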

Abstract Format

html

Language

English

Format

Print

Accession Number

TU13496

Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

1 v. (various foliations) : ill. ; 28 cm.

Keywords

Filipino language; Technological innovations
