Date of Publication

2023

Document Type

Dissertation/Thesis

Degree Name

Master of Science in Computer Science

College

College of Computer Studies

Department/Unit

Software Technology

Thesis Advisor

Ronald Pascual

Defense Panel Chair

Judith Azcarraga

Defense Panel Member

Ann Franchesca Laguna
Ronald Pascual

Abstract (English)

Although there have been previous studies on Filipino automatic speech recognition (ASR), they have primarily focused on the Hidden Markov Model (HMM) with Gaussian Mixture Model (GMM) approach. Studies on Bisaya ASR are even more limited, both in resources such as speech corpora and in previous work. As a result, there are few studies on neural network or end-to-end systems, since neural networks require massive amounts of training data. An alternative is the hybrid model, which combines neural networks with HMMs. This architecture still needs data, but not as much as an end-to-end ASR system. To address these gaps, this study uses the speech corpus of De La Salle University’s healthcare chatbot project for the Filipino and Bisaya languages. The study also collected, preprocessed, and transcribed additional Filipino speech data. With these data, the study built an HMM-GMM ASR system similar to those of previous studies as a baseline, and experimented with phoneme sets, n-grams, language model weights, HMM states, and model enhancement techniques. The best of these models for both Filipino and Bisaya used speaker adaptive training (SAT), with word error rates (WER) of 3.96% and 5.41%, respectively. The study also developed a deep neural network (DNN) HMM baseline model and time delay neural network (TDNN) HMM models with symmetric, asymmetric, and subsampled time strides. Among these, the best Filipino model is the asymmetric TDNN-HMM model with a 3.48% WER, and the best Bisaya model is the baseline DNN-HMM model with a 5.50% WER. Finally, the study ran three further experiments: 1) the effect of additional data on performance, 2) the performance of the models on actual conversational children’s speech, and 3) the performance of cross-language acoustic models.
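
The abstract reports its results as word error rates without defining the metric. As a minimal sketch, assuming the standard definition (word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words), WER could be computed as follows. The function name and the example sentences are illustrative assumptions and are not taken from the thesis or its toolchain.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: whitespace tokenization and exact string matching; real
    evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion over four reference words.
wer = word_error_rate("a b c d", "a x c")
print(f"WER = {wer:.2%}")
```

Under this definition, a 3.48% WER corresponds to roughly 3 to 4 word errors per 100 reference words.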

Language

English

Recommended Citation

Ing, J. (2023). Filipino and Bisaya ASR System Using TDNN-HMM Towards Application in a Healthcare Chatbot. Retrieved from https://animorepository.dlsu.edu.ph/etdm_softtech/9

Upload Full Text

2023_Ing_PreliminaryPages.pdf (102 kB)
2023_Ing_PageswithSignature.pdf (892 kB)
2023_Ing_Chapter1.pdf (69 kB)
2023_Ing_Chapter2.pdf (115 kB)
2023_Ing_Chapter3.pdf (235 kB)
2023_Ing_Chapter4.pdf (102 kB)
2023_Ing_Chapter5.pdf (360 kB)
2023_Ing_Chapter6.pdf (71 kB)
2023_Ing_References.pdf (84 kB)

Embargo Period

8-10-2023

Software Technology Master's Theses

Filipino and Bisaya ASR System Using TDNN-HMM Towards Application in a Healthcare Chatbot