Speech to text converter for Filipino language using hybrid artificial neural network/Hidden Markov Model

Date of Publication

2007

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Electronics and Communications Engineering

College

Gokongwei College of Engineering

Department/Unit

Electronics and Communications Engineering

Thesis Adviser

Edwin Sybingco

Enrique M. Manzano

Elmer P. Dadios

Roberto T. Caguingin

Abstract/Summary

Filipino is at once a simple and a complex language: its semantics and grammar are relatively easy for a person to learn, but a machine or computer needs a moderately complex system to recognize it. This thesis project stems from the need to make interaction with computers and other machines more convenient for people with disabilities. Given the rapid development of speech recognition systems, the project is a significant step for the Filipino language industry. The thesis aims to build a speech recognition system that uses speech processing techniques to recognize certain words spoken in Filipino. The group first employs feature extraction as the front-end process of the speech recognition system, then experiments with different algorithms for optimization, training the neural networks on the samples with either a feed-forward backpropagation algorithm or self-organizing map (SOM) networks. The samples, obtained from the UP Speech Corpus, are segmented by phoneme. In the actual system, input WAV files pass through the speech processing module, which converts the inputs into frames; the frames are fed to the word segmentation module and then to a feature extraction module. The feature extraction module computes feature vectors that serve as inputs to the neural network. After the networks are trained to a specified target, their outputs are used as inputs to the probabilistic Hidden Markov Model (HMM), which predicts the most probable sequence of outputs, in this case the phoneme sequence. A decoder then translates the phoneme sequence into a sequence of letters forming a word, which is looked up in a table to find the most likely match for the recognized word.
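The HMM and decoder stages described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the network emits one probability per phoneme for each frame, uses those values directly as emission scores (a fuller hybrid system would typically divide by phoneme priors), and finds the most probable phoneme sequence with the standard Viterbi algorithm. The function names, the three-phoneme inventory, the transition values, and the example word are all hypothetical.

```python
import math

NEG_INF = float("-inf")

def viterbi(posteriors, log_trans, log_init):
    """Return the most probable state (phoneme) index for each frame.

    posteriors: per-frame lists of per-state probabilities (assumed network outputs).
    log_trans[p][s]: log-probability of moving from state p to state s.
    log_init[s]: log-probability of starting in state s.
    """
    n_states = len(log_init)
    floor = 1e-12  # avoids log(0) when the network outputs an exact zero
    score = [log_init[s] + math.log(max(posteriors[0][s], floor))
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in posteriors[1:]:
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + math.log(max(frame[s], floor)))
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def collapse(path, phonemes):
    """Merge consecutive frames with the same phoneme into one symbol."""
    out = []
    for s in path:
        if not out or out[-1] != phonemes[s]:
            out.append(phonemes[s])
    return out

# Hypothetical example: three phonemes, left-to-right transitions, five frames.
phonemes = ["a", "s", "o"]
log_init = [0.0, NEG_INF, NEG_INF]
log_trans = [
    [math.log(0.6), math.log(0.4), NEG_INF],
    [NEG_INF, math.log(0.6), math.log(0.4)],
    [NEG_INF, NEG_INF, 0.0],
]
posteriors = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
]
lexicon = {"aso", "asa"}  # stand-in for the lookup table

path = viterbi(posteriors, log_trans, log_init)
word = "".join(collapse(path, phonemes))
matched = word in lexicon
```

Working in log-probabilities keeps the scores numerically stable over long utterances. The exact-match lookup is a simplification, since the abstract describes a search for the most likely match.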

Abstract Format

html

Language

English

Format

Print

Accession Number

TU13959

Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

1 v. (various foliations) : col. ill. ; 28 cm.

Keywords

Automatic speech recognition; Speech processing systems--Computer programs

This document is currently not available here.
