In this thesis, we first examine the performance of discrete, continuous, and semicontinuous hidden Markov models (HMMs) for Korean digit recognition, and then propose new modeling methods that improve recognition performance by exploiting the dependencies between feature vectors and among feature vector components.
A baseline system using standard HMM techniques is established by examining the feature vectors, the codebook size for the discrete HMM (DHMM), and the number of mixtures in the continuous HMM (CHMM), yielding an optimal benchmark on the Korean digit recognition task. Among the baseline systems, the one based on the semicontinuous HMM (SCHMM) outperforms both the DHMM and the CHMM.
Conventional speech recognition systems treat the different feature vectors used in speech recognition, such as mel-cepstra and delta-cepstra, as independent of one another, even though they are in fact statistically dependent. By modeling the dependency between the codebooks of these feature vectors, we improve the recognition performance of the DHMM on the multi-speaker dependent recognition task as well as on the speaker-independent one.
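The codebook dependency idea can be illustrated with a minimal sketch, which is not the thesis's actual implementation: for one HMM state, the conventional DHMM scores the two codeword streams with a product of independent probabilities, while the dependency model conditions the delta-cepstrum codeword on the mel-cepstrum codeword. The codebook sizes and the per-state distributions below are toy values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4  # mel-cepstrum codebook size (toy value)
D = 4  # delta-cepstrum codebook size (toy value)

# Per-state output distributions for a single HMM state (randomly generated toys).
p_mel = rng.dirichlet(np.ones(M))                       # P(c_mel | state)
p_delta = rng.dirichlet(np.ones(D))                     # P(c_delta | state)
p_delta_given_mel = rng.dirichlet(np.ones(D), size=M)   # P(c_delta | c_mel, state)

def b_independent(c_mel, c_delta):
    """Conventional DHMM: the two codebooks are treated as independent."""
    return p_mel[c_mel] * p_delta[c_delta]

def b_dependent(c_mel, c_delta):
    """Codebook dependency modeling: delta codeword conditioned on mel codeword."""
    return p_mel[c_mel] * p_delta_given_mel[c_mel, c_delta]

# Both models define valid joint distributions over (c_mel, c_delta):
joint_ind = sum(b_independent(i, j) for i in range(M) for j in range(D))
joint_dep = sum(b_dependent(i, j) for i in range(M) for j in range(D))
```

The dependent model needs M·D conditional parameters per state instead of M + D, which is the usual trade-off between modeling power and trainability.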
We also improve the recognition performance of the CHMM and SCHMM by reducing the dependency among the feature vector components.
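One standard way to reduce inter-component dependency, sketched below under the assumption that a linear decorrelating (Karhunen-Loeve) transform is used, is to rotate the feature vectors by the eigenvectors of their covariance matrix; after the rotation the covariance is diagonal, which better matches the diagonal-covariance Gaussian mixtures typically used in CHMM and SCHMM systems. The data here are synthetic, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "feature vectors" with strongly correlated components.
n, d = 2000, 3
A = rng.normal(size=(d, d))
X = rng.normal(size=(n, d)) @ A.T  # mixing induces correlation

# Decorrelating (Karhunen-Loeve) transform: eigendecompose the covariance.
mean = X.mean(axis=0)
cov = np.cov(X - mean, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
Y = (X - mean) @ eigvecs  # rotated, decorrelated features

# After the transform the sample covariance is (numerically) diagonal.
cov_Y = np.cov(Y, rowvar=False)
off_diag = cov_Y - np.diag(np.diag(cov_Y))
```

The diagonal of `cov_Y` equals the eigenvalues of the original covariance, so no variance is discarded; only the correlation between components is removed.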