This thesis describes an unvoiced Korean phoneme recognition algorithm intended as a pre-processing stage for continuous Korean speech recognition. The proposed algorithm consists of two processes: voiced/unvoiced/silence (V/UV/S) detection, and phoneme segmentation and labeling.
The input speech signal is first segmented into V/UV/S intervals using a pattern matching method based on statistical decision theory. The unvoiced intervals extracted from this segmentation are then labeled to produce unvoiced phoneme sequences. For phoneme-level segmentation and labeling, we use vector quantization (VQ) and hidden Markov modeling (HMM).
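The VQ and HMM labeling step can be illustrated with a minimal sketch. It assumes a Euclidean nearest-codeword rule and one discrete-observation HMM per phoneme scored with the forward algorithm. The feature type, distance measure, and model topology are not stated here, so these choices and all names below (`quantize`, `log_forward`, `top_candidates`) are hypothetical rather than the thesis's implementation.

```python
import numpy as np


def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword.

    frames:   (n_frames, dim) array of feature vectors
    codebook: (n_codewords, dim) array, e.g. 128 codewords
    Euclidean distance is an assumption made for this sketch.
    """
    dist = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dist.argmin(axis=1)


def log_forward(symbols, log_pi, log_A, log_B):
    """Log-likelihood of a VQ symbol sequence under one discrete HMM.

    log_pi: (n_states,) log initial-state probabilities
    log_A:  (n_states, n_states) log transition probabilities
    log_B:  (n_states, n_codewords) log emission probabilities
    """
    alpha = log_pi + log_B[:, symbols[0]]
    for s in symbols[1:]:
        # Sum over previous states in the log domain, then emit symbol s.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, s]
    return np.logaddexp.reduce(alpha)


def top_candidates(symbols, models, n=2):
    """Return the n phoneme labels whose HMMs score the sequence highest.

    models: dict mapping a label to a (log_pi, log_A, log_B) tuple
    """
    scores = {label: log_forward(symbols, *m) for label, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

In this sketch an unvoiced segment would be passed through `quantize` and then `top_candidates` with `n=2`, so that a segment counts as correctly recognized when the true phoneme is among the two best-scoring models.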
Computer simulations were carried out to evaluate the performance of the proposed unvoiced Korean phoneme recognition algorithm using a 200-word vocabulary consisting of names of universities, hospitals, and public offices in Seoul. The words were spoken by 6 male speakers under ordinary ambient conditions.
Simulation results show that the unvoiced interval detection error rate in the V/UV/S detection process is about 2.4% when the unvoiced phoneme occurs in the first syllable of a spoken word, and about 30% otherwise. VQ and HMM are then applied to the detected unvoiced speech segments to recognize the unvoiced phonemes they contain. With 4 HMM states and 128 VQ codewords, and counting the top 2 candidates, an unvoiced phoneme recognition accuracy of about 73.4% is obtained. When the transition region between the unvoiced and voiced intervals is included in the unvoiced interval, the recognition accuracy rises to about 82% under the same conditions.