Bibliographic Information

Title / Author: 확률 발음 사전을 이용한 대어휘 연속 음성 인식 = Large vocabulary continuous speech recognition using stochastic pronunciation lexicon modeling / 윤성진 (Seong-Jin Yun)
Publication: Daejeon : KAIST, 1999
Online access: Restricted (full text available after login)

Holdings

Registration no.   Location / Call number                                          Status
8010289            학술문화관 (Academic Culture Complex) preservation stacks, DCS 99023   Available (not for loan)
9006270            Seoul thesis shelves, DCS 99023 c. 2                            Available (not for loan)

Abstract

To provide the most convenient means of communication for users, a computer must be able to recognize large vocabulary continuous speech. The hidden Markov model (HMM) has become the predominant approach to speech recognition, and the performance of an HMM-based recognizer depends on every component of the speech-processing chain. In this study, we present an acoustic model that improves the output probability modeling capability of HMM-based acoustic modeling, and a lexicon model that effectively represents variations in word pronunciation. To evaluate recognition performance, the proposed methods were tested on a large vocabulary Korean continuous speech recognition system with a vocabulary of 3,064 words.

First, we study a method for estimating robust output probability distributions from only a small amount of training data. In the HMM framework, the maximum likelihood estimates of the parameters converge to the true values as the amount of training data tends to infinity; when training data are limited, some parameters are inadequately trained, and classification based on such poorly trained models leads to serious errors. The proposed HMVQM improves output probability modeling by giving each state its own VQ codebook, so that each state's codebook covers the partition of acoustic space that the state represents. This approach both reduces the size of the model and improves its accuracy. HMVQM was compared with discrete-HMM-based continuous speech recognition in speaker-independent mode; the experimental results indicate that the proposed method reduced word errors by 57.9% and sentence errors by 60.6%.

Second, we study a method for deriving a stochastic representation of a word baseform from sample utterances. Most large vocabulary speech recognizers employ subwords as basic recognition units.
This implies that, to recognize words (or sentences), the recognizer must be given a lexicon that defines how each word is composed from the basic units. In general, the lexicon is created using expert knowledge or a standard pronunciation dictionary. These approaches have problems: for example, speakers' pronunciation variations across many different dialects must be represented by one or a few lexical entries, and the standard pronunciation of a word often does not correspond to its actual realization, especially in continuous speech. We describe a stochastic lexicon model that allows for pronunciation variation in speech recognition. In this model, the baseform of each word is represented by a hidden Markov model with probability distributions over subword units, and the lexicon is trained automatically from sample sentence utterances. The stochastic baseforms are further optimized jointly with the subword models and the recognizer. The proposed stochastic lexicon was compared with a conventional lexicon holding a single baseform per word; in these experiments, the stochastic lexicon reduced word errors by 53.6% and sentence errors by 32.9%.
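The state-dependent VQ codebook idea behind HMVQM can be illustrated with a minimal sketch. This is not the thesis's actual formulation; the codebook, emission values, and function names below are invented for illustration. Each state keeps its own small codebook over its region of acoustic space, instead of one global codebook shared by all states, and the discrete emission probability of an observation is looked up via that state's nearest codeword.

```python
import numpy as np

def nearest_code(x, codebook):
    """Index of the codeword closest to observation x (Euclidean distance)."""
    dists = np.linalg.norm(codebook - x, axis=1)
    return int(np.argmin(dists))

def state_output_prob(x, state):
    """Discrete emission probability of x under one state's own codebook."""
    k = nearest_code(x, state["codebook"])
    return state["emission"][k]

# Toy state: a 3-codeword codebook in a 2-D acoustic space (values invented).
state = {
    "codebook": np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]),
    "emission": np.array([0.6, 0.3, 0.1]),  # sums to 1 over the codewords
}

x = np.array([0.9, 1.2])            # closest to codeword 1
print(state_output_prob(x, state))  # → 0.3
```

Because each state quantizes only its own partition of the acoustic space, a small per-state codebook can be as discriminative as a much larger shared one, which is consistent with the abstract's claim of a smaller yet more accurate model.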
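The stochastic lexicon can likewise be sketched in miniature, again with invented names and numbers rather than the thesis's trained values. A word's baseform is a chain of positions, each carrying a probability distribution over subword units rather than a single fixed phone, so alternative pronunciations receive graded probabilities instead of being forced into one entry.

```python
# Toy stochastic lexicon entry: each position holds a distribution over
# subword units (units and probabilities invented for illustration).
baseform = [
    {"k": 1.0},
    {"ae": 0.7, "eh": 0.3},  # pronunciation variation at position 2
    {"t": 0.8, "d": 0.2},    # pronunciation variation at position 3
]

def pronunciation_prob(units, baseform):
    """Probability the lexicon entry assigns to one realized unit sequence."""
    if len(units) != len(baseform):
        return 0.0
    p = 1.0
    for u, dist in zip(units, baseform):
        p *= dist.get(u, 0.0)
    return p

print(pronunciation_prob(["k", "ae", "t"], baseform))  # ≈ 0.56
print(pronunciation_prob(["k", "eh", "d"], baseform))  # ≈ 0.06
```

The full model in the abstract is richer than this sketch: the baseform is an HMM, so positions may be skipped or repeated, and the distributions are trained from sample utterances rather than fixed by hand.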

Additional Bibliographic Information

Call number: DCS 99023
Physical description: vi, 148 p. : illustrations ; 26 cm
Language: Korean
General note: Includes appendix
Author (romanized): Seong-Jin Yun
Advisor: 오영환 (Yung-Hwan Oh)
Degree: Thesis (Ph.D.) - KAIST, Department of Computer Science
Bibliographic note: References, p. 109-113