Korea Advanced Institute of Science and Technology (KAIST) Library

Main bibliographic information
Improvement of speaker identification systems using candidate selection and likelihood ratio normalisation = 후보선정과 우도비 정규화를 이용한 화자식별 시스템의 성능향상
Title / Author	Improvement of speaker identification systems using candidate selection and likelihood ratio normalisation = 후보선정과 우도비 정규화를 이용한 화자식별 시스템의 성능향상 / Ji-Hwan Kim.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 1998].

Holdings information

Registration number	Location / Call number	Item status
8008905	Academic Cultural Complex, preservation stacks / MCS 98012	Available (not for loan)
9004646	Seoul, thesis shelves / MCS 98012 c. 2	Available (not for loan)

Abstract

Speaker identification selects, from among the enrolled speakers, the speaker who best matches the input speech. It is mainly used in telephone services, since it requires only speech as input. In real environments, correct speaker identification is difficult for two main reasons. First, the number of enrolled speakers is large, so the subspace represented by one speaker model can overlap with the subspaces of other speaker models. Second, mismatches occur between speaker models and input speech because of insufficient training data, differences between training and testing environments, and the effects of noise. Normalisation and scoring methods that reduce these mismatches are therefore needed.

To address the overlap of speaker subspaces, this thesis proposes a confidence measure based on significance testing, which is used to select candidates for the identification result. If the confidence value obtained from the input is greater than a predefined threshold, the identification system accepts the identification result. If the confidence value is less than the threshold for the client set, the system rejects the identification result and selects the proper candidates. To address mismatches between speaker models and input speech, this thesis also proposes a scoring method that, after candidate selection, eliminates the frames in which the selected candidates have a lower average rank. As a result, the normalised score of every speaker is calculated over the same selected frames.

To verify that the proposed confidence measure accepts and rejects correctly, the identification rate over all inputs is compared with that over the inputs exceeding the predefined confidence level. Inputs exceeding the predefined confidence level (0.95) show identification rates that are, on average, 28.71 percent higher than those of all inputs. To verify the candidate selection method, the identification rate over all inputs is compared with the probability that the input speaker is among the candidates. That probability is, on average, 10.44 percent higher than the identification rate over all inputs. The proposed scoring and normalisation method with candidates was also compared with other scoring and normalisation methods. It shows, on average, a 2.78 percent higher identification rate than the conventional method when many client speakers and little training data are used.
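
The accept/reject decision, candidate selection, and frame elimination described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything specific here is an assumption made for illustration: the confidence measure is stood in for by a one-sided paired z-test on frame-wise log-likelihood differences, "lower average rank" is read as "frames where the candidates rank worst on average among all speakers", and the final score is a plain average log-likelihood rather than the thesis's likelihood ratio normalisation. The function names (`confidence_over`, `select_candidates`, `pruned_scores`) and the `keep_fraction` parameter are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_over(best, other):
    """Assumed confidence that `best` truly outscores `other`.

    One-sided paired z-test on frame-wise log-likelihood differences.
    Illustrative stand-in only; needs at least two frames.
    """
    diffs = [b - o for b, o in zip(best, other)]
    centre, spread = mean(diffs), stdev(diffs)
    if spread == 0.0:
        return 1.0 if centre > 0 else (0.0 if centre < 0 else 0.5)
    return normal_cdf(centre / (spread / math.sqrt(len(diffs))))


def select_candidates(frame_ll, threshold=0.95):
    """Return (accepted, candidates).

    frame_ll maps speaker id -> list of per-frame log-likelihoods.
    """
    totals = {s: sum(ll) for s, ll in frame_ll.items()}
    ranked = sorted(totals, key=totals.get, reverse=True)
    best = ranked[0]
    conf = {s: confidence_over(frame_ll[best], frame_ll[s]) for s in ranked[1:]}
    # Accept the top speaker only if it beats every rival confidently enough.
    if all(c >= threshold for c in conf.values()):
        return True, [best]
    # Otherwise keep the top speaker plus every rival it cannot be separated from.
    return False, [best] + [s for s in ranked[1:] if conf[s] < threshold]


def pruned_scores(frame_ll, candidates, keep_fraction=0.8):
    """Score all candidates on one shared subset of frames.

    Each frame is ranked across all speakers (rank 1 = highest likelihood);
    frames where the candidates rank worst on average are dropped.
    Returns (average log-likelihood per candidate, kept frame indices).
    """
    speakers = list(frame_ll)
    n_frames = len(frame_ll[speakers[0]])
    avg_rank = []
    for t in range(n_frames):
        order = sorted(speakers, key=lambda s: frame_ll[s][t], reverse=True)
        rank = {s: i + 1 for i, s in enumerate(order)}
        avg_rank.append(mean(rank[c] for c in candidates))
    n_keep = max(1, round(keep_fraction * n_frames))
    kept = sorted(sorted(range(n_frames), key=lambda t: avg_rank[t])[:n_keep])
    scores = {c: sum(frame_ll[c][t] for t in kept) / len(kept) for c in candidates}
    return scores, kept
```

Because every candidate is scored on the same kept frames, a single frame on which an unrelated speaker model happens to score highest no longer decides between the remaining candidates.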

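Other bibliographic information

Call number	MCS 98012
Physical description	v, [58] p. : illustrations ; 26 cm
Language	English
General note	Author's name in Korean: 김지환. Advisor's name in English: Yung-Hwan Oh. Advisor's name in Korean: 오영환.
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : Department of Computer Science,
Bibliography note	Includes references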
