Main bibliographic information
Improvement of speaker identification systems using candidate selection and likelihood ratio normalisation = 후보선정과 우도비 정규화를 이용한 화자식별 시스템의 성능향상
Title / Author Improvement of speaker identification systems using candidate selection and likelihood ratio normalisation = 후보선정과 우도비 정규화를 이용한 화자식별 시스템의 성능향상 / Ji-Hwan Kim.
Author Kim, Ji-Hwan ; 김지환
Publication [Daejeon : Korea Advanced Institute of Science and Technology, 1998].

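Holdings

Registration number: 8008905
Location: Academic Cultural Complex (Cultural Hall), preservation stacks
Call number: MCS 98012
Status: Available, can be borrowed

Registration number: 9004646
Location: Seoul, thesis shelves
Call number: MCS 98012 c. 2
Status: Available, can be borrowed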

Abstract

Speaker identification is the task of selecting, from among the enrolled speakers, the speaker who best matches the input speech. It is mainly used in telephone services because it requires only speech as input. In real environments, correct speaker identification is difficult for two main reasons. First, the number of enrolled speakers is large, so the subspace represented by one speaker model can overlap with the subspaces of other speaker models. Second, mismatches occur between speaker models and input speech because of insufficient training data, differences between training and testing environments, and the effects of noise. Normalisation and scoring methods that reduce these mismatches are therefore needed.

To address the overlap of speaker subspaces, this thesis proposes a confidence measure, based on significance testing, that is used to select candidates for the identification result. If the confidence value obtained for an input is greater than a predefined threshold, the identification system accepts the identification result. If the confidence value is less than the threshold for the client set, the system rejects the identification result and selects the proper candidates. To address mismatches between speaker models and input speech, the thesis also proposes a scoring method that, after candidate selection, eliminates the frames in which the selected candidates have a lower average rank. As a result, the normalised score of every speaker is calculated over the same selected frames.

To verify that the proposed confidence measure accepts and rejects correctly, the identification rate over all inputs was compared with the rate over those inputs that exceed the predefined confidence level. Inputs exceeding the predefined confidence level (0.95) show an identification rate that is on average 28.71 percent higher than that of all inputs. To verify the candidate selection method, the identification rate over all inputs was compared with the probability that the input speaker is among the candidates. This probability is on average 10.44 percent higher than the identification rate over all inputs. Finally, the proposed scoring and normalisation method with candidates was compared with other scoring and normalisation methods. The proposed method shows an identification rate that is on average 2.78 percent higher than the conventional method when many client speakers and little training data are used.
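
The two steps described in the abstract (accepting or rejecting by a confidence threshold, then rescoring the selected candidates on a shared set of frames) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The abstract does not give the significance-test confidence measure, the candidate selection rule, or the number of frames eliminated, so the sketch substitutes assumptions: a softmax over mean log-likelihoods stands in for the confidence measure, candidates are the top `n_candidates` speakers, and `keep_ratio` sets the fraction of frames retained. The names `softmax_confidence`, `identify`, `n_candidates`, and `keep_ratio` are hypothetical, and likelihood ratio normalisation is not modelled.

```python
import math


def softmax_confidence(mean_ll):
    """Placeholder confidence (an assumption, not the thesis's
    significance test): softmax weight of the best-scoring speaker."""
    top = max(mean_ll.values())
    return 1.0 / sum(math.exp(v - top) for v in mean_ll.values())


def identify(frame_ll, threshold=0.95, n_candidates=2, keep_ratio=0.5):
    """frame_ll maps speaker -> list of per-frame log-likelihoods.

    Returns (identified speaker, list of candidates considered).
    """
    speakers = list(frame_ll)
    n_frames = len(frame_ll[speakers[0]])
    mean_ll = {s: sum(frame_ll[s]) / n_frames for s in speakers}
    ranked = sorted(speakers, key=lambda s: mean_ll[s], reverse=True)

    # Accept the top speaker outright when the confidence is high enough.
    if softmax_confidence(mean_ll) >= threshold:
        return ranked[0], [ranked[0]]

    # Otherwise reject the result and keep several candidates
    # (top-N selection is an assumption).
    candidates = ranked[:n_candidates]

    # For each frame, rank all speakers (1 = best) and average the
    # ranks of the selected candidates.
    avg_rank = []
    for t in range(n_frames):
        order = sorted(speakers, key=lambda s: frame_ll[s][t], reverse=True)
        rank = {s: i + 1 for i, s in enumerate(order)}
        avg_rank.append(sum(rank[c] for c in candidates) / len(candidates))

    # Eliminate the frames where the candidates rank worst; every
    # candidate is then scored on the same retained frames.
    n_keep = max(1, int(n_frames * keep_ratio))
    kept = sorted(range(n_frames), key=lambda t: avg_rank[t])[:n_keep]

    rescored = {c: sum(frame_ll[c][t] for t in kept) / n_keep
                for c in candidates}
    return max(rescored, key=rescored.get), candidates
```

In this sketch, frames in which non-candidate speakers outscore the candidates are treated as unreliable and dropped, which can change which candidate wins compared with scoring over all frames.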

Other bibliographic information
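Call number MCS 98012
Physical description v, [58] p. : ill. ; 26 cm
Language English
General note Author's name in Korean : 김지환
Advisor's name in English : Yung-Hwan Oh
Advisor's name in Korean : 오영환
Thesis Thesis (Master's) - Korea Advanced Institute of Science and Technology : Department of Computer Science,
Bibliography note Includes references
Subject Speaker identification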
Normalisation
Candidate selection
화자식별
정규화
후보선정