KAIST Library

Main bibliographic information
Two-channel sound source localization method for speech enhancement system = 음질 개선 시스템을 위한 두 채널 음원의 방향 추적 방법
Title / Author	Two-channel sound source localization method for speech enhancement system = 음질 개선 시스템을 위한 두 채널 음원의 방향 추적 방법 / Hye-Jeong Jeon.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2008].

Holdings information
Registration No.	8019753
Location / Call No.	Academic Cultural Complex, preservation stacks / DCS 08010
Status	Available (not for loan)

Abstract

Humans can pick out a desired speech signal from a mixture of signal and noise. Many researchers in signal processing have tried to analyze and reproduce this hearing capability, but the results remain unsatisfactory. As networks and information technology have developed, extended speech communication and interface services such as video telephony and speech recognition have been actively studied, yet service quality remains a problem. One major reason is that real environments with background noise differ greatly from the noise-free laboratory in which these systems work properly, so their performance drops drastically in real use. Speech enhancement extracts the desired signal from a mixture of one or more sources and noise. Used as a pre-processor for a speech processing system, such as a speech communication or speech recognition system, it can improve that system's performance. Speech enhancement systems fall into two classes according to the number of input signals. Single-channel speech enhancement uses only one input signal and is widely commercialized. Multi-channel speech enhancement uses multiple input signals. Single-channel speech enhancement performs poorly when it fails to estimate the noise characteristics of the environment; multi-channel speech enhancement requires more hardware and computation to process the multiple inputs, and performs poorly when it fails to estimate the direction of the desired signal. Some of these problems can be resolved by detecting voice activity or by estimating the signal's direction accurately.
We propose a two-step method that increases the accuracy of direction-of-arrival (DOA) estimation from two input signals, in order to address these problems of current speech enhancement systems, and we show that applying it to sound source localization improves speech enhancement performance. The first step uses a reliability measure that arises from a waterbed effect in DOA estimation. If the calculated reliability is lower than a predefined threshold, the probability of speech presence is low, so the estimated DOA is regarded as unreliable and is discarded. By processing only the reliable results, we can decrease the failure rate of environmental noise estimation and thereby increase the signal-to-noise ratio. The second step determines a threshold well suited to the environment. We propose using the Maximum Likelihood and Neyman-Pearson decision rules for threshold selection. These likelihood ratio tests depend on a probabilistic model of the reliability. With this reliability measure and threshold selection, the multi-channel speech enhancement system performs better because the multi-channel noise reduction receives more reliable DOA estimates. Experiments showed successful speech quality enhancement for simulated data as well as for real-environment recordings containing a mixture of the signal and background noise. In the experiments, the reliability measure and the threshold selection rejected perturbed DOA estimates. The proposed two-step method is expected to contribute to the commercialization of speech communication and interface systems by suppressing environmental noise.
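
The first step described above, discarding DOA estimates whose reliability falls below a threshold, might be sketched as follows. This is only an illustration under stated assumptions: GCC-PHAT stands in for the DOA estimator, and the peak-to-mean ratio of the correlation is a hypothetical reliability score, not the waterbed-effect measure defined in the thesis. The function names, the speed of sound constant, and all parameters are likewise assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat_doa(x1, x2, fs, mic_distance):
    """Return (doa_degrees, reliability) for one frame of two-channel audio.

    GCC-PHAT is a stand-in estimator; the reliability score (correlation
    peak divided by mean absolute correlation) is hypothetical.
    """
    n = 2 * len(x1)  # zero-pad so the correlation does not wrap around
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)

    # Keep only physically possible lags: -max_lag .. +max_lag samples.
    max_lag = int(np.ceil(mic_distance / SPEED_OF_SOUND * fs))
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

    peak = int(np.argmax(cc))
    tau = (peak - max_lag) / fs  # negative when channel 2 lags channel 1
    sin_theta = np.clip(tau * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    doa = float(np.degrees(np.arcsin(sin_theta)))
    reliability = float(cc[peak] / (np.mean(np.abs(cc)) + 1e-12))
    return doa, reliability


def reliable_doas(frames, fs, mic_distance, threshold):
    """Keep the DOA of each (x1, x2) frame whose reliability reaches the threshold."""
    kept = []
    for x1, x2 in frames:
        doa, reliability = gcc_phat_doa(x1, x2, fs, mic_distance)
        if reliability >= threshold:
            kept.append(doa)
    return kept
```

A frame that contains one coherent source produces a sharp correlation peak and a high score, while a frame of uncorrelated noise on the two channels produces a flat correlation and a low score, so only the former survives the gate.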

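People can distinguish and recognize only the desired sound source even in an environment where sources and noise are mixed. Signal processing research has pursued various ways of realizing this human hearing ability, but satisfactory results have not been obtained in many application areas. The same situation appears in the various service environments that use speech. As networks and information technology advance, research is active on services that make use of speech signals beyond simple voice communication, such as video telephony and speech recognition systems, but sound quality remains a problem. This is because a system that works well in a noise-free laboratory degrades sharply in a real environment where noise is present. Speech enhancement extracts only the desired speech signal from noisy signals received on one or more input channels and, applied to speech signal processing systems such as voice communication and speech recognition, improves their performance. Speech enhancement is generally divided into single-channel and multi-channel systems. Single-channel speech enhancement improves quality using one input signal and is the most widely commercialized. Multi-channel speech enhancement improves quality using several input signals. However, single-channel systems suffer performance degradation when they fail to estimate background noise information accurately, while multi-channel systems face material cost and computational load from using several inputs, as well as performance degradation when they fail to estimate the signal's angle of incidence. These problems can be resolved by detecting the speech signal or by estimating the position of the speech.

As a way to remedy these problems of speech enhancement in noisy environments, this thesis proposes improving speech enhancement performance by raising the accuracy of source direction estimation from two-channel input signals. The method has two steps. The first step, when tracking the source position from the two-channel input, selects and uses the bands in which a source signal is statistically likely to be present. It uses a reliability value based on the waterbed effect when estimating direction: if the calculated reliability value is below a predefined threshold, speech is unlikely to be present, so the estimated direction is judged unreliable and excluded from the results. Processing the bands where speech is likely present reduces the degradation caused by estimation failures and raises the signal-to-noise ratio. The second step uses the maximum likelihood and Neyman-Pearson decision rules to determine a threshold well suited to the environment. Likelihood tests of this kind rely on information about a probabilistic model of the reliability. When tracking the source position, statistically unreliable position estimates are removed, which raises the accuracy of the source position estimate. Speech enhancement performance improves by raising the amplitude of the signal arriving from the estimated direction only in regions where a source signal is present, and lowering the amplitude elsewhere, which is treated as background noise. To demonstrate the validity of the proposed methods, speech enhancement was run on mixtures of source and background noise, and satisfactory results were obtained in all cases. The experiments show that the proposed method, consisting of the two steps of reliability value and threshold determination, can remove erroneous direction estimates. A speech enhancement system using the proposed two-step method is expected to contribute to the commercialization of speech processing technology, applied to background noise removal in video telephony systems and as a pre-processing stage for speech recognition systems.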
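
The threshold selection step named in the abstract, using Neyman-Pearson and maximum likelihood decision rules, could be sketched as below. The abstract does not state which probabilistic model the thesis fits to the reliability, so this sketch makes two labeled assumptions: the Neyman-Pearson threshold is taken empirically from reliability scores recorded on noise-only frames, and the maximum likelihood threshold assumes Gaussian reliability distributions for the noise-only and speech-present cases. All function and parameter names are hypothetical.

```python
import numpy as np


def neyman_pearson_threshold(noise_scores, false_alarm_rate):
    """Threshold taken from noise-only scores so that at most the target
    fraction of them exceeds it (a frame is accepted when score > threshold).
    """
    if not 0.0 < false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie strictly between 0 and 1")
    scores = np.sort(np.asarray(noise_scores, dtype=float))
    # Number of noise-only scores allowed to exceed the threshold.
    allowed = int(np.floor(false_alarm_rate * scores.size))
    return float(scores[scores.size - allowed - 1])


def gaussian_log_pdf(x, mean, std):
    """Log density of a Gaussian (assumed model for the reliability)."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))


def ml_threshold(noise_mean, noise_std, speech_mean, speech_std):
    """Point between the two means where both Gaussian likelihoods are equal."""
    # Equal log-likelihoods reduce to a*x^2 + b*x + c = 0.
    a = 0.5 / noise_std**2 - 0.5 / speech_std**2
    b = speech_mean / speech_std**2 - noise_mean / noise_std**2
    c = (0.5 * noise_mean**2 / noise_std**2
         - 0.5 * speech_mean**2 / speech_std**2
         + np.log(noise_std / speech_std))
    if np.isclose(a, 0.0):
        if np.isclose(b, 0.0):
            raise ValueError("the two models are identical")
        return float(-c / b)  # equal variances: the midpoint of the means
    lo, hi = sorted((noise_mean, speech_mean))
    for root in np.roots([a, b, c]):
        if np.isreal(root) and lo <= root.real <= hi:
            return float(root.real)
    raise ValueError("no equal-likelihood point between the two means")
```

The Neyman-Pearson variant fixes the false-alarm rate and needs only noise-only data, whereas the maximum likelihood variant needs a model of both cases but no target rate; which one suits a given environment depends on what can be measured there.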

Other bibliographic information
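Call No.	DCS 08010
Physical description	ix, 76 p. : ill. ; 26 cm
Language	English
General note	Author's name in Korean : 전혜정. Advisor's name in English : Hyun-Soo Yoon. Advisor's name in Korean : 윤현수. Published in : "Reliability measure for sound source localization". IEICE Electronics Express, v.5 no.6, pp. 192-197 (2008)
Thesis	Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Computer Science,
Bibliography note	References : p. 72-76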
