KAIST Library

Main bibliographic information
3차원 음향 인텐시티를 사용한 심층신경망 기반 음성 잔향 제거 기법 = Speech dereverberation using 3-dimensional acoustic intensity based on deep neural networks
Title / Author	3차원 음향 인텐시티를 사용한 심층신경망 기반 음성 잔향 제거 기법 = Speech dereverberation using 3-dimensional acoustic intensity based on deep neural networks / 유정민 (Jeongmin Liu).
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2020].

Holdings
Registration number	8036075
Location / Call number	Academic Cultural Complex (학술문화관(문화관)), preservation stacks / MEE 20061
Item status	Available (not for loan)

Abstract

In an object-based audio system, information about sound sources is separated from information about the reverberant room so that a renderer can reproduce realistic and interactive sound. To implement such a system, the reverberation introduced by the room must be removed from the signal recorded by a microphone array; this removal is called dereverberation. Recent studies have shown that deep neural networks (DNNs) can learn the mapping from multi-channel reverberant signals to anechoic speech signals. However, these learning-based methods can generalize poorly: their performance depends strongly on the rooms used for the training data. In this work, we propose directional features related to 3-dimensional acoustic intensity as the input features of the DNN. Because acoustic intensity provides useful information about the degree of reverberation and the directions of reflected waves, it enables the construction of a more general dereverberation DNN model. When the DNN model is trained on the directional features, the quality, intelligibility, and signal-to-noise ratio of the output speech improve compared with the same model trained directly on the multi-channel magnitude-phase spectrum. In particular, the performance improvement on test signals recorded in a different room demonstrates that the proposed directional features yield a more general model.

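An object-based audio system separates the information about individual sound sources from the information about the space in signals recorded by a microphone array, stores the two separately, and reproduces a sound field similar to the original scene at rendering time. Implementing this requires removing reverberation from the recorded signals. Recently, methods in which a deep neural network takes multi-channel signals as input and removes reverberation have shown good performance. However, such learning-based methods suffer from degraded dereverberation performance in rooms not used for training. In this work, to mitigate this problem and train a deep neural network that generalizes well, we propose using directional features as the input. The directional features are values related to 3-dimensional acoustic intensity and are useful for judging the amount of reverberation and the directions of the direct and reflected waves. A deep neural network trained with directional features as input outputs speech with better quality, intelligibility, and signal-to-noise ratio than one trained on multi-channel signals. In particular, its performance drops less in rooms not used for training, so it can generalize to diverse environments.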
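The abstract does not say how the directional features are computed. As a rough illustration only, the sketch below shows one common way to estimate a 3-dimensional active intensity vector per time-frequency bin from first-order (B-format style) STFT coefficients. The function name `intensity_features`, the channel names `W`, `X`, `Y`, `Z`, the unit scaling between pressure and velocity channels, and the sign convention are all assumptions made for this example, not the thesis's definitions.

```python
import numpy as np

def intensity_features(W, X, Y, Z, eps=1e-12):
    """Hypothetical directional features from first-order STFT channels.

    W is assumed to be the omnidirectional (pressure-like) channel and
    X, Y, Z the three dipole (velocity-like) channels, each a complex
    array of shape (freq, time). Scaling and sign conventions differ
    between microphone arrays and formats; none is implied by the source.
    """
    velocity = np.stack([X, Y, Z], axis=-1)                # (freq, time, 3)
    # Active intensity: real part of conj(pressure) * velocity.
    active = np.real(np.conj(W)[..., None] * velocity)     # (freq, time, 3)
    norm = np.linalg.norm(active, axis=-1)                 # (freq, time)
    # Unit vector giving the dominant direction in each bin.
    direction = active / (norm[..., None] + eps)
    # Energy-like term; under the assumed scaling the ratio lies in [0, 1]
    # and is close to 1 for a single plane wave.
    energy = 0.5 * (np.abs(W) ** 2 + np.sum(np.abs(velocity) ** 2, axis=-1))
    ratio = norm / (energy + eps)
    return direction, ratio
```

Under these assumptions, a single plane wave gives a unit `direction` aligned with its encoded direction and a `ratio` near 1, while strongly reverberant bins tend to give smaller ratios. Features of this kind could stand in for raw multi-channel magnitude and phase as network input.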

Other bibliographic information
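Call number	MEE 20061
Physical description	v, 62 p. : illustrations ; 30 cm
Language	Korean
General note	Author's name in English : Jeongmin Liu ; Advisor's name in Korean : 최정우 ; Advisor's name in English : Jung-Woo Choi
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 54-60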
