Bibliographic Information
통계적 화행처리를 이용한 한-영 대화체 기계번역에서의 효율적인 대화분석 = An efficient dialogue analysis model with statistical speech act processing for Korean-English dialogue machine translation
Title / Author: 통계적 화행처리를 이용한 한-영 대화체 기계번역에서의 효율적인 대화분석 = An efficient dialogue analysis model with statistical speech act processing for Korean-English dialogue machine translation / Jae-Won Lee (이재원).
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 1999].

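Holdings

Registration no.: 8009913
Location / Call number: Academic Cultural Complex (학술문화관), preservation stacks / DCS 99010
Status: Available (not for loan)

Registration no.: 9006225
Location / Call number: Seoul, thesis shelves / DCS 99010 c. 2
Status: Available (not for loan)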

Abstract

Translating an utterance in a dialogue properly sometimes requires several kinds of contextual information. Interpreting such utterances often requires dialogue analysis, including speech act and discourse analysis. Machine learning approaches have recently received considerable attention because they offer automatic learning, wide coverage, and robustness. This thesis proposes a statistical dialogue analysis model, based on speech acts, for Korean-English dialogue machine translation.

First, we study how to annotate dialogue utterances with speech acts and discourse structure. The main purpose of the annotation is to build a dialogue corpus for training statistical dialogue models. At the speech act level, we use a set of 15 domain-independent speech acts. At the discourse structure level, we annotate a shallow discourse structure based on speech acts. The fact that some utterances are responses to previous utterances is reflected by providing a way to indicate a backward link to the previous utterance, which captures some aspects of the discourse structure of a dialogue.

Second, we propose a statistical dialogue model based on speech acts. The model uses syntactic patterns and n-grams of speech acts. The syntactic patterns consist of surface syntactic features related to the language-dependent expressions of speech acts. Speech act n-grams are used to approximate the context of utterances. The key feature of our approach is the use of speech act n-grams based on hierarchical recency. Our experimental results with trigrams show that the proposed model achieves an accuracy of 66.87% for the top candidate and 82.35% for the top three candidates. These results indicate that the proposed model based on hierarchical recency outperforms the model based on linear recency. In this thesis, we have focused on defining an approach that uses a minimal knowledge base and can handle ambiguous utterances.
Conventional approaches use a large amount of domain knowledge combined with discourse knowledge. Although such approaches can analyze dialogues relatively accurately in the process of understanding them, they are difficult to scale up. We therefore propose a statistical dialogue analysis model that scales up more easily, at the expense of complete understanding of dialogues. This trade-off is reasonable because machine translation usually does not require complete understanding of an utterance. We believe that this kind of statistical approach can be easily integrated with other approaches for efficient and robust dialogue analysis.
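
The contrast between linear and hierarchical recency can be illustrated with a small Python sketch. It is not the thesis's implementation: the abstract does not say how the hierarchical context is computed, so the code assumes it is obtained by following the annotated backward links, and it leaves out the syntactic patterns entirely. The speech act labels, the `<s>` padding symbol, the add-one smoothing, and all function and class names are hypothetical.

```python
from collections import Counter, defaultdict

PAD = "<s>"  # padding symbol for missing context (assumption)


def linear_context(acts, i):
    """Linear recency: the two speech acts immediately preceding utterance i."""
    prev = acts[max(0, i - 2):i]
    return tuple([PAD] * (2 - len(prev)) + prev)


def hierarchical_context(acts, links, i):
    """Hierarchical recency (assumed reading): follow backward links from
    utterance i. links[j] is the index of the utterance that j responds to,
    or None if j opens a new exchange."""
    context = []
    j = links[i]
    while j is not None and len(context) < 2:
        context.append(acts[j])
        j = links[j]
    context.reverse()
    return tuple([PAD] * (2 - len(context)) + context)


class SpeechActTrigram:
    """Trigram model over speech act labels with add-one smoothing."""

    def __init__(self, labels):
        self.labels = sorted(labels)
        self.counts = defaultdict(Counter)  # context pair -> next-act counts

    def train(self, examples):
        for context, act in examples:
            self.counts[tuple(context)][act] += 1

    def prob(self, context, act):
        seen = self.counts.get(tuple(context), Counter())
        return (seen[act] + 1) / (sum(seen.values()) + len(self.labels))

    def top_k(self, context, k=3):
        """Return the k most probable speech acts for the given context."""
        return sorted(self.labels, key=lambda a: -self.prob(context, a))[:k]


# Toy dialogue with an embedded clarification (labels are hypothetical):
# 0 request, 1 ask-ref (about 0), 2 response (to 1), 3 accept (of 0).
acts = ["request", "ask-ref", "response", "accept"]
links = [None, 0, 1, 0]

model = SpeechActTrigram(set(acts))
model.train(
    (hierarchical_context(acts, links, i), acts[i]) for i in range(len(acts))
)
print(linear_context(acts, 3))
print(hierarchical_context(acts, links, 3))
print(model.top_k(hierarchical_context(acts, links, 3), k=3))
```

For the last utterance, the linear context is the clarification sub-dialogue, while the link-based context skips it and points back at the original request. Evaluating on both the top candidate and the top three candidates, as the abstract reports, corresponds to calling `top_k` with `k=1` and `k=3`.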

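Additional Bibliographic Information

Call number: DCS 99010
Physical description: [vii], 84 p. : illustrations ; 26 cm
Language: Korean
General notes: Author's name in English : Jae-Won Lee
Advisor's name in Korean : 김길창
Advisor's name in English : Gil-Chang Kim
Thesis: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Computer Science,
Bibliography note: References : p. 66-72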