Main bibliographic information
한국어/영어 병렬 코퍼스에 대한 단어단위 및 구단위 정렬 모델 = Aligning a parallel Korean-English corpus at word and phrase level
Title / Author: 한국어/영어 병렬 코퍼스에 대한 단어단위 및 구단위 정렬 모델 = Aligning a parallel Korean-English corpus at word and phrase level / 신중호.
Author: 신중호 ; Shin, Jung-Ho
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 1996].

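Holdings information

Registration number: 8006321
Location: Academic Cultural Complex (Cultural Hall), preservation stacks
Call number: MCS 96021
Status: Available; loanable
Due date: none listed

Registration number: 9002758
Location: Seoul thesis shelves
Call number: MCS 96021 c. 2
Status: Available; loanable
Due date: none listed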

Abstract

A parallel corpus is a set of texts in multiple languages with the same content. The study of parallel corpora yields linguistic resources such as bilingual dictionaries, bilingual grammars, and translation examples. Alignment is the establishment of correspondences between matching elements in a parallel corpus. Methods that use collocation probabilities and relative positions for English/French alignment do not apply directly to Korean/English, because the matching unit between Korean and English is more variable and word order is not a reliable clue. This thesis presents a method that addresses these two problems. For the unit-matching problem, we extend the alignment unit from words to phrases. To capture the position clue, we associate Korean function words with positions in the English text. We use the EM algorithm to estimate the parameters of our model, and we propose an alignment algorithm based on dynamic programming. Experiments were carried out on 253,000 English words and their Korean translations. The results show that the proposed model achieves 68.7% accuracy at the phrase level, and that the bilingual dictionary induced from the alignment is 89.2% accurate.
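
The abstract mentions EM parameter estimation and a bilingual dictionary induced from the alignment. As a rough illustration of that general idea only, the following is a minimal sketch of the classic word-level EM procedure (in the style of IBM Model 1), not the thesis's model: it does not extend the alignment unit to phrases, does not tie Korean function words to English positions, and does not include the dynamic-programming alignment step. The function names `train_ibm1` and `induce_dictionary` and the three toy sentence pairs are assumptions made up for this example.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=20):
    """Estimate word translation probabilities t(e | k) with EM.

    pairs: list of (korean_tokens, english_tokens) sentence pairs.
    Returns a dict mapping (english_word, korean_word) to a probability.
    """
    # Start with equal weight for every word pair that co-occurs in a sentence pair.
    t = {}
    for ko, en in pairs:
        for e in en:
            for k in ko:
                t[(e, k)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (e, k) links
        total = defaultdict(float)  # expected number of links from k
        # E-step: split each English word over the Korean words in its sentence,
        # in proportion to the current translation probabilities.
        for ko, en in pairs:
            for e in en:
                norm = sum(t[(e, k)] for k in ko)
                for k in ko:
                    frac = t[(e, k)] / norm
                    count[(e, k)] += frac
                    total[k] += frac
        # M-step: renormalize the expected counts for each Korean word.
        for key in t:
            t[key] = count[key] / total[key[1]]
    return t


def induce_dictionary(t):
    """Pick the most probable English word for each Korean word."""
    best = {}
    for (e, k), p in t.items():
        if k not in best or p > best[k][1]:
            best[k] = (e, p)
    return {k: e for k, (e, _) in best.items()}


# Hypothetical toy corpus, pre-tokenized by hand.
pairs = [
    (["집"], ["house"]),
    (["큰", "집"], ["big", "house"]),
    (["큰", "책"], ["big", "book"]),
]
t = train_ibm1(pairs)
dictionary = induce_dictionary(t)
print(dictionary)
```

On this toy data the one-word sentence pair anchors 집 to "house", which in turn pushes "big" toward 큰 and "book" toward 책 over the iterations. Real Korean/English text would need morphological analysis first, which is part of why a word-level baseline like this one is not expected to transfer directly.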

Other bibliographic information
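Call number: MCS 96021
Physical description: iv, 47 p. : ill. ; 26 cm
Language: Korean
General note: Appendices: A, Table of notation. - B, Derivation of the training equations. - C, Alignment results
Author name in English: Jung-Ho Shin
Advisor name in Korean: 최기선
Advisor name in English: Key-Sun Choi
Thesis: Thesis (Master's) - Korea Advanced Institute of Science and Technology : Department of Computer Science,
Bibliography note: References : p. 40-42
Subject: 정렬 (Alignment)
확률적 기계번역 (Probabilistic machine translation)
병렬 코퍼스 (Parallel corpus)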
Alignment
Corpus linguistics
Machine translation
Parallel corpus