KAIST Library

Main bibliographic information
Translating Hanja historical documents to contemporary Korean and English = 인공신경망을 통한 조선왕조실록 신역 및 영문 번역
Title / Author	Translating Hanja historical documents to contemporary Korean and English = 인공신경망을 통한 조선왕조실록 신역 및 영문 번역 / Juhee Son.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2023].

Holdings

Registration number	8040838
Location / Call number	Academic Cultural Complex (Library), 2F, Theses / MCS 23023
Status	Available (not for loan)

Abstract

The Annals of the Joseon Dynasty (AJD) contain the daily records of the kings of Joseon, the 500-year kingdom preceding the modern nation of Korea. The Annals were originally written in an archaic Korean writing system, 'Hanja', and were translated into Korean from 1968 to 1993. The resulting translation, however, was too literal and contained many archaic Korean words; thus, a new expert translation effort began in 2012. In the decade since, the records of only one king have been completed. In parallel, expert translators are working on an English translation, also at a slow pace, and have so far produced only one king's records in English. Thus, we propose H2KE, a neural machine translation model that translates historical documents in Hanja into more easily understandable Korean and into English. Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja from both a full dataset of outdated Korean translations and a small dataset of more recently translated contemporary Korean and English. We compare our method against two baselines: a recent model that simultaneously learns to restore and translate Hanja historical documents, and a Transformer-based model trained only on the newly translated corpora. The experiments reveal that our method significantly outperforms the baselines in terms of BLEU scores for both contemporary Korean and English translations. We further conduct an extensive human evaluation, which shows that our translation is preferred over the original expert translations by both experts and non-expert Korean speakers.

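The Annals of the Joseon Dynasty (조선왕조실록) are a chronicle compiling the history of the Joseon dynasty over some 500 years. They were originally written in Hanja and were translated into Korean by the Institute for the Translation of Korean Classics (1968-1993). However, because that translation contained literal renderings of Sino-Korean words and archaic terms that are hard for modern readers to understand, a new translation effort was begun to correct errors and improve readability. Because expert translation takes a long time, only the Annals of King Jeongjo (정조실록) have been completed in the new translation since work began in 2012. In parallel, English translation has been under way to make the Annals internationally accessible; as with the new Korean translation, only the Annals of King Sejong (세종실록) have been completed. We propose H2KE, an AI translation model. H2KE learns to translate historical documents written in Hanja through a multilingual machine translation approach that uses the old Korean translation, the new Korean translation, and the English translation data of the Annals together. The baselines are a recently proposed model that performs Hanja restoration and Hanja document translation simultaneously, and a Transformer-based model trained only on the new-translation data of the Annals. Our experiments confirm that our model outperforms both baselines in terms of BLEU score. Expert and non-expert evaluations further confirm that our translations are preferred over the original old translation.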

Other bibliographic information
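Call number	MCS 23023
Physical description	iv, 20 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 손주희 ; advisor's name in English: Alice Oh ; advisor's name in Korean: 오혜연
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Computing,
Bibliography note	References : p. 17-18
Subject	Machine Learning ; Machine Translation ; Low resource Translation ; 머신러닝 ; 기계번역 ; 저자원 번역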
