Bibliographic Information
Korean to Korean sign language translation via graph generation = 그래프 생성 기반 한국어에서 한국수어로의 기계번역
Title / Author: Korean to Korean sign language translation via graph generation = 그래프 생성 기반 한국어에서 한국수어로의 기계번역 / Jung-Ho Kim.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology, 2023].

Holdings Information

Registration number: 8040335
Location / Call number: Academic Cultural Complex (Munhwagwan), B1 Preservation Stacks / DCS 23004
Status: Available (not for loan)


Abstract

Sign language is a spatial and multi-channel language, but existing sign language translation (SLT) models have taken into account only the sequential information of sign language words. As a result, the translated sign language sequence loses its spatial and non-manual information and cannot fully convey the meaning of the sequence. The thesis claimed herein is that a translation model must understand spatial and non-manual information, centered around manual information, to generate a complete sign language expression from a spoken sentence. To understand and generate this information, we represent a Korean Sign Language (KSL) expression in graph form and formulate SLT as a sequence-to-graph (seq2graph) learning problem. Through experiments, we analyze the strengths and weaknesses of sequence-to-sequence (seq2seq) SLT methods and compare the performance of the seq2graph SLT method with that of the seq2seq SLT methods. To compare performance under the same criteria, we propose a new metric, Sign Language Evaluation Understudy (SLEU), which measures not only the accuracy of sequential information but also the accuracy of spatial and non-manual information. In our experiments, the seq2graph SLT model performs 31.9% better than the best-performing seq2seq SLT model. We anticipate that the results of this study will be used in areas with a high demand for sign language interpretation among the Deaf, such as daily conversation, broadcasting, and the Internet.

Although sign language is a spatial and multi-channel language, existing translation models have considered only the sequential information of sign language words. As a result, translated sign language sentences lose their spatial and non-manual information and fail to convey the full meaning of the sentence. This thesis argues that, to generate sign language expressions with complete meaning, a translation model must understand and generate spatial and non-manual signals synchronized with manual signals. To this end, we represent a Korean Sign Language (KSL) expression in graph form and propose a sequence-to-graph SLT approach that generates a KSL graph from a Korean sentence. Through experiments, we analyze the strengths and weaknesses of sequence-to-sequence SLT approaches and compare their performance with that of the sequence-to-graph approach. To analyze performance with a single quantitative metric, we propose Sign Language Evaluation Understudy (SLEU), a new SLT evaluation metric that simultaneously evaluates agreement in sequential information as well as in spatial and non-manual information. Experimental results show that the sequence-to-graph translation model performs 31.9% better than the best-performing sequence-to-sequence translation model. We expect the results of this study to be used in areas with a high demand for sign language interpretation among the Deaf, such as daily life, broadcasting, and the Internet.
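To make the graph formulation above concrete, the following minimal Python sketch shows one way a KSL expression could be represented as a graph whose nodes carry manual glosses and non-manual or spatial information, together with a BFS-based linearization of the kind illustrated in the figures listed below. The class and relation names (KSLGraph, FOLLOWS, SYNC_WITH) and the toy glosses are illustrative assumptions for this sketch, not the structures or data used in the thesis.

# A minimal sketch, assuming hypothetical names: one way to represent a KSL
# expression as a graph and linearize it with BFS. Not the thesis implementation.
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: int
    label: str   # a manual gloss, a non-manual signal, or a spatial locus
    kind: str    # "gloss" | "nms" | "locus"

@dataclass
class KSLGraph:
    nodes: dict = field(default_factory=dict)   # node_id -> Node
    edges: dict = field(default_factory=dict)   # node_id -> list of (node_id, relation)

    def add_node(self, node_id, label, kind):
        self.nodes[node_id] = Node(node_id, label, kind)
        self.edges.setdefault(node_id, [])

    def add_edge(self, src, dst, relation):
        self.edges[src].append((dst, relation))

    def bfs_linearize(self, root=0):
        # Flatten the graph into a token sequence by breadth-first search,
        # e.g. to feed the graph to a sequence decoder or to score it.
        order, seen, queue = [], {root}, deque([root])
        while queue:
            nid = queue.popleft()
            order.append(self.nodes[nid].label)
            for dst, _ in self.edges[nid]:
                if dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return order

# Toy usage: two manual glosses with a synchronized non-manual signal.
g = KSLGraph()
g.add_node(0, "비", "gloss")              # "rain"
g.add_node(1, "많다", "gloss")            # "a lot"
g.add_node(2, "eyebrows-raised", "nms")   # non-manual signal
g.add_edge(0, 1, "FOLLOWS")               # temporal order between manual signs
g.add_edge(1, 2, "SYNC_WITH")             # NMS synchronized with a manual sign
print(g.bfs_linearize())                  # ['비', '많다', 'eyebrows-raised']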

Additional Bibliographic Information
Call number: DCS 23004
Physical description: iv, 62 p. : illustrations ; 30 cm
Language: English
General notes: Author's name in Korean: 김정호
Advisor's name in English: Jong Cheol Park
Advisor's name in Korean: 박종철
Includes appendix
Degree: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Computing, 2023
Bibliographic note: References : p. 54-59
Subjects: Korean
Korean sign language
Sign language translation
Sequence-to-graph learning
Machine translation
Figure and Table Captions

An example of sign language expression as a multi-channel language, excerpted from the dataset [22]. Annotators translated the Korean sentence "기록적 폭우로 산사태 위험지역 접근금지 및 산림 인근 주민은 지정된 장소로 즉시 대피하여 주시기 바랍니다. (Due to the record heavy rain, access to the landslide risk area is prohibited, and residents near the forest are requested to evacuate immediately to the designated area.)" into KSL and then transcribed it.

An example of sign language expression using a graph structure

Grammatical roles for the eight types of transcribed NMSs

A summary of the large-scale sign language datasets.

Distribution rates by the number of word occurrences of the German-DGS parallel corpus

Statistics of the scripts of three drama series

Distribution rates per word frequency on the scripts of three drama series and combinations of drama series

Examples of Korean and KSL sentences

A transcription parsing process for a KSL sequence

Examples of Korean and KSL sentences. *Corpus construction for the drama "Are You Human?" is in progress.

A distribution of transcription errors by error type

An overview of our translation system. First, the named entity transformation module transforms input sentences (see (A)). Next, a PLM-based encoder (e.g., BERT) encodes the transformed sentences ...

An example of fingerspelling word transformation

Representative expressions per number type with examples

Using a PLM as an encoder

Statistics of the training, validation, and test subsets of the ETRI Ko-KSL dataset. OOV is the abbreviation of "out-of-vocabulary". *The number of words is calculated by splitting sentences at the morpheme level.

Statistics of the training, validation, and test subsets of the NIASL2021 dataset. OOV is the abbreviation of "out-of-vocabulary". *The number of words is calculated by tokenizing sentences using the same PLM tokenizer as our SLT models.

Translation scores for all models on the validation/test subsets of the ETRI Ko-KSL dataset. "NE" is the abbreviation of Named Entity. Figures in bold indicate the top score for the respective metric.

Translation scores for all models on the validation/test subsets of the NIASL2021 dataset. "NE" is the abbreviation of Named Entity. Figures in bold indicate the top score for the respective metric.

Translation scores by increasing the size of parallel data

Comparison of translation scores of PLMs. L denotes the number of layers of a PLM and H denotes the number of multi-head attention heads of a PLM.

Translation scores depending on hyperparameter settings.

Qualitative results for all models

Pearson correlation coefficients between sentence-level BLEU-4 score and human evaluation scores (Accuracy & Comprehension)

An example of sign language production using a commercial avatar player [26]

Naturalness and accuracy scores per category

Pearson correlation coefficients between sentence-level BLEU-4 score and human evaluation scores (Naturalness & Accuracy).

Average identification rates by participants for named entities

Four representative types of errors

An example of BFS-based graph linearization

Special tokens added to the vocabulary of our translation model

An example of extracting n-grams from a graph

An example of converting a sequence to a directed graph (edges are allowed only from left nodes to right nodes)

Statistics of the training, validation, and test subsets of the NIASL2021 dataset. OOV is the abbreviation of "out-of-vocabulary". *The number of words is calculated by tokenizing sentences using the same PLM tokenizer as our SLT models.

Translation scores for all models on the validation/test subsets of the ETRI Ko-KSL dataset. "NE" is the abbreviation of Named Entity. Figures in bold indicate the top score for the respective metric.

Case 1. The seq2graph model successfully generates a gloss sequence with non-manual signals.

Case 2. The seq2graph model successfully generates a gloss sequence with spatial and non-manual signals

Error case 1: node misprediction

Error case 2: syntactic difference