KAIST Library

Main bibliographic information
Adaptive maximum entropy regularization for connectionist temporal classification = 연결주의적 시간 분류의 개선을 위한 적응형 최대 엔트로피 정규화
Title / Author	Adaptive maximum entropy regularization for connectionist temporal classification = 연결주의적 시간 분류의 개선을 위한 적응형 최대 엔트로피 정규화 / SooHwan Eom.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2024].

Holdings information

Registration number

8042197

Location / Call number

Academic Cultural Complex (Library), 2nd floor, Theses

MEE 24085


Item status

Available (not for loan)


Abstract

This dissertation focuses on Connectionist Temporal Classification (CTC), a fundamental sequence-to-sequence learning method that uses dynamic programming to map input sequences to output sequences. While CTC has played a pivotal role in sequence learning tasks such as automatic speech recognition (ASR) and optical character recognition (OCR), it is hindered by a persistent challenge: its tendency to generate overly narrow output predictions. To mitigate this challenge, EnCTC added an entropy-maximization regularization term to the CTC loss. While EnCTC proved effective in optical character recognition, it uses a constant weighting factor for the regularization term throughout training, which can enforce unnecessary ambiguity even on correct predictions in the later stages of training and degrade overall performance. To address this issue, we present Adaptive Maximum Entropy Regularization (AdaMER), a novel approach that dynamically adjusts the impact of entropy regularization throughout the training process. This adjustment is achieved through a learnable parameter, updated by gradient descent, that serves as the regularization weighting factor. Our experiments, conducted on the LibriSpeech corpus and various real-world OCR benchmark datasets, provide empirical evidence that AdaMER addresses the challenges associated with CTC-based sequence learning and improves model performance.
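The abstract does not give the loss formulas, so the following is only a rough sketch of the general idea, not the thesis's implementation. It computes the CTC probability with the standard forward (dynamic programming) recursion and subtracts a weighted entropy term from the negative log-likelihood. Two simplifications are assumptions made here: the entropy is the mean per-frame entropy of the output distribution, used as a stand-in for the entropy term of EnCTC, and `weight` is a plain float, whereas AdaMER learns the weight by gradient. All function names are hypothetical.

```python
import math


def ctc_probability(probs, target, blank=0):
    """Total probability of all frame-level alignments that collapse to `target`.

    probs[t][k] is the model's probability of symbol k at frame t.
    """
    # Extended label sequence: a blank before, between, and after the labels.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: probability of all partial alignments that end at ext[s].
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * size
        for s in range(size):
            total = prev[s]
            if s >= 1:
                total += prev[s - 1]
            # A blank may be skipped only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # A valid alignment ends on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def mean_frame_entropy(probs):
    """Average Shannon entropy (in nats) of the per-frame distributions."""
    per_frame = [-sum(p * math.log(p) for p in frame if p > 0.0) for frame in probs]
    return sum(per_frame) / len(per_frame)


def regularized_loss(probs, target, weight):
    """CTC negative log-likelihood minus `weight` times the entropy term.

    Assumption: `weight` is fixed here; AdaMER treats it as a learnable parameter.
    """
    return -math.log(ctc_probability(probs, target)) - weight * mean_frame_entropy(probs)


# Two frames, vocabulary {0: blank, 1: "a"}, uniform predictions.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_probability(uniform, [1]))
print(regularized_loss(uniform, [1], weight=0.1))
```

In this toy case three of the four possible alignments ("aa", "a" followed by blank, and blank followed by "a") collapse to the target "a". With a positive weight, higher-entropy predictions lower the loss, which is the effect a constant weight keeps applying even late in training.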

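This thesis addresses connectionist temporal classification (CTC). CTC is a learning algorithm that uses dynamic programming to learn the mapping between input and output sequences, and it has played a pivotal role in sequence learning tasks such as speech recognition and character recognition. However, its tendency to generate overly narrow predictions remains a persistent challenge. Previous work attempted to address this limitation through regularization based on entropy maximization, but because it applies a constant weight to the regularization term throughout training, the regularization can instead hinder learning in the later stages of training. To mitigate this problem, this work proposes an adaptive maximum entropy regularization technique that uses a learnable parameter as the regularization weight to dynamically adjust the influence of entropy regularization over the course of training. In experiments on speech recognition and character recognition datasets, the technique outperforms prior work.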

Other bibliographic information

Call number	MEE 24085
Physical description	iv, 44 p. : illustrations ; 30 cm
Language	English
General note	Author name in Korean : 엄수환; Advisor name in English : Chang D. Yoo; Advisor name in Korean : 유창동; Including appendix
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 36-42
Subject	Artificial Intelligence; Deep Learning; Automatic Speech Recognition; Optical Character Recognition; Connectionist Temporal Classification
