KAIST (Korea Advanced Institute of Science and Technology) Library

Main bibliographic information
Robust semi-supervised learning to label bias = 레이블 편향에 강인한 반지도학습 방법
Title / Author	Robust semi-supervised learning to label bias = 레이블 편향에 강인한 반지도학습 방법 / Youngtaek Oh.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2021].

Holdings information

Registration number	8037187
Location / Call number	Academic Cultural Complex (Cultural Complex) Preservation Stacks / MEE 21051
Status	Available (not for loan)

Abstract

It is well known that label bias hinders the practical application of deep learning approaches in the real world. Because deep learning models fit the biases present in a dataset, they do not generalize well to real-world settings that require fair predictions. In the real world, especially in applications where safety and reliability are required, it is important to ensure that a model produces fair predictions even if it is trained on biased data. In this dissertation, we present a semi-supervised learning method that allows the model to be fair even when the training data is biased. When typical semi-supervised learning methods are trained on biased data, performance degrades severely, in some cases falling far below that of the supervised learning counterpart. In this work, we design a framework called Prototypical Semantic Alignment (PSA) to effectively prevent this problem. To this end, we propose 1) Online Clustering, an algorithm that clusters labeled data in an online manner, and 2) Semantic Alignment Loss, which encourages each unlabeled image to be attracted to the most similar prototype obtained through clustering. As a result, we show significant performance gains from additional unlabeled images on long-tailed data and in label-scarce scenarios, which are real examples of label bias.

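It is well known that dataset bias is a major obstacle to applying deep learning methods in the real world. Because deep learning models fit the biases present in a dataset, they do not generalize well to unbiased information outside those biases. In the real world, especially in applications that demand stability and reliability, it is important that a model produce fair inferences even when it has been trained on a biased dataset. This thesis presents a semi-supervised learning method that lets a model produce balanced inferences even under a biased training dataset. Training on a biased dataset with typical semi-supervised learning methods yields performance similar to, or far below, that of supervised learning; this thesis designs a framework called Prototypical Semantic Alignment (PSA) to effectively prevent this problem. To this end, it proposes 1) an algorithm that clusters the labeled dataset within the semi-supervised learning pipeline in real time (Online Clustering) and 2) a loss function that pulls each unlabeled image toward the most similar representative (prototype) obtained through clustering (Semantic Alignment Loss). As a result, it shows that additionally using unlabeled images yields meaningful performance improvements on datasets with a long-tailed category distribution and on datasets with only a very small number of labels per category, both of which are real examples of biased datasets.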
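
The two components named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the record gives no algorithmic details, so the sketch assumes one prototype per class, an exponential-moving-average update from labeled mini-batches, and a loss of one minus cosine similarity to the nearest prototype. The function names `update_prototypes` and `semantic_alignment_loss` and the `momentum` parameter are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def update_prototypes(prototypes, feats, labels, momentum=0.9):
    """Assumed online clustering step: move each class prototype toward
    the mean feature of that class in the current labeled mini-batch."""
    new = prototypes.copy()
    for c in np.unique(labels):
        batch_mean = feats[labels == c].mean(axis=0)
        new[c] = momentum * prototypes[c] + (1.0 - momentum) * batch_mean
    return new

def semantic_alignment_loss(prototypes, unlabeled_feats):
    """Assumed alignment loss: mean of (1 - cosine similarity) between
    each unlabeled feature and its most similar prototype."""
    sims = l2_normalize(unlabeled_feats) @ l2_normalize(prototypes).T  # (N, C)
    nearest = sims.argmax(axis=1)  # index of the most similar prototype
    loss = float(np.mean(1.0 - sims[np.arange(len(sims)), nearest]))
    return loss, nearest

# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
labeled = rng.normal(size=(8, 4))
labels = np.array([0, 0, 1, 1, 2, 2, 0, 1])
prototypes = update_prototypes(np.zeros((3, 4)), labeled, labels, momentum=0.0)
loss, nearest = semantic_alignment_loss(prototypes, rng.normal(size=(5, 4)))
```

In a real training loop the features would come from the network being trained and the loss would be computed in an automatic-differentiation framework so that gradients reach the encoder; NumPy is used here only to keep the sketch self-contained.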

Other bibliographic information
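Call number	MEE 21051
Physical description	v, 32 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 오영택; Advisor's name in English: In So Kweon; Advisor's name in Korean: 권인소
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 27-30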
