KAIST Library

Main bibliographic information
Unconditional image-text pair generation with multimodal cross quantizer = 다중모달 벡터 퀀타이저를 사용한 조건없는 이미지-텍스트 쌍 생성
Title / Author	Unconditional image-text pair generation with multimodal cross quantizer = 다중모달 벡터 퀀타이저를 사용한 조건없는 이미지-텍스트 쌍 생성 / Hyungyung Lee.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2022].

Holdings information

Accession number	8039734
Location / Call number	Academic Cultural Complex (Library), 2nd floor, Theses / MAI 22032
Status	Available (not for loan)

Abstract

Although deep generative models have attracted considerable attention, most existing work is designed for unimodal generation. In this paper, we explore a new method for unconditional image-text pair generation. We propose MXQ-VAE, a vector quantization method for multimodal image-text representation. MXQ-VAE accepts a paired image and text as input and learns a joint quantized representation space, so that the image-text pair can be converted into a sequence of unified indices. Autoregressive generative models can then be used to model the joint image-text representation and to perform unconditional image-text pair generation. Extensive experimental results demonstrate that our approach effectively generates semantically consistent image-text pairs and also enhances meaningful alignment between image and text.
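
As a rough illustration of the "sequence of unified indices" idea in the abstract, the sketch below shows generic nearest-neighbor vector quantization against a single shared codebook. It is not the MXQ-VAE architecture from the thesis: the codebook values, the toy 2-D feature vectors, and the names `quantize` and `squared_distance` are all hypothetical, and a real model would learn both the encoder and the codebook.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


# Hypothetical toy data: one shared codebook for both modalities (assumption).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
image_features = [(0.9, 0.1), (0.1, 0.8)]  # stand-ins for encoded image patches
text_features = [(0.2, 0.1), (0.8, 0.9)]   # stand-ins for encoded text tokens

# Quantizing both modalities with the same codebook yields a single index
# sequence that an autoregressive model could be trained on.
indices = quantize(image_features + text_features, codebook)
print(indices)
```

The resulting list of integers plays the role of the unified index sequence; how the thesis actually fuses the two modalities before quantization is not specified in this record.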


Other bibliographic information
Call number	MAI 22032
Physical description	iii, 13 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean : 이현경 ; Advisor's name in English : Edward Choi ; Advisor's name in Korean : 최윤재 ; Published in : "ICLR 2022 workshop on Deep Generative Models for Highly Structured Data". Unconditional Image-Text Pair Generation With Multimodal Cross Quantizer,
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : Kim Jaechul Graduate School of AI,
Bibliography note	References : p. 12-13
Subject	Multimodal Representation Learning ; Vector Quantization ; Unconditional Multimodal Generation
