Bibliographic Information
Image restoration of mixed degradation using transformer-based two-stage U-Nets = 트랜스포머 기반 2단계 U-Net을 이용한 혼합 열화 영상 복원 연구
Title / Author: Image restoration of mixed degradation using transformer-based two-stage U-Nets = 트랜스포머 기반 2단계 U-Net을 이용한 혼합 열화 영상 복원 연구 / Taehwan Kim.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology, 2024].

Holdings Information

Registration number: 8042226
Location / Call number: Academic Cultural Complex (Library), 2F, Theses (MEE 24114)
Status: Available (not for loan)

Abstract

To apply image restoration technology, which converts low-quality images into high-quality ones, in industry, it is necessary to restore images with complex, mixed degradation. However, existing deep learning-based methods have focused on restoring images with a single type of degradation. In particular, Transformer-based methods for single-degradation image restoration are effective but require a new approach for images with mixed degradation. In this thesis, we propose a model using Transformer-based two-stage U-Nets for image restoration of mixed degradation that is suitable for the real world. The proposed two-stage U-Net Transformer (TUT) effectively restores images with mixed degradation by dividing a complex problem into gray-scale image restoration and color-scale image restoration, thereby outperforming existing Transformer-based image restoration models. In particular, we designed a spatial-wise Transformer-based U-Net for gray-scale image restoration and a channel-wise Transformer-based U-Net for color-scale image restoration. This approach effectively performs image restoration of mixed degradation. Moreover, to address the shortcomings of existing Transformer-based models and maximize performance, we introduced spatial-wise and channel-wise modulators. Additionally, various loss functions were used to optimize image restoration of mixed degradation. Lastly, we proposed synthetic data pre-processing techniques capable of representing both spatial and color degradation, which enable more detailed representations of real-world degradation than existing synthetic degradation datasets. Experimental results showed that the proposed TUT outperformed existing Transformer-based image restoration models on various evaluation datasets.
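The two-stage split described in the abstract (convert RGB to YCbCr, restore the gray-scale channel Y and the color-scale channels Cb/Cr separately, then convert back) can be sketched as follows. This is a minimal NumPy illustration of the data flow only: the `gray_stage` and `color_stage` callables stand in for the actual Transformer-based U-Nets (GUN and CUN), and the full-range BT.601 conversion constants are an assumption, since the record does not specify the exact transform used.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """rgb: H x W x 3 float array in [0, 1]. Returns (y, cbcr).
    Full-range ITU-R BT.601 constants (an assumption here)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, np.stack([cb, cr], axis=-1)

def ycbcr_to_rgb(y, cbcr):
    """Inverse of rgb_to_ycbcr, clipped back to [0, 1]."""
    cb, cr = cbcr[..., 0] - 0.5, cbcr[..., 1] - 0.5
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)

def restore(img, gray_stage, color_stage):
    """Two-stage restoration: gray_stage acts on Y, color_stage on (Cb, Cr)."""
    y, cbcr = rgb_to_ycbcr(img)
    y_hat = gray_stage(y)          # first stage: gray-scale U-Net (GUN)
    cbcr_hat = color_stage(cbcr)   # second stage: color-scale U-Net (CUN)
    return ycbcr_to_rgb(y_hat, cbcr_hat)
```

With identity stages the pipeline is a color-space round trip, which is a quick sanity check that the split and merge are lossless.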


Other Bibliographic Information
Call number: MEE 24114
Physical description: iv, 37 p. : illustrations ; 30 cm
Language: English
General note: Author's Korean name: 김태환
Advisor (English): Munchurl Kim
Advisor (Korean): 김문철
Includes appendix
Degree: Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliographical note: References: p. 31-35
Subjects: Image restoration
Mixed degradation
Transformer
Computer vision
Deep learning

Comparison of different pre-processing techniques applied to the DIV2K [4] dataset. The public degraded image appears to be degraded by simple noise, blur, or compression artifacts. Our degraded image was degraded not only by noise and blur, but also by a narrowed color gamut or different lighting conditions. Best viewed in zoom.

Comparison of the proposed degradation process and a typical degradation process. Our degradation process consists of multiple different degradations; in particular, color degradation is applied in our degradation process. Meanwhile, a typical degradation process contains only three simple types of degradation.
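The degradation process in this caption can be sketched as a chain of spatial degradations (blur, noise) followed by a color degradation. The kernel size, noise level, and gamut-shrink factor below are illustrative assumptions rather than the thesis's actual parameters, and `narrow_gamut` is a hypothetical stand-in for the color-degradation step.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur applied per channel (spatial degradation)."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = img.astype(np.float64)
    for axis in (0, 1):  # convolve along height, then width
        out = np.apply_along_axis(
            lambda v: np.convolve(np.pad(v, radius, mode="edge"), k, mode="valid"),
            axis, out)
    return out

def add_noise(img, sigma=0.02, rng=None):
    """Additive Gaussian noise, clipped to the valid range."""
    rng = rng if rng is not None else np.random.default_rng(0)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def narrow_gamut(img, factor=0.7):
    """Color degradation: shrink each pixel toward its gray value,
    reducing saturation and the usable color gamut."""
    gray = img.mean(axis=-1, keepdims=True)
    return gray + factor * (img - gray)

def degrade(img, rng=None):
    """Mixed degradation: spatial (blur + noise) followed by color (gamut)."""
    return narrow_gamut(add_noise(gaussian_blur(img), rng=rng))
```

Composing the steps in one function mirrors the caption's point: a typical synthetic pipeline stops at blur/noise/compression, while the proposed one adds a color-space degradation on top.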

Example image with the proposed pre-processing techniques applied. Compared to the ground-truth image (a), our degraded image has a narrowed color gamut (b): the color of the sky is blue in the ground-truth image, while it is dark blue in the narrowed-gamut image. Various types of degradation, including spatial and color degradation, are mixed in image (c). Visualization of a color gamut graph which compares m…

Overall architecture of the proposed TUT. Our TUT consists of two-stage Transformer-based U-Nets. The first-stage U-Net (GUN) is responsible for gray-scale image restoration, and the second-stage U-Net (CUN) takes care of color-scale image restoration. After the input image in RGB color space is converted into YCbCr color space, it is separated into a gray-scale channel (Y) and color-scale channels (Cb, Cr). For each …

Detailed components of the residual block (RB). Instead of utilizing other complicated layers, with this small number of convolutional layers the RB can achieve better performance than a spatial-wise Transformer. Moreover, using the RB gives a big advantage in terms of memory and computation.

Diagram of strip-wise self-attention. The strip-wise self-attention of Stripformer [27] consists of intra-self-attention and inter-self-attention, each operating on the corresponding horizontal/vertical lines. Intra-self-attention (Intra-SA-H/Intra-SA-W) considers correlation within horizontal/vertical lines. Inter-self-attention (Inter-SA-H/Inter-SA-W) considers correlation between horizontal/vertical lines.
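The horizontal variants in this caption can be sketched in a few lines of NumPy. This is a minimal single-head sketch with identity Q/K/V projections for brevity; the real Stripformer layers use learned projections and multiple heads, and pooling each row into one token in `inter_sa_h` is a simplifying assumption about how strip-level tokens are formed.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    """Scaled dot-product attention; q, k, v are (..., tokens, dim)."""
    scale = q.shape[-1] ** -0.5
    w = softmax(q @ np.swapaxes(k, -1, -2) * scale, axis=-1)
    return w @ v

def intra_sa_h(feat):
    """Intra-SA-H: each row is a strip; attention among the W pixels of a row.
    feat: (H, W, C). The leading H axis is batched automatically."""
    return attend(feat, feat, feat)

def inter_sa_h(feat):
    """Inter-SA-H: pool each row into one token, attend among the H row tokens,
    then broadcast the row-level output back along the row (residual add)."""
    rows = feat.mean(axis=1)                 # (H, C) row tokens
    mixed = attend(rows, rows, rows)         # (H, C)
    return feat + mixed[:, None, :]          # redistribute along W

def intra_sa_w(feat):
    """Vertical counterpart of Intra-SA-H: strips are columns."""
    return np.swapaxes(intra_sa_h(np.swapaxes(feat, 0, 1)), 0, 1)
```

The intra variant captures correlation within a strip; the inter variant captures correlation between strips, which together cover both directions at much lower cost than full HW x HW attention.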

Diagram of the spatial-wise Transformer block (STB). Our STB consists of not only a strip-wise self-attention layer [27] but also a spatial-wise modulator (SM) to overcome a disadvantage of the strip-wise self-attention mechanism and thus maximize performance.

Diagram of the channel-wise Transformer block (CTB). Our CTB consists of not only a transposed self-attention layer [28] but also a channel-wise modulator (CM) to overcome a disadvantage of the multi-head transposed self-attention mechanism and thus maximize performance.
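Transposed self-attention (from Restormer [28]) computes attention across channels rather than spatial positions, so the attention map is C x C instead of HW x HW and its size is independent of image resolution. A minimal single-head NumPy sketch, with identity Q/K/V projections as a simplifying assumption:

```python
import numpy as np

def channel_attention(feat, temperature=1.0):
    """Transposed self-attention: one token per channel, attention map is C x C.
    feat: (H, W, C); identity Q/K/V projections for brevity."""
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)                   # HW spatial samples per channel
    q = k = v = tokens.T                              # (C, HW): one token per channel
    # L2-normalize so the dot product is a cosine similarity between channels
    qn = q / (np.linalg.norm(q, axis=-1, keepdims=True) + 1e-8)
    kn = k / (np.linalg.norm(k, axis=-1, keepdims=True) + 1e-8)
    logits = qn @ kn.T / temperature                  # (C, C) channel-affinity map
    weights = np.exp(logits - logits.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)         # row-wise softmax
    out = weights @ v                                 # (C, HW): mix channels
    return out.T.reshape(h, w, c)
```

Because the attention map is only C x C, the cost scales linearly with the number of pixels, which is what makes the channel-wise stage cheap at high resolutions.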

Quantitative comparison on evaluation datasets. PSNR and SSIM were measured in RGB color space. RED: best performance, BLUE: second-best performance.

Ablation study on the validation set. PSNR was measured on the DIV2K [4] validation set (#100).

Qualitative comparison on the 100th image of the DIV2K [4] validation set. Best viewed in zoom.

Qualitative comparison on the 25th image of the DIV2K [4] validation set. Best viewed in zoom.

Qualitative comparison on the 2nd image of the Urban100 [62] test set. Best viewed in zoom.

Qualitative comparison on the 5th image of the Set5 [57] test set. Best viewed in zoom.

Comparative analysis of computational cost and performance. MACs were measured on a 256×256 color image. PSNR was measured on the DIV2K [4] validation set. RED: best performance, BLUE: second-best performance.