KAIST Library

Main bibliographic information
Deep learning-based solutions for empowering visual localization and other vision tasks = 강력한 시각적 위치 파악 및 기타 컴퓨터 비전 문제를 지원하는 딥 러닝 기반 솔루션
Title / Author	Deep learning-based solutions for empowering visual localization and other vision tasks = 강력한 시각적 위치 파악 및 기타 컴퓨터 비전 문제를 지원하는 딥 러닝 기반 솔루션 / Praveen Kumar Rajendran.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2023].

Holdings information

Registration number	8040585
Location / Call number	Academic Cultural Complex (Library), 2nd floor, Theses / MPD 23009
Status	Available (not for loan)

Abstract

Visual localization is essential for many applications, including AR/VR, robotics, and self-driving cars. Traditional methods require large amounts of memory and processing resources to estimate the camera pose in absolute and relative terms. This has given rise to learning-based methods that regress the pose directly, i.e., pose regressors. Existing relative camera pose estimation techniques rely solely on manually or automatically tuned balancing hyperparameters in the loss function. Current absolute pose regressors, on the other hand, generally lack the ability to adapt to different domains of the same scene. In this work, we primarily address these two issues. First, estimating the relative camera pose between a pair of images is formulated using a two-stage training strategy that eliminates the need for balancing hyperparameters in the loss function. Our proposed training strategy improves translation vector estimation by 16.11%, 28.88%, and 52.27% on the KingsCollege, OldHospital, and StMarysChurch scenes, respectively. To demonstrate texture invariance, we explore the generalization of the proposed method by extending the datasets to different scene styles using Generative Adversarial Networks (GANs) for ablation and qualitative studies. Second, we offer a novel lightweight domain-adaptive training framework for retraining any existing absolute pose regressor (APR) to improve its generalization capability. Our lightweight network outperforms the transformer in translation vector estimation on the visual localization benchmark dataset. The results show that, despite using about 24 times fewer FLOPs, 12 times fewer activations, and five times fewer parameters than the state-of-the-art MS-Transformer, our approach outperforms all CNN-based architectures and achieves performance comparable to transformer-based architectures. Our method ranks 2nd and 4th on the Cambridge Landmarks and 7Scenes datasets, respectively. Moreover, our approach outperforms the MS-Transformer and ranks 1st on unseen domains. Furthermore, this work explores inverting an APR to synthesize views, similar to NeRF.
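
For context, the balancing hyperparameter mentioned in the abstract can be illustrated with a weighted pose loss of the kind commonly used by learned pose regressors. This is a minimal sketch under stated assumptions, not the method of the thesis: the function names, the use of Euclidean distances, and the weight `beta` are hypothetical and chosen only for illustration.

```python
import math


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and ground-truth translations."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(t_pred, t_true)))


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def rotation_error(q_pred, q_true):
    """Euclidean distance between the two unit quaternions.

    Simplification: q and -q encode the same rotation, which this
    plain distance does not account for.
    """
    qp, qt = normalize(q_pred), normalize(q_true)
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(qp, qt)))


def weighted_pose_loss(t_pred, q_pred, t_true, q_true, beta):
    """Translation error plus beta times rotation error.

    beta (assumed name) is the balancing hyperparameter: it trades off
    the two terms, which live on different scales (metres vs. unit
    quaternion distance), and normally has to be tuned per scene.
    """
    return translation_error(t_pred, t_true) + beta * rotation_error(q_pred, q_true)


if __name__ == "__main__":
    t_pred, t_true = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
    q_pred, q_true = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]
    for beta in (1.0, 100.0):
        print(beta, weighted_pose_loss(t_pred, q_pred, t_true, q_true, beta))
```

With a small `beta` the translation term dominates the total, and with a large `beta` the rotation term does, which is why such a loss is sensitive to the choice of weight. The two-stage training strategy described in the abstract is aimed at removing that dependence.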


Additional bibliographic information

Call number	MPD 23009
Physical description	vii, 71 p. : illustrations ; 30 cm
Language	English
General note	Author name in Korean : Rajendran Praveen Kumar; Advisor name in English : Dongsoo Har; Advisor name in Korean : 하동수; Includes appendix
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : Division of Future Vehicle,
Bibliography note	References : p. 59-69
Subject	Visual localization; Camera pose; Relative pose estimation; Absolute pose estimation; Domain adaptation
