Bibliographic information
D-TensoRF : tensorial radiance fields for dynamic scenes = A study of tensorial radiance fields for modeling scenes with motion
Title / Author: D-TensoRF : tensorial radiance fields for dynamic scenes / Hankyu Jang.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 2023].

Holdings
Registration number: 8040847
Location / call number: Academic Cultural Complex, B1 preservation stacks / MCS 23032
Status: Available (not for loan)

Abstract

Neural radiance fields (NeRF) have attracted attention as a promising approach to reconstructing 3D scenes. Since NeRF emerged, subsequent studies have modeled dynamic scenes, which include motion or topological changes. However, most of them use an additional deformation network, which slows down training and rendering. The tensorial radiance field (TensoRF) recently showed its potential for fast, high-quality reconstruction of static scenes with a compact model size. In this paper, we present D-TensoRF, a tensorial radiance field for dynamic scenes that enables novel view synthesis at a specific time. We consider the radiance field of a dynamic scene as a 5D tensor. The 5D tensor represents a 4D grid whose axes correspond to X, Y, Z, and time, with a 1D multi-channel feature vector per element. Similar to TensoRF, we decompose the grid into either rank-one vector components (CP decomposition) or low-rank matrix components (the newly proposed MM decomposition). We also use smoothing regularization to reflect the relationship between features at different times (temporal dependency). We conduct extensive evaluations to analyze our models. We show that D-TensoRF with either CP or MM decomposition achieves short training times and significantly low memory footprints, with quantitatively and qualitatively competitive rendering results compared to state-of-the-art methods in 3D dynamic scene modeling.

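As neural radiance fields have drawn attention as a promising 3D reconstruction technique, methods that use them to model dynamic scenes have been studied. However, existing methods use an additional network to predict motion, which makes training and rendering slow. Tensorial radiance fields have shown the potential to model static scenes quickly and with little memory through an explicit data structure and tensor decomposition. This thesis presents a tensorial radiance field for modeling dynamic scenes. It represents a dynamic scene with a 5D tensor composed of a 4D grid and 1D features, and decomposes this tensor with CP decomposition and the newly proposed MM decomposition. It also proposes a smoothing regularization that reflects the relationship between scenes as they change over time. Experiments analyze the proposed approach and show that it models dynamic scenes and synthesizes high-quality images with little memory and short training time.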

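Additional bibliographic information
Call number: MCS 23032
Physical description: v, 31 p. : illustrations ; 30 cm
Language: English
General note: Author's name in Korean: 장한규
Advisor's name in English: Daeyoung Kim
Advisor's name in Korean: 김대영
Thesis: Master's thesis - Korea Advanced Institute of Science and Technology (KAIST) : School of Computing
Bibliography note: References : p. 27-29
Subjects: Neural radiance field
Dynamic scene modeling
Tensor decomposition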

Table of contents


An overview of the data collection process. One (Left) uses many cameras simultaneously to get images from diverse viewing directions. The other (Right) uses a single camera multiple times while changing its location. Note that the object may move between the times the pictures are taken.

Sample results of our D-TensoRF. Our framework enables us to synthesize images at a novel time and viewing direction. The three left columns are the rendered images at novel times, and the three right are the results from unseen viewing directions.

An overview of neural radiance field (NeRF) scene representation. After sampling 5D coordinates (position x and viewing direction d) along camera rays, an MLP (F_Θ) converts those coordinates into a color C and volume density σ. NeRF then synthesizes the image from C and σ using differentiable volume rendering (Eqn. 3.1).
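
The volume rendering step named in this caption can be illustrated with a minimal NumPy sketch. It assumes the standard NeRF quadrature, C = Σ_i T_i (1 − exp(−σ_i δ_i)) c_i with T_i = exp(−Σ_{j<i} σ_j δ_j); Eqn. 3.1 of the thesis is not reproduced in this record, and the function name `render_ray` and the sample values are illustrative only.

```python
import numpy as np

def render_ray(sigma, color, delta):
    """Composite per-sample densities and colors along one ray.

    sigma: (N,) volume densities; color: (N, 3) RGB; delta: (N,) segment lengths.
    """
    alpha = 1.0 - np.exp(-sigma * delta)  # opacity of each segment
    # Transmittance T_i: fraction of light that reaches sample i unoccluded.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = trans * alpha               # contribution of each sample
    return weights @ color, weights

# Illustrative samples: empty space, a thin medium, then a dense surface.
sigma = np.array([0.0, 5.0, 50.0])
color = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
delta = np.full(3, 0.1)
rgb, weights = render_ray(sigma, color, delta)
```

Every operation is differentiable with respect to `sigma` and `color`, which is what lets a photometric loss on `rgb` train the scene representation.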

An overview of tensorial radiance field (TensoRF) scene representation. TensoRF describes the scene with the 3D feature grids G_σ and G_c. Trilinearly interpolated feature values corresponding to the position x are computed from the grids. The appearance feature vector G_c(x) and viewing direction d are fed to the function S to produce the emitted color C. The geometry feature value G_σ(x) is directly used
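
Trilinear interpolation of a feature grid, as mentioned in this caption, can be sketched as follows. This is a generic dense-grid version in NumPy, written for illustration; the function name `trilinear` and the test grid are assumptions, and the thesis's actual implementation (which works on decomposed factors) is not shown in this record.

```python
import numpy as np

def trilinear(grid, x):
    """Trilinearly interpolate grid (Nx, Ny, Nz, C) at continuous index x (3,)."""
    x = np.asarray(x, dtype=float)
    upper = np.array(grid.shape[:3]) - 1
    # Lower corner of the enclosing cell, clipped so the upper corner stays in range.
    i0 = np.clip(np.floor(x).astype(int), 0, upper - 1)
    f = x - i0  # fractional offset inside the cell
    out = np.zeros(grid.shape[3])
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((f[0] if dx else 1 - f[0])
                     * (f[1] if dy else 1 - f[1])
                     * (f[2] if dz else 1 - f[2]))
                out += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return out

# A grid holding a linear function is reproduced exactly by trilinear interpolation.
ii, jj, kk = np.meshgrid(np.arange(4), np.arange(4), np.arange(4), indexing="ij")
grid = (ii + 2 * jj + 3 * kk)[..., None].astype(float)
value = trilinear(grid, [0.5, 1.25, 2.0])
```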

We model the dynamic radiance field as two 4D grids, G_σ and G_c. The 4D grids have X, Y, Z, and time axes. While an element of G_σ is a single-channel feature, an element of G_c is a multi-channel feature. Therefore, G_σ is a 4D tensor, and G_c is a 5D tensor with an additional mode for the feature dimension.

Tensor decomposition of a 4D tensor. While CP decomposition factorizes the 4D tensor into a summation of outer products of vector factors, MM decomposition factorizes the tensor into a summation of outer products of matrix factors (Eqn. 4.3).
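
The two factorizations contrasted in this caption can be sketched with `numpy.einsum`. The CP part follows the usual definition (a sum of outer products of one vector per axis). The matrix-factor part is only an assumed form: Eqn. 4.3 is not reproduced in this record, so pairing the XY axes with the ZT axes is a hypothetical choice made for illustration, as are all sizes and variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
Nx, Ny, Nz, Nt, R = 6, 5, 4, 3, 2  # illustrative sizes; R = number of components

# CP: sum over R components of outer products of four vector factors.
vx, vy, vz, vt = (rng.standard_normal((R, n)) for n in (Nx, Ny, Nz, Nt))
cp = np.einsum("rx,ry,rz,rt->xyzt", vx, vy, vz, vt)

# Matrix-factor form (ASSUMED pairing: an XY matrix with a ZT matrix per component).
m_xy = rng.standard_normal((R, Nx, Ny))
m_zt = rng.standard_normal((R, Nz, Nt))
mm = np.einsum("rxy,rzt->xyzt", m_xy, m_zt)

def cp_at(ix, iy, iz, it):
    """Evaluate one element from the CP factors without building the dense tensor."""
    return np.sum(vx[:, ix] * vy[:, iy] * vz[:, iz] * vt[:, it])
```

Evaluating elements directly from the factors, as `cp_at` does, is what keeps the memory footprint small: the dense 4D tensor never has to be stored.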

Smoothing regularization for models with CP decomposition (Left, Eqn. 4.6) and MM decomposition (Right, Eqn. 4.7). Smoothing regularization is applied by sliding a window of predefined size along the vector factor, or along the mode of the matrix factor, that corresponds to the time axis.
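
A sliding-window penalty on a time-axis factor might look like the sketch below. Eqns. 4.6 and 4.7 are not reproduced in this record, so the penalty used here (mean squared deviation of each entry from its window mean) is an assumed stand-in rather than the thesis's formula; the function name `temporal_smoothness` and the window size are likewise hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def temporal_smoothness(time_factor, window=3):
    """Mean squared deviation of each windowed entry from its window mean.

    time_factor: (R, Nt) array, one time-axis vector per component.
    NOTE: an assumed penalty for illustration, not Eqn. 4.6 or 4.7.
    """
    # Shape (R, Nt - window + 1, window): every length-`window` slice along time.
    windows = sliding_window_view(time_factor, window, axis=1)
    center = windows.mean(axis=-1, keepdims=True)
    return float(np.mean((windows - center) ** 2))

smooth = np.linspace(0.0, 1.0, 10)[None, :]        # slowly varying factor
noisy = smooth + 0.5 * (-1.0) ** np.arange(10)     # same trend plus flicker
loss_smooth = temporal_smoothness(smooth)
loss_noisy = temporal_smoothness(noisy)
```

A term of this kind, added to the training loss, penalizes factors that flicker between neighboring time steps while leaving slowly varying factors almost unaffected.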

Overall framework of D-TensoRF with MM decomposition

Evaluation scores of baseline methods are taken from their papers whenever available. Training time is estimated on a single RTX 3090 GPU.

Per-scene quantitative evaluation on the extended Synthetic NeRF dataset provided by D-NeRF [6].

Additional rendering results of D-TensoRF with CP decomposition (Left) and MM decomposition (Right). D-TensoRF-CP produces a higher PSNR score than D-TensoRF-MM, but D-TensoRF-CP tends to produce blurry results compared to D-TensoRF-MM.

Qualitative results of D-TensoRF-CP, D-TensoRF-MM, and baseline methods on the extended Synthetic NeRF scenes provided by D-NeRF [6].

Evaluation scores of D-TensoRF-CP and D-TensoRF-MM with different numbers of components and final voxels. Columns: #Comp, then PSNR↑, SSIM↑, and LPIPS↓ for each grid resolution (64³ × N_t, 100³ × N_t, 150³ × N_t).
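
Of the three scores in this table, PSNR has a simple closed form, PSNR = 10 · log10(MAX² / MSE). The sketch below assumes images scaled to [0, 1]; the function name `psnr` and the sample images are illustrative, and the thesis's evaluation code is not shown in this record. SSIM and LPIPS need dedicated implementations and are omitted.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mse = np.mean((pred - target) ** 2)
    # Identical images have zero error, which corresponds to infinite PSNR.
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))

target = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)   # uniform error of 0.1 gives MSE = 0.01
score = psnr(pred, target)
```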

Comparison of training speed and memory footprint in various settings. Training time and memory footprint increase as the number of components and voxels increases. Columns: #Comp, then Time↓ and Size (MB)↓ for each grid resolution (64³ × N_t, 100³ × N_t, 150³ × N_t).
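
The trend in this caption can be made concrete by counting stored values. The sketch below compares a dense 4D grid with CP vector factors; it assumes one vector per axis per component, ignores the feature-mode basis and any decoder network, and uses a hypothetical number of time steps, so the counts illustrate the scaling only and are not the sizes reported in the thesis.

```python
def dense_params(nx, ny, nz, nt, channels=1):
    """Number of values in a dense 4D grid with `channels` features per element."""
    return nx * ny * nz * nt * channels

def cp_params(nx, ny, nz, nt, components):
    """Number of values in CP vector factors (one vector per axis per component)."""
    return components * (nx + ny + nz + nt)

n_t = 50           # hypothetical number of time steps
components = 768   # assumed reading of the "CP768" label as a component count
counts = {
    n: (dense_params(n, n, n, n_t), cp_params(n, n, n, n_t, components))
    for n in (64, 100, 150)
}
```

The dense count grows with the cube of the spatial resolution, while the factor count grows linearly in both the resolution and the number of components, which matches the direction of the trend described in the caption.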

Ablation study on smoothing regularization in terms of evaluation scores. Columns: Method, PSNR↑, SSIM↑, LPIPS↓. D-TensoRF-CP768 (150³ × N_t, w/o smoothing regularization): 28.91, 0.95, 0.0

Synthesized images of D-TensoRF-MM192 with different numbers of voxels.

Ablation study on smoothing regularization in terms of rendering quality. The top images are the results with smoothing regularization applied. The bottom images are the results without it.

Rendering results of D-TensoRF-CP. Images of each scene are synthesized under different time conditions with the same viewing direction. The eight scenes are provided by D-NeRF [6].

Rendering results of D-TensoRF-MM. Images of each scene are synthesized under different time conditions with the same viewing direction. The eight scenes are provided by D-NeRF [6].

Rendering results of D-TensoRF-CP. Images of each scene are synthesized under different viewing directions with the same time condition. The eight scenes are provided by D-NeRF [6].

Rendering results of D-TensoRF-MM. Images of each scene are synthesized under different viewing directions with the same time condition. The eight scenes are provided by D-NeRF [6].