Korea Advanced Institute of Science and Technology (KAIST) Library

Main bibliographic information
(An) energy-efficient deep neural network acceleration by exploiting cross-layer weight scaling and bit-level data sharing = 교차 계층 가중치 상호 변환 및 비트 수준 데이터 공유를 활용한 에너지 효율적 심층 신경망 가속
Title / Author	(An) energy-efficient deep neural network acceleration by exploiting cross-layer weight scaling and bit-level data sharing = 교차 계층 가중치 상호 변환 및 비트 수준 데이터 공유를 활용한 에너지 효율적 심층 신경망 가속 / Youngbeom Jung.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2021].

Holdings information

Registration number	8037889
Location / Call number	Academic Cultural Complex (Cultural Hall) preservation stacks / DEE 21096
Status	Available (not for loan)

Abstract

Various accelerators have been proposed to execute deep neural networks efficiently on mobile devices with severe energy and area constraints. However, these accelerators mostly use fixed-point representations for energy-efficient inference and are optimized for higher-level vision tasks such as classification and recognition. This dissertation proposes ways to overcome these limitations so that existing accelerators can perform various tasks energy-efficiently with minimal overhead. To this end, two methods are used to minimize access to external memory, which is the primary source of energy consumption in the accelerator, when performing diverse deep neural network tasks. A brief introduction to both methods follows.

1. Reformation of deep neural networks considering quantization error: Most deep neural networks trained and distributed by suppliers use a floating-point representation for high prediction accuracy. However, embedded accelerators use a fixed-point representation for energy-efficient operation. Therefore, a quantization process that converts to a low-precision fixed-point representation while maintaining high accuracy is indispensable. This dissertation proposes deep neural network modification methods that make networks robust to quantization errors by predicting and analyzing the errors that arise in the quantization process.

2. Data compression using spatial correlation: To reduce the amount of data transferred to and from external memory, most embedded accelerators use compression methods that exploit the sparsity produced by the activation function while supporting parallel operations. However, depending on the deep neural network and the type of data, there are cases in which the sparsity is insufficient for the existing compression methods. This dissertation proposes compression methods that use spatial correlation to reduce communication effectively even for deep neural networks and data with low sparsity.
In summary, this dissertation proposes energy-efficient execution strategies for performing various tasks on embedded accelerators. The overall goal is to improve the energy efficiency of deep neural network accelerators. The main goal is to effectively reduce the amount of communication with external memory during execution with negligible overhead. To this end, the data characteristics of deep neural networks and the limitations of accelerators are analyzed, and the proposed methods address the problems raised.
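The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's method: the abstract does not describe its algorithms, so the sketch swaps in two generic textbook techniques, plain round-and-saturate fixed-point quantization and left-neighbour delta coding. All function names, the 8-bit format with 4 fractional bits, and the sample values are assumptions chosen for illustration only.

```python
def quantize_fixed_point(values, total_bits=8, frac_bits=4):
    """Round floats to a signed fixed-point grid, saturating at the range limits.

    The bit widths are illustrative assumptions, not values from the dissertation.
    """
    scale = 1 << frac_bits                    # grid step is 1 / scale
    qmin = -(1 << (total_bits - 1))           # most negative integer code
    qmax = (1 << (total_bits - 1)) - 1        # most positive integer code
    quantized = []
    for v in values:
        q = max(qmin, min(qmax, round(v * scale)))
        quantized.append(q / scale)
    return quantized


def max_abs_error(original, quantized):
    """Largest absolute difference introduced by quantization."""
    return max(abs(a - b) for a, b in zip(original, quantized))


def zero_fraction(values):
    """Share of exact zeros, i.e. the sparsity a zero-skipping codec relies on."""
    return sum(1 for v in values if v == 0) / len(values)


def delta_encode(values):
    """Keep the first value, then store each value minus its left neighbour."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def signed_bits_needed(values):
    """Width of the narrowest two's-complement field that holds every value."""
    return max((v if v >= 0 else ~v).bit_length() + 1 for v in values)


if __name__ == "__main__":
    # Hypothetical weights: small values round slightly, large ones saturate.
    weights = [0.5, 0.3, -1.26, 100.0]
    q = quantize_fixed_point(weights)
    print("quantized:", q, "max error:", max_abs_error(weights, q))

    # Hypothetical dense activation row: no zeros, but neighbours are close.
    row = [100, 102, 101, 103, 104, 102]
    deltas = delta_encode(row)
    print("zero fraction:", zero_fraction(row))
    print("bits per raw value:", signed_bits_needed(row))
    print("bits per delta:", signed_bits_needed(deltas[1:]))
```

In the sample row there are no zeros for a sparsity-based scheme to skip, yet the differences between neighbours fit in fewer bits than the raw values, which is the kind of redundancy a spatial-correlation approach can exploit. The saturating weight in the first example shows why a network may need to be adjusted before it is converted to a narrow fixed-point format.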

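Recently, various accelerators have been proposed to run deep neural networks efficiently on mobile devices with severe energy and area constraints. However, most of these accelerators use fixed-point representations for energy-efficient inference and are optimized for high-level vision tasks such as classification and recognition. This dissertation proposes methods that overcome these limitations so that existing accelerators can perform various tasks energy-efficiently with minimal overhead. To that end, two methods are used to minimize external memory access, the main source of energy consumption in an accelerator, when performing the various tasks of deep neural networks. The two methods are briefly introduced as follows. 1. Modification of deep neural networks with quantization error in mind: Most deep neural networks trained and distributed by suppliers use a floating-point representation for high prediction accuracy. However, embedded accelerators use a fixed-point representation for energy-efficient operation. A quantization process that converts to a low-precision fixed-point representation while maintaining high accuracy is therefore indispensable. This dissertation predicts and analyzes the errors that arise during quantization and proposes methods for modifying deep neural networks so that they are robust to quantization error. 2. Data compression using spatial correlation: To reduce the amount of data exchanged with external memory, most embedded accelerators use compression methods that exploit the sparsity produced by the activation function while supporting parallel operations. However, depending on the deep neural network and the type of data, there are cases in which the sparsity is insufficient for the existing compression methods. This dissertation proposes compression methods that use spatial correlation to effectively reduce communication with external memory even for deep neural networks and data with low sparsity. In summary, this dissertation proposes methods for performing the various tasks of deep neural networks energy-efficiently on accelerators. The overall goal is to improve the utility of deep neural network accelerators, and the main goal is to effectively reduce, with little overhead, the communication with external memory that occurs during execution. The data characteristics of deep neural networks and the limitations of accelerators are analyzed and raised as problems, and the dissertation aims to show that the proposed methods can solve them.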

Other bibliographic information
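Call number	DEE 21096
Physical description	iv, 59 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 정영범. Advisor's name in English: Lee-Sup Kim. Advisor's name in Korean: 김이섭. Includes appendix.
Dissertation	Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 53-58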
