KAIST Library

Main bibliographic information
(A) heterogeneous computing-in-memory and neural-processing-unit architecture for an energy efficient floating-point DNN acceleration = 에너지 효율적인 부동 소수점 연산을 위한 메모리 내 컴퓨팅-심층 신경망 가속기 이기종 아키텍쳐
Title / Author	(A) heterogeneous computing-in-memory and neural-processing-unit architecture for an energy efficient floating-point DNN acceleration = 에너지 효율적인 부동 소수점 연산을 위한 메모리 내 컴퓨팅-심층 신경망 가속기 이기종 아키텍쳐 / Wonhoon Park.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2023].
Online Access	Full text not publicly available

Holdings

Registration number	8041288
Location / Call number	Academic Cultural Complex (Library), 2nd floor, Theses / MEE 23123
Status	Available (not for loan)

Abstract

This work presents an energy-efficient digital-based computing-in-memory (CIM) processor that supports floating-point (FP) deep neural network (DNN) acceleration. Previous FP-CIM processors have two limitations: processors with post-alignment show low throughput due to serial operation, and processors with pre-alignment incur truncation error. To resolve these problems, we focus on the statistical observation that outliers exist with respect to the shift amount in pre-alignment-based FP operation. Because these outliers decrease energy efficiency through long operation cycles, they need to be processed separately. The proposed Hetero-FP-CIM integrates CIM arrays and a shared NPU, which compute the dense inliers and the sparse outliers, respectively. It also includes an efficient weight caching system that avoids copying the entire weight set into the shared NPU. The proposed Hetero-FP-CIM is simulated in 28 nm CMOS technology and occupies 2.7 mm$^2$. As a result, it achieves 5.99 TOPS/W on ImageNet (ResNet50) with the bfloat16 representation.

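This thesis proposes a heterogeneous processor that combines computing-in-memory (CIM) and a neural processing unit (NPU) for floating-point operations in deep neural networks (DNNs). Because of the data alignment problem in floating-point arithmetic, existing CIM processors suffer from either (1) low throughput, caused by processing multiple data serially, or (2) data loss during the pre-alignment of multiple data. This thesis classifies all data into two groups (inliers and outliers) and proposes a heterogeneous processor consisting of a CIM that processes the inliers with high energy efficiency and an NPU that energy-efficiently processes the outliers, which would otherwise degrade the CIM's efficiency. In addition, a caching system is introduced to resolve the weight duplication and processing-speed imbalance that arise between the heterogeneous compute units. The proposed processor is simulated in 28 nm CMOS technology and occupies 2.7 mm$^2$. As a result, it achieves an energy efficiency of 5.99 TFLOPS/W on ImageNet (ResNet-50).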
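
The inlier/outlier split described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's hardware design: it assumes the shift amount of a value is the gap between its exponent and the largest exponent in the group, that a value counts as an outlier when this gap exceeds a threshold, and that the CIM-like path keeps a fixed number of mantissa bits after alignment. The names `split_by_shift`, `prealigned_sum`, `hetero_sum`, `max_shift`, and `mant_bits` are hypothetical and do not come from the record.

```python
import math


def split_by_shift(values, max_shift=4):
    """Partition values into (inliers, outliers) by exponent gap to the group maximum.

    Assumption: shift amount = e_max - e, and a value is an outlier
    when its shift amount is larger than max_shift.
    """
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return [], []
    exps = [math.frexp(v)[1] for v in nonzero]
    e_max = max(exps)
    inliers, outliers = [], []
    for v, e in zip(nonzero, exps):
        (inliers if e_max - e <= max_shift else outliers).append(v)
    return inliers, outliers


def prealigned_sum(values, mant_bits=8):
    """Sum values after aligning every mantissa to the largest exponent.

    Bits shifted below the mant_bits-wide window are truncated, which
    models the truncation error of a pre-alignment datapath.
    """
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return 0.0
    e_max = max(math.frexp(v)[1] for v in nonzero)
    scale = 2.0 ** (mant_bits - e_max)  # maps each value to an aligned integer mantissa
    acc = sum(math.trunc(v * scale) for v in nonzero)  # integer accumulation
    return acc / scale


def hetero_sum(values, max_shift=4, mant_bits=8):
    """Inliers take the truncating pre-aligned path, outliers an exact path."""
    inliers, outliers = split_by_shift(values, max_shift)
    return prealigned_sum(inliers, mant_bits) + sum(outliers)


if __name__ == "__main__":
    data = [1.0, 0.5, 2 ** -10]
    print(prealigned_sum(data))  # small term is truncated away
    print(hetero_sum(data))      # small term is kept by the separate path
```

In this toy model the value `2 ** -10` needs a shift of 10 positions, so a pure pre-alignment sum with an 8-bit window drops it, while routing it to a separate full-precision path preserves it. The threshold and bit width are arbitrary illustration values.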

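Other bibliographic information

Call number	MEE 23123
Physical description	iii, 17 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 박원훈; Advisor's name in English: Hoi-jun Yoo; Advisor's name in Korean: 유회준; Including appendix
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 15
Subject	Computing-in-memory; SRAM; Deep neural network; Floating-point; Cache system; Outlier-handling; Heterogeneous processor; Heterogeneous compute accelerator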
