KAIST Library

Main bibliographic information
Processing in-memory architecture for binary neural networks = 바이너리 신경망을 위한 메모리 내부 처리 아키텍처 연구
Title / Author	Processing in-memory architecture for binary neural networks = 바이너리 신경망을 위한 메모리 내부 처리 아키텍처 연구 / Hyeonuk Kim.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2021].

Holdings

Registration no.	8037878
Location / Call number	Academic Cultural Complex (Cultural Hall) preservation stacks / DEE 21085


Status	Available (not for loan)


Abstract

Popular deep learning technologies suffer from memory bottlenecks, which significantly degrade energy efficiency, especially in mobile environments. Processing in-memory (PIM) for binary neural networks (BNNs) has emerged as a promising way to mitigate such bottlenecks, and various related works have been presented. However, their performance is severely limited by the overhead of modifying conventional memory architectures. To alleviate this degradation, this dissertation proposes NAND-Net, an efficient architecture that minimizes the computational complexity of PIM for BNNs. Based on the observation that BNNs contain many redundancies, each convolution is decomposed into sub-convolutions and the unnecessary operations are eliminated. In the remaining convolution, each binary multiplication (bitwise XNOR) is replaced by a bitwise NAND operation, which can be implemented without any bit-cell modification. The NAND operation also creates an opportunity to simplify the subsequent binary accumulations (popcounts): their cost is reduced by exploiting the data patterns of the NAND outputs. Compared with prior state-of-the-art designs, NAND-Net achieves 1.04-2.4x speedup and 34-59% energy saving.

Meanwhile, the efficiency of mixed-signal PIM architectures is largely compromised by the cost of domain conversion circuits such as analog-to-digital converters (ADCs). This dissertation identifies the root causes of the need for such ADCs and proposes solutions to address them. First, the BNN algorithm is decomposed and reconstructed to suit the crossbar arrays in PIM. This recombination minimizes redundant operations in the BNN, halving the number of crossbar arrays with ADCs whose outputs are accumulated. Moreover, the dynamic range of the bit-line currents in each crossbar array is reduced by exploiting the inter-layer dependency of BNNs. Handling the partial-sum current distribution appropriately makes it possible to perform BNN processing without domain conversion, bypassing the need for ADCs completely. The experimental results show that the proposed architecture achieves 3.44x speedup and 91.5% energy saving, demonstrating the usability of PIM architectures for BNNs in various applications.
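
The XNOR-to-NAND replacement described above can be illustrated with a generic bit-counting identity. The sketch below is not taken from the dissertation and does not reproduce its sub-convolution decomposition; it only shows, under the usual assumption that bit 1 encodes +1 and bit 0 encodes -1, that a binary dot product computed with XNOR and popcount can also be recovered from the popcount of NAND outputs plus the popcounts of the two operands. The function names `dot_xnor` and `dot_nand` are hypothetical.

```python
def popcount(x: int) -> int:
    """Count the 1 bits of a non-negative integer."""
    return bin(x).count("1")


def dot_xnor(a: int, w: int, n: int) -> int:
    """Binary dot product of two n-bit vectors via XNOR + popcount.

    Assumes 0 <= a, w < 2**n, with bit 1 meaning +1 and bit 0 meaning -1.
    """
    mask = (1 << n) - 1
    matches = popcount(~(a ^ w) & mask)  # positions where the bits agree
    return 2 * matches - n


def dot_nand(a: int, w: int, n: int) -> int:
    """Same dot product, using NAND as the only two-operand bitwise step.

    Identity (generic, an illustration only):
      matches = n - popcount(a) - popcount(w) + 2 * popcount(a AND w)
      popcount(a AND w) = n - popcount(a NAND w)
    """
    mask = (1 << n) - 1
    nand_ones = popcount(~(a & w) & mask)
    and_ones = n - nand_ones
    matches = n - popcount(a) - popcount(w) + 2 * and_ones
    return 2 * matches - n


if __name__ == "__main__":
    print(dot_xnor(0b1011, 0b1101, 4), dot_nand(0b1011, 0b1101, 4))
```

In this formulation the operand popcounts do not depend on the pairing of activations and weights, so they could in principle be computed once and reused; whether NAND-Net exploits the identity in this way is not stated in the abstract.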

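State-of-the-art deep learning is limited in performance by memory bottlenecks, and its low energy efficiency makes practical use difficult, especially in mobile environments. To mitigate this, binary neural networks (BNNs) and processing in-memory (PIM) have emerged as solutions, and architectures combining the two have been proposed. However, prior work loses much of its efficiency to the overhead of modifying the conventional memory structure. The key for a PIM architecture is to maximize computing performance while preserving the efficiency of the memory's original purpose, reading and writing data. Prior work maps BNN operations onto memory as they are and accepts changes to the memory structure relatively liberally. As a result, large overheads arise in memory cell density, latency, and energy efficiency, and the implemented computation becomes less efficient. This dissertation therefore proposes techniques for optimizing PIM architectures for BNNs. First, it proposes NAND-Net, which minimizes computational complexity to make BNNs suitable for PIM. Based on the observation that BNN computation contains much redundancy, each convolution is decomposed and unnecessary operations are removed; as a result, XNOR-based multiplication is replaced by NAND-based multiplication, which can be implemented without changing the memory cell structure. In addition, the data patterns of the NAND outputs are exploited to halve the overhead of popcount-based accumulation. Compared with prior work, NAND-Net shows up to 2.4x speedup and 59% lower energy consumption. Furthermore, to maximize computing performance, it proposes a technique that bypasses domain conversion through analog-to-digital converters (ADCs), the performance bottleneck of mixed-signal PIM. It identifies two root causes of the need for domain conversion and presents a solution for each. First, BNN computation is restructured to halve the number of memory cell arrays required, which also halves the number of ADCs. Second, inter-layer dependency is exploited to reduce the magnitude of the bit-line current in each memory cell array and to adjust the current distribution appropriately, providing a way to bypass domain conversion entirely. Removing the ADCs in this way yields 3.44x speedup and 91.5% lower energy consumption. The proposed techniques minimize the overhead of combining BNNs with PIM, bringing their use in a variety of applications within reach.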
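
The ADC discussion above turns on how much of an accumulated sum the next layer actually consumes. As a generic illustration only (not the dissertation's circuit or its specific technique), a binary neuron with a sign-style activation passes on a single bit: whether the accumulated dot product reaches a threshold. The function name `binary_neuron` and the default threshold of 0 are assumptions made for this sketch.

```python
def popcount(x: int) -> int:
    """Count the 1 bits of a non-negative integer."""
    return bin(x).count("1")


def binary_neuron(a: int, w: int, n: int, threshold: int = 0) -> int:
    """Return the 1-bit activation of one binary neuron.

    Assumes 0 <= a, w < 2**n, with bit 1 meaning +1 and bit 0 meaning -1.
    The multi-bit dot product is used only for one comparison, so the
    following layer never sees its full value.
    """
    mask = (1 << n) - 1
    dot = 2 * popcount(~(a ^ w) & mask) - n
    return 1 if dot >= threshold else 0


if __name__ == "__main__":
    print(binary_neuron(0b1011, 0b1101, 4))
```

In a mixed-signal array the same comparison would act on a bit-line current rather than an integer; how the dissertation arranges the partial sums so that this suffices is not detailed in the abstract.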

Other bibliographic information
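Call number	DEE 21085
Physical description	v, 56 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 김현욱. Advisor's name in English: Lee-Sup Kim. Advisor's name in Korean: 김이섭. Includes appendix.
Dissertation	Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 49-54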
