KAIST Library

Main bibliographic information
(An) FPGA-based preprocessing system for GPU-partitioned machine learning inference server = GPU 분할 기술이 적용된 머신러닝 추론 서버를 위한 FPGA 기반의 전처리 시스템
Title / Author	(An) FPGA-based preprocessing system for GPU-partitioned machine learning inference server = GPU 분할 기술이 적용된 머신러닝 추론 서버를 위한 FPGA 기반의 전처리 시스템 / Gwangoo Yeo.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2024].
Online Access	Full text not publicly available

Holdings

Registration number

8042167

Location / Call number

Academic Cultural Complex (Library), 2nd floor, Theses

MEE 24055


Status

Available (not for loan)


Abstract

In machine learning inference servers, unlike training servers, inference requests arrive irregularly and must be completed within a limited time. Consequently, inference is performed with small batch sizes, which leads to inefficient utilization of GPU resources. Recent GPUs provide partitioning technology, which divides a single hardware resource into independent partitions sized to suit each user and thereby allows GPU resources to be used efficiently. Applying this technology to inference servers increases the processing capacity and resource utilization of GPUs. However, it also creates a bottleneck in the CPU-side preprocessing stage associated with inference requests. This thesis analyzes the bottleneck points of the preprocessing stage in a GPU-partitioned machine learning inference server and proposes an FPGA-based hardware design that offloads data preprocessing to increase the overall processing throughput of the ML inference server.

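In machine learning inference servers, unlike in training servers, inference requests are assigned irregularly and computation must finish within a limited time. As a result, computation is performed with small batch sizes, and the resources of the graphics processing unit (GPU) are not used efficiently. Recent GPUs offer a partitioning technology that splits a single hardware resource into independent hardware units of a size suited to the user so that resources can be used efficiently. In inference servers that use this technology, as GPU throughput and resource utilization rise, the preprocessing performed on the central processing unit (CPU) for each inference request becomes a bottleneck. This thesis analyzes the preprocessing bottlenecks in a machine learning inference server with GPU partitioning applied and proposes a system that uses an FPGA to accelerate the throughput of the entire machine learning server.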

Additional bibliographic information

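Call number	MEE 24055
Physical description	iv, 25 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean: 여관구. Advisor's name in English: Minsoo Rhu. Advisor's name in Korean: 유민수. Including appendix.
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 21-24
Subject	CPU; GPU; FPGA; Inference server; GPU partitioning technique; Machine learning inference server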
