Bibliographic information
3D Graphics rendering processor using out-of-order memory access = 비순차적 메모리 접근을 이용한 삼차원 그래픽스 렌더링 프로세서
Title / Author 3D Graphics rendering processor using out-of-order memory access = 비순차적 메모리 접근을 이용한 삼차원 그래픽스 렌더링 프로세서 / Dong-Hyun Kim.
Publication [Daejeon : Korea Advanced Institute of Science and Technology, 2002].

Holdings

Registration number: 8012993
Location / call number: Academic Cultural Complex (Cultural Hall), preservation stacks / MEE 02016
Status: Available (not for loan)

Abstract

As demand for 3D graphics grows, hardware accelerators dedicated to 3D graphics algorithms have become popular. Commodity graphics processors now render hundreds of millions of pixels per second, but further improvement is still required to render photo-realistic scenes at real-time rates. The factor limiting the growth of 3D graphics rendering performance is memory bandwidth, because the 3D graphics pipeline has no temporal locality that would make a cache effective. In this paper, an approach to increase the efficiency of memory bandwidth use is proposed. In DRAM-based memory, precharge time and row-activation time are much longer than the data cycle time. Address scheduling groups accesses to the same row together, which reduces the DRAM turnaround cycles. Policies suited to graphics systems are suggested and analyzed in this paper. An out-of-order fragment processing architecture is also proposed to accompany out-of-order memory access. In a graphics system, whichever data is ready first can be processed first, because dependencies between fragments are rare. With out-of-order fragment processing, a fragment whose data have all been fetched does not have to wait to be served, even if earlier fragments have not yet been processed. This requires no reorder buffer and reduces the number of in-flight fragments that must be stored to avoid pipeline stalls. Additionally, a dedicated texel fetch unit architecture is suggested in this paper. Pixels that are close on screen also have very high locality in texture space owing to the mip-map structure. Selecting an adequate level of detail makes the texel size almost the same as the pixel size, and some texture filtering algorithms need many texels adjacent to the footprint center. Neighboring pixels may therefore need several of the same texels. Fetching redundant texel requests at once reduces cache accesses and lets a single-port cache operate like a multi-port cache. A texel fetch unit combines arriving requests with pending requests if their addresses are equal.
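The request combining described in the last sentences can be illustrated with a minimal Python sketch. It is an illustration under assumptions, not the hardware design from the thesis: the class name `TexelFetchUnit`, its methods, and the dictionary keyed by texel address are all hypothetical.

```python
from collections import OrderedDict


class TexelFetchUnit:
    """Toy model of request combining (hypothetical, not the thesis design)."""

    def __init__(self):
        # Maps a texel address to the pixel IDs waiting for it, ordered by
        # the arrival of the first request for each address.
        self.pending = OrderedDict()
        self.cache_accesses = 0

    def request(self, pixel_id, texel_addr):
        """Queue a request; return True if it merged with a pending one."""
        if texel_addr in self.pending:
            self.pending[texel_addr].append(pixel_id)
            return True
        self.pending[texel_addr] = [pixel_id]
        return False

    def service_one(self, texture_memory):
        """Do one cache access and deliver the texel to every waiter."""
        texel_addr, waiters = self.pending.popitem(last=False)
        self.cache_accesses += 1
        value = texture_memory[texel_addr]
        return [(pixel_id, value) for pixel_id in waiters]
```

In this model, four requests that touch only two distinct addresses cost two cache accesses, which is the sense in which a single-port cache can serve several neighboring pixels at once.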
A texel cache must not stall requests after a miss. This is made possible by inserting a request queue between the tag and the cache, which also gives a cache-prefetching effect. An architecture using these schemes is proposed in this paper. Four pixel processors take charge of four pixel pipelines. Each pixel processor is responsible for its own pixel region and renders only the pixels in that region; the regions are interleaved at a fine granularity for load balancing. The frame buffer is divided into four memory modules in the same way, so that each pixel processor accesses one memory module. Similarly, there are eight texel fetch units. All memory accesses from the pixel processors and texel fetch units are sent to memory controllers, which select the best operations to maximize effective memory bandwidth by checking the SDRAM states. Simulation shows that the effective DDR SDRAM memory bandwidth reaches 85% of peak bandwidth, giving a fill rate of 640 Mpixels/sec. The dedicated texel fetch unit saves memory bandwidth and delivers 5.4 texels per cycle to one pixel processor on average, making 3 Gtexels/sec available.
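Why serving same-row accesses together raises effective bandwidth can be seen in a toy timing model. The sketch below is a simplification under stated assumptions, not the scheduling policy analyzed in the thesis: it models a single bank, the cycle costs `ROW_MISS_PENALTY` and `DATA_CYCLES` are made-up values, and the greedy "open row first" rule is hypothetical. It also assumes that requests may be reordered freely, which the abstract justifies by the rarity of dependencies between fragments.

```python
ROW_MISS_PENALTY = 6   # assumed cycles for precharge plus row activation
DATA_CYCLES = 1        # assumed cycles per data transfer


def cost(rows):
    """Total cycles to serve accesses to the given DRAM rows, in order."""
    cycles = 0
    open_row = None
    for row in rows:
        if row != open_row:
            # A different row must be precharged and activated first.
            cycles += ROW_MISS_PENALTY
            open_row = row
        cycles += DATA_CYCLES
    return cycles


def schedule_open_row_first(rows):
    """Greedy reorder: serve a hit on the open row, else the oldest request."""
    pending = list(rows)
    order = []
    open_row = None
    while pending:
        # Index of the oldest request hitting the open row, or 0 if none.
        idx = next((i for i, r in enumerate(pending) if r == open_row), 0)
        open_row = pending.pop(idx)
        order.append(open_row)
    return order
```

For an access stream that alternates between two rows, such as `[1, 2, 1, 2, 1, 2]`, every in-order access pays the row-miss penalty, whereas the reordered stream pays it only twice.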

Other bibliographic information
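Call number MEE 02016
Physical description v, 54, [4] p. : ill. ; 26 cm
Language English
General note Author name in Korean : 김동현
Advisor name in English : Lee-Sup Kim
Advisor name in Korean : 김이섭
Thesis Thesis (Master's) - Korea Advanced Institute of Science and Technology : Electrical and Electronic Engineering,
Bibliography note Includes references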