Main Bibliographic Information
An adaptive bio-workflow scheduling system and its performance evaluation in cloud = 클라우드에서의 적응형 바이오 워크플로우 스케쥴링 방식과 성능 분석
Title / Author An adaptive bio-workflow scheduling system and its performance evaluation in cloud = 클라우드에서의 적응형 바이오 워크플로우 스케쥴링 방식과 성능 분석 / Jagdorj Tumurpurev.
Author Jagdorj Tumurpurev ; T. Jagdorj
Publication [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 2012].

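Holdings Information

Registration No. 8023772
Location / Call No. Academic Cultural Complex (Cultural Hall), Preservation Stacks / MICE 12003
Status Available ; loanable
Due Date (none)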
Abstract Information

Abstract Cloud computing technologies have made it possible to analyze big data sets on scalable computing infrastructure. In many scientific fields, such as bioinformatics and astronomy, applications are composed of complex workflow tasks and continuously generate huge amounts of data, which require large storage capacity as well as high-speed computing resources. DNA sequence analysis, where very large data sets are now generated at reduced cost using Next-Generation Sequencing (NGS) methods, is an area that can greatly benefit from cloud-based infrastructures. Transferring, storing, and analyzing these datasets has become a major challenge. Although many distributed approaches have been proposed, they focus on static scheduling with a batch processing scheme on a local computing farm and local data storage. For a large-scale workflow system, it is essential and valuable to outsource all or part of its tasks to public clouds in order to reduce resource cost. However, transferring huge datasets between these potential resources and the local node has become a major challenge. Reducing the transfer time, as well as the imbalance in completion time across different problem sizes, is very important for making the overall process faster. In this thesis, we discuss current issues in resource provisioning, scheduling, and computing models in distributed environments, and study relevant approaches to solving them. We also propose an adaptive workflow scheduling scheme, including a run-time data distribution and collection service for reducing the data transfer time. The proposed scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We present an initial implementation and evaluation of this approach for a Workflow Management System (WMS) composed of a well-known sequence alignment algorithm. The experimental results show that the proposed scheme is promising.
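
The abstract's idea of choosing an allocation ratio of computing elements across datasets to minimize the makespan can be illustrated with a small sketch. This is not the thesis's algorithm, which the record does not describe. It is a simple greedy stand-in under stated assumptions: each dataset's completion time is taken to be its workload divided by the number of elements assigned to it, transfer time is ignored, and the names `allocate_elements`, `workloads`, and `total_elements` are hypothetical.

```python
import heapq


def allocate_elements(workloads, total_elements):
    """Split `total_elements` computing elements across datasets.

    Assumption (not from the thesis): a dataset with workload w that is
    given n elements finishes in w / n time units. The makespan is the
    largest of these completion times.
    """
    if not workloads:
        raise ValueError("need at least one dataset")
    if total_elements < len(workloads):
        raise ValueError("need at least one element per dataset")

    # Every dataset starts with one element.
    alloc = [1] * len(workloads)

    # Max-heap on estimated completion time (negated for heapq's min-heap).
    heap = [(-float(w), i) for i, w in enumerate(workloads)]
    heapq.heapify(heap)

    # Hand each remaining element to the dataset that currently finishes last.
    for _ in range(total_elements - len(workloads)):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-workloads[i] / alloc[i], i))

    makespan = max(w / n for w, n in zip(workloads, alloc))
    return alloc, makespan


if __name__ == "__main__":
    alloc, makespan = allocate_elements([60, 30, 10], 10)
    print(alloc, makespan)  # [6, 3, 1] 10.0
```

With workloads of 60, 30, and 10 units and ten elements, the sketch settles on a 6:3:1 split, so all three datasets finish at the same time. An adaptive scheduler in the sense of the abstract would presumably re-estimate the workloads at run time and also account for data transfer, which this sketch leaves out.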

Other Bibliographic Information

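Call No. MICE 12003
Physical Description vii, 49 p. : ill. ; 30 cm
Language English
General Note Author's name in Korean : T. Jagdorj
Advisor's name in English : Chan-Hyun Youn
Advisor's name in Korean : 윤찬현
Includes appendix
Dissertation Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Information and Communications Engineering,
Bibliography Note References : p. 44-45
Subject Cloud computing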
DNA sequence analysis
Workflow Scheduling
Resource Provisioning
Stream Processing