Korea Advanced Institute of Science and Technology (KAIST) Library

Main bibliographic information
Summarizing distribution : submodular probability density cover = 서브모듈러 확률 밀도 커버를 이용한 분포의 요약
Title / Author	Summarizing distribution : submodular probability density cover = 서브모듈러 확률 밀도 커버를 이용한 분포의 요약 / Janghoon Cho.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2018].

Holdings information

Registration number

8032434

Location / Call number

Academic Cultural Complex (Cultural Hall), preservation stacks

DEE 18005

Item status

Available (not for loan)

Abstract

This paper considers the problem of summarizing a probability measure with a small subset of diverse samples such that the representation quality, as defined by a probability density cover function, is maximized. Many existing studies have addressed this problem using submodularity; however, there does not seem to be a definitive measure for summarization. A probability density cover function that is both monotone and submodular is defined to measure sample quality in terms of coverage. The cover function generalizes other submodular functions such as facility location, sum coverage, and truncated vertex cover. The cover function is maximized with the lazy greedy algorithm, which guarantees a lower bound that is a constant ratio of the optimal value. Simulation results show that maximizing the probability density cover function identifies sample subsets of high diversity and relevance, and that these subsets can outperform random sampling both in fidelity of representing the probability measure and in accuracy of moment estimation. The proposed cover function is also applied to the data subset selection task. Experimental results show that GMM and k-NN-based classifiers trained on data subsets selected by the proposed algorithm from the TIDIGIT and MNIST datasets achieve better accuracy than those trained with existing methods.

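This paper addresses the problem of summarizing a probability distribution with a small subset of diverse samples, and proposes the probability density cover function as a measure of how well those samples represent the distribution. Most existing studies address this problem on the basis of submodularity, but they do not clearly propose a definitive measure for summarization. The proposed monotone submodular probability density cover function is defined to measure the quality of the selected samples in terms of coverage. The function is proved to be a generalization of other submodular functions such as facility location, sum coverage, and truncated vertex cover. The function is maximized with a simple greedy algorithm, which guarantees a lower bound equal to a constant ratio of the optimal value. Simulation results show that samples selected by the proposed algorithm achieve high diversity and relevance while outperforming randomly drawn samples in probability density representation and moment estimation accuracy. The proposed cover function is also applied to the data subset selection problem. Experimental results show that Gaussian mixture model and k-nearest-neighbor classifiers trained on data subsets selected with the proposed function from the TIDIGIT and MNIST datasets achieve better classification performance than those using existing functions.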
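
As a rough illustration of the lazy greedy maximization mentioned in the abstract, the sketch below applies it to the facility location function, one of the special cases the abstract names. It is not the thesis's probability density cover function, whose definition is not given in this record. The function names `facility_location` and `lazy_greedy`, the Gaussian-kernel similarity, and the toy data are all assumptions made for illustration. For a monotone submodular function under a cardinality constraint, the classical greedy guarantee is a factor of 1 - 1/e of the optimum; the abstract does not say which constant the thesis establishes.

```python
import heapq

import numpy as np


def facility_location(sim, selected):
    """f(S) = sum_i max_{j in S} sim[i, j], with f(empty set) = 0."""
    if not selected:
        return 0.0
    return float(sim[:, selected].max(axis=1).sum())


def lazy_greedy(sim, k):
    """Pick up to k columns of a nonnegative similarity matrix by lazy greedy.

    Stale marginal gains stay valid upper bounds because the objective is
    submodular, so only the top heap entry needs re-evaluation each round.
    """
    n_points, n_candidates = sim.shape
    best = np.zeros(n_points)  # best similarity of each point to the chosen set
    # Max-heap via negated gains; the initial gain of j is f({j}).
    heap = [(-float(sim[:, j].sum()), j) for j in range(n_candidates)]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < k:
        _, j = heapq.heappop(heap)
        # Fresh marginal gain of j given the current selection.
        gain = float(np.maximum(sim[:, j] - best, 0.0).sum())
        if not heap or gain >= -heap[0][0]:
            # No other candidate's upper bound beats the refreshed gain.
            selected.append(j)
            best = np.maximum(best, sim[:, j])
        else:
            heapq.heappush(heap, (-gain, j))
    return selected


# Toy usage: 200 random 2-D points with a Gaussian-kernel similarity
# (a hypothetical choice, not taken from the thesis).
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
sim = np.exp(-sq_dists / 2.0)
summary = lazy_greedy(sim, k=10)
print(summary, facility_location(sim, summary))
```

The heap avoids recomputing every candidate's gain in each round, which is the usual reason to prefer lazy greedy over plain greedy; both select a set with the same greedy guarantee.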

Other bibliographic information
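Call number	DEE 18005
Physical description	v, 53 p. : illustrations ; 30 cm
Language	English
General note	Author's name in Korean : 조장훈 ; Advisor's name in English : Chang Dong Yoo ; Advisor's name in Korean : 유창동
Thesis	Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References : p. 46-49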
