Bibliographic Information
범주형 클러스터링을 위한 내재된 차이 기반의 부동성 척도 = Latent difference-based dissimilarity metric for categorical clustering
Title / Author: 범주형 클러스터링을 위한 내재된 차이 기반의 부동성 척도 = Latent difference-based dissimilarity metric for categorical clustering / Min-Ho Park (박민호).
Author: 박민호 ; Park, Min-Ho
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 2003].

Holdings Information

Registration No.

8014205

Location / Call Number

Academic Cultural Complex (Cultural Complex), Preservation Stacks

MCS 03018


Status

Available

Available for loan

Due date

Abstract

Clustering is a useful data mining tool for discovering hidden patterns in a dataset. Many clustering algorithms have been proposed for numeric data, but only a few address categorical data, even though many relational databases contain categorical datasets. In this paper, we propose a new dissimilarity metric for categorical datasets. Existing binary-style dissimilarity metrics often fail to measure the similarity between data objects correctly when synonymous choices or unimportant attributes exist. Our metric addresses this shortcoming by computing how far apart two different choices are, based on information from the whole dataset. To show its effectiveness, we conducted experiments on both synthetic and real categorical datasets. When our metric was applied to traditional clustering algorithms, their clustering results improved greatly over those obtained with previous metrics. Moreover, a hierarchical algorithm even outperformed well-known clustering algorithms dedicated to categorical datasets.
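
The abstract does not give the formula of the proposed metric, so the sketch below does not attempt to reproduce it. It only illustrates the limitation attributed to binary-style metrics, using simple matching (one point per mismatched attribute) as an assumed representative of that family. The records and attribute values are hypothetical.

```python
def simple_matching_dissimilarity(a, b):
    """Binary-style dissimilarity: count the attributes on which two records differ.

    Each attribute contributes 0 if the values are identical and 1 otherwise,
    regardless of how related the two differing values might be.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical records with attributes (color, size).
r1 = ("red", "small")
r2 = ("crimson", "small")  # "crimson" is nearly synonymous with "red"
r3 = ("blue", "small")     # "blue" is an unrelated choice

d12 = simple_matching_dissimilarity(r1, r2)
d13 = simple_matching_dissimilarity(r1, r3)

# Both pairs differ on exactly one attribute, so the metric cannot tell
# a near-synonym from an unrelated value.
print(d12, d13)
```

Under this kind of metric the two pairs are equally dissimilar, which is the case the abstract says a dataset-informed distance between choices is meant to handle.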

Additional Bibliographic Information

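Call number: MCS 03018
Physical description: iii, 33 p. : ill. ; 26 cm
Language: Korean
General note: Author's name in English: Min-Ho Park
Advisor's name in Korean: 이윤준
Advisor's name in English: Yoon-Joon Lee
Thesis: Thesis (Master's) - Korea Advanced Institute of Science and Technology : Major in Computer Science,
Bibliography note: Includes references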
Subject: Clustering (클러스터링)
Categorical data (범주형 데이터)