Bibliographic Information
문서 범주화에서 연어를 기반으로 한 문서 표현 = Text representation based on collocation in text categorization
Title / Author: 문서 범주화에서 연어를 기반으로 한 문서 표현 = Text representation based on collocation in text categorization / 장병규.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (한국과학기술원), 1997].

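Holdings

Registration number: 8007857
Location: Academic Cultural Complex (학술문화관), preservation stacks
Call number: MCS 97039
Status: Available (not for loan)

Registration number: 9003353
Location: Seoul campus, thesis shelves
Call number: MCS 97039 c. 2
Status: Available (not for loan)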

Abstract

The way texts are represented strongly influences the effectiveness of text categorization systems, which classify documents with respect to a set of predefined categories, yet most attempts to produce better text representations have been unsuccessful. This lack of success arises in part because most previous feature sets for representing texts, such as single terms and syntactic phrases, take no account of the predefined categories. This thesis presents a text representation method based on collocations: recurrent combinations of words that co-occur more often than expected by chance and that correspond to arbitrary word usages. Collocations are cohesive lexical clusters and are category-dependent, that is, different collocations are extracted from different categories. However, using raw collocations as a feature set produces too many features. To resolve this problem without losing the good properties of collocations, we propose clustered collocations, which merge very similar collocations into a single feature. Using clusters of collocations as the feature set for text representation gave better results than using single terms. It was especially effective when the categories were difficult to discriminate.
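
The idea of word combinations that "co-occur more often than expected by chance", extracted separately for each category, can be illustrated with a small sketch. The abstract does not say which association measure or extraction procedure the thesis uses, so everything here is an assumption for illustration: pointwise mutual information (PMI) over adjacent word pairs stands in for the measure, and the function names, the toy corpus, and the `min_count` and `min_pmi` parameters are hypothetical. The clustering step is not shown.

```python
import math
from collections import Counter


def collocation_scores(docs, min_count=2):
    """Score adjacent word pairs by PMI (an assumed stand-in measure).

    PMI > 0 means the pair occurs more often than it would if the two
    words were independent.
    """
    unigrams, bigrams = Counter(), Counter()
    for doc in docs:
        tokens = doc.lower().split()
        unigrams.update(tokens)
        # Pair each token with its successor, within one document only.
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs too rare to be "recurrent"
        p_pair = count / n_bi
        p_chance = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = math.log2(p_pair / p_chance)
    return scores


def collocations_by_category(docs_by_category, min_count=2, min_pmi=0.0):
    """Extract a separate collocation set from each category's documents."""
    return {
        category: {
            pair
            for pair, score in collocation_scores(docs, min_count).items()
            if score > min_pmi
        }
        for category, docs in docs_by_category.items()
    }


# Toy corpus, invented for this example.
corpus = {
    "finance": [
        "interest rate rises again",
        "the interest rate falls",
        "bank sets interest rate",
    ],
    "sports": [
        "home run wins game",
        "a home run again",
        "late home run",
    ],
}

features = collocations_by_category(corpus)
for category, pairs in features.items():
    print(category, sorted(pairs))
```

Because the counts are taken per category, each category ends up with its own collocation set, which is the category-dependent property the abstract describes.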

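Additional Bibliographic Information

Call number: MCS 97039
Physical description: iv, 40 p. : ill. ; 26 cm
Language: Korean
General notes: Author's name in English: Byung-Gyu Chang
Advisor's name in Korean: 김길창
Advisor's name in English: Gil-Chang Kim
Thesis note: Thesis (Master's) - Korea Advanced Institute of Science and Technology : Department of Computer Science
Bibliography note: References: p. 37-40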