Bibliographic Information
문단단위를 이용한 효과적인 문서범주화 = An effective text categorization method using passages
Title / Author: 문단단위를 이용한 효과적인 문서범주화 = An effective text categorization method using passages / Jin-Suk Kim (김진숙).
Publication: [Daejeon : Korea Advanced Institute of Science and Technology, 2002].

Holdings

Registration number: 8013568
Location / Call number: Academic Cultural Complex (학술문화관), preservation stacks / MCS 02046
Status: Available (not for loan)


Abstract

Although automated text categorization into topical categories has a long history, dating back to the 1960s, its target documents have been confined to short texts such as abstracts and newswire. However, the increasing length of documents in full-text collections and on the World-Wide Web has renewed interest in classifying long documents into proper categories. This thesis proposes a new text categorization model, passage-based automated text categorization. By contrast, traditional text categorization systems can be called document-based, since past research on automated text categorization used the whole document as the categorization unit. The passage-based model instead divides the test document into passages and uses them as categorization units. The test document's categories are then reconstructed by merging the categories assigned to its passages. Experiments were conducted with passages based on overlapping fixed-length windows. When the passage-based model was applied, on top of a kNN (k-Nearest Neighbor) classifier, to the longer documents in subsets of the Reuters-21578 text categorization test collection, categorization efficiency increased significantly. This implies that passage-based text categorization can be used as a categorization method for full-text collections.
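
The pipeline described in the abstract (split the test document into overlapping fixed-length windows, categorize each passage with kNN, then merge the passage categories) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the function names, the window and stride sizes, the raw term-count cosine similarity, the one-category-per-passage vote, and the use of a plain set union as the merge rule are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def split_into_passages(tokens, window=50, stride=25):
    """Cut a token list into overlapping fixed-length windows (sizes are assumed)."""
    if len(tokens) <= window:
        return [tokens]
    passages = []
    start = 0
    while True:
        passages.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
        start += stride
    return passages


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_category(passage, training, k=3):
    """Pick one category for a passage by similarity-weighted kNN voting.

    `training` is a list of (tokens, set_of_categories) pairs.
    Returns None when no training document shares a term with the passage.
    """
    query = Counter(passage)
    scored = [(cosine(query, Counter(tokens)), cats) for tokens, cats in training]
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for similarity, cats in nearest:
        for category in cats:
            votes[category] += similarity
    if not votes or max(votes.values()) == 0.0:
        return None
    return max(votes, key=votes.get)


def categorize_document(tokens, training, window=50, stride=25, k=3):
    """Merge passage-level categories into document categories (assumed rule: union)."""
    categories = set()
    for passage in split_into_passages(tokens, window, stride):
        label = knn_category(passage, training, k)
        if label is not None:
            categories.add(label)
    return categories


# Toy usage with made-up training data (not taken from Reuters-21578).
training = [
    ("wheat grain harvest crop".split(), {"grain"}),
    ("oil crude barrel opec".split(), {"crude"}),
    ("bank interest rate loan".split(), {"interest"}),
]
document = "wheat harvest crop grain wheat crop oil crude barrel opec oil barrel".split()
print(sorted(categorize_document(document, training, window=6, stride=3, k=1)))
# ['crude', 'grain']
```

In this toy run, a two-topic document picks up both categories because each topic dominates at least one window. That is the intuition behind using passages for long documents, although the actual merging strategy and term weighting used in the thesis may differ.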

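Additional Bibliographic Information

Call number: MCS 02046
Physical description: ii, 46 p. : illustrations ; 26 cm
Language: Korean
General notes: Author's name in English: Jin-Suk Kim
Advisor's name in Korean: 김명호
Advisor's name in English: Myoung-Ho Kim
Thesis: Thesis (Master's) - Korea Advanced Institute of Science and Technology : Computer Science
Bibliography note: References: p. 44-46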