Main bibliographic information
자질 빈도의 분포 정보를 이용한 단어 의미 분별 = Word sense disambiguation using distribution information of feature frequency
Title / Author: 자질 빈도의 분포 정보를 이용한 단어 의미 분별 = Word sense disambiguation using distribution information of feature frequency / 오은정 (Eun-Jung Oh).
Publication: [Daejeon : Korea Advanced Institute of Science and Technology, 2004].

Holdings information

Registration number: 8015273
Location / call number: Academic Cultural Complex (Culture Hall), preservation stacks / MCS 04027
Status: Available (not for loan)

Abstract

This thesis proposes a method for disambiguating the senses of ambiguous words using distribution information of feature frequency. Corpus-based methods, especially supervised learning methods, have recently been widely studied for Word Sense Disambiguation (WSD). The corpus-based methods that perform well learn word senses from features extracted from the context of ambiguous words. They use feature frequency directly to weight features, under the assumption that all features occur independently. However, if the frequency of features is known to follow a certain probability distribution, that distribution information can be used for WSD. We assume that feature frequency follows a 2-Poisson distribution: one distribution for the correct sense and one for the incorrect senses. We then apply this distribution information to the feature weighting function. Because the parameters of the 2-Poisson distribution are difficult to estimate, however, we use a simplified feature weighting function. To apply the 2-Poisson distribution to word sense disambiguation, we collect the distribution information of features separately for topical features and local features. We then complete the feature weighting function by taking feature normalization into account. Finally, we add feature location information by giving more weight to features extracted from the same sentence as the ambiguous word. The experimental results show that distribution information of feature frequency is useful for word sense disambiguation. Compared with other systems that do not use distribution information of feature frequency, the proposed method achieves the highest result.
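
The 2-Poisson idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual weighting function, so everything below is an assumption for illustration only: the function names, the parameter values, the saturating form `tf / (k1 + tf)` (a common simplification of 2-Poisson weighting in information retrieval, not necessarily the one used in the thesis), and the `boost` factor for same-sentence features are all hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k: int, pi: float, lam_correct: float, lam_incorrect: float) -> float:
    """Mixture of two Poissons: one for the correct sense, one for the incorrect senses.

    pi is the (assumed) prior probability that a context belongs to the correct sense.
    """
    return pi * poisson_pmf(k, lam_correct) + (1.0 - pi) * poisson_pmf(k, lam_incorrect)


def posterior_correct(k: int, pi: float, lam_correct: float, lam_incorrect: float) -> float:
    """Posterior probability of the correct sense given a feature frequency k (Bayes' rule)."""
    return pi * poisson_pmf(k, lam_correct) / two_poisson_pmf(k, pi, lam_correct, lam_incorrect)


def saturating_weight(tf: float, k1: float = 1.2) -> float:
    """Hypothetical simplified weight: grows with frequency tf but levels off below 1.

    This avoids estimating the mixture parameters directly; k1 is an assumed constant.
    """
    return tf / (k1 + tf)


def location_boost(weight: float, same_sentence: bool, boost: float = 2.0) -> float:
    """Give more weight to a feature found in the same sentence as the ambiguous word.

    The boost factor is an illustrative assumption, not a value from the thesis.
    """
    return weight * boost if same_sentence else weight


if __name__ == "__main__":
    # Assumed toy parameters: the feature is frequent with the correct sense, rare otherwise.
    for tf in range(6):
        p = posterior_correct(tf, pi=0.3, lam_correct=3.0, lam_incorrect=0.5)
        w = location_boost(saturating_weight(tf), same_sentence=True)
        print(tf, round(p, 3), round(w, 3))
```

With these toy parameters, a higher feature frequency makes the correct-sense component more probable, and the saturating weight mimics that behavior without fitting the mixture.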

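Other bibliographic information

Call number: MCS 04027
Physical description: vi, 50 p. : illustrations ; 26 cm
Language: Korean
General notes: Author name in English: Eun-Jung Oh
Advisor name in Korean: 김길창
Advisor name in English: Gil-Chang Kim
Thesis: Master's thesis - Korea Advanced Institute of Science and Technology : Computer Science
Bibliography note: References: p. 47-50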