Bibliographic information
한국어 문서에서 개체명 인식에 관한 연구 = Study on named entity recognition in Korean text
Title / Author: 한국어 문서에서 개체명 인식에 관한 연구 = Study on named entity recognition in Korean text / 이경희.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology (KAIST), 2000].

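Holdings

Registration no.: 8010579
Location / call number: Academic Cultural Complex (학술문화관), preservation stacks / MCS 00040
Status: Available (not for loan)

Registration no.: 9006471
Location / call number: Seoul, thesis shelves / MCS 00040 c. 2
Status: Available (not for loan)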

Abstract

In information extraction systems, identifying named entities is essential, because they supply the knowledge to be extracted. These names are hard to identify, however: many are unknown words, so a strategy of listing candidates in advance does not work. It is also sometimes hard to determine the category of a named entity, for example to distinguish a person name from a company name. For English, several rule-based studies report good performance using special dictionaries and limited contextual information. Korean, unlike English, has no character-type information (for example, capitalization), which makes it difficult to recognize named-entity candidates and to determine their boundaries. This thesis proposes a rule-based method for recognizing named entities, specifically person names, organization names, and locations, in Korean texts. The method uses various dictionaries (proper names, prefixes, suffixes, verb subcategorization, and so on) and consists of four stages, organized by the type of information used. The first stage uses information inside an ejeol (a space-delimited word unit). The second stage considers limited contextual information about surrounding words. The third stage uses verb subcategorization information to disambiguate the categories of named entities. The last stage uses information between named entities to merge the named entities recognized in the previous stages. Various experiments were conducted, and the contribution of each stage was evaluated. The experimental results show 90.4% precision and 83.4% recall on Korean news articles.
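
The four-stage organization described in the abstract can be illustrated with a small Python sketch. This is not the thesis's implementation: the dictionaries, the particle list, the rules, and every function name below are hypothetical toy stand-ins, and the merging rule in the last stage (joining adjacent entities of the same category) is only one possible reading of "information between named entities".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy resources; the thesis's actual dictionaries are not reproduced here.
PROPER_NAMES = {"서울": "LOC"}
SUFFIXES = {"대학교": "ORG", "구": "LOC"}        # entity-final clue morphemes
TITLE_WORDS = {"교수", "사장"}                    # a following title hints at a person
VERB_SUBJECT_CATEGORY = {"발표했다": "ORG"}       # toy verb subcategorization
PARTICLES = ("에서", "은", "는", "이", "가", "을", "를")  # longest listed first
SUBJECT_PARTICLES = ("은", "는", "이", "가")


@dataclass
class Token:
    surface: str                    # the ejeol as written
    stem: str                       # the ejeol with a trailing particle removed
    category: Optional[str] = None  # PER, ORG, LOC, or None


def strip_particle(ejeol: str) -> str:
    """Naively remove one trailing particle (an assumption, not real morphology)."""
    for particle in PARTICLES:
        if ejeol.endswith(particle) and len(ejeol) > len(particle):
            return ejeol[: -len(particle)]
    return ejeol


def stage1_inside_ejeol(tokens: List[Token]) -> None:
    """Stage 1: use only information inside each ejeol (dictionary and suffix)."""
    for tok in tokens:
        if tok.stem in PROPER_NAMES:
            tok.category = PROPER_NAMES[tok.stem]
            continue
        for suffix, category in SUFFIXES.items():
            if tok.stem.endswith(suffix) and len(tok.stem) > len(suffix):
                tok.category = category
                break


def stage2_context(tokens: List[Token]) -> None:
    """Stage 2: use a surrounding word; here, a title right after the candidate."""
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur.category is None and nxt.stem in TITLE_WORDS:
            cur.category = "PER"


def stage3_verb_subcat(tokens: List[Token]) -> None:
    """Stage 3: let the sentence-final verb suggest the subject's category."""
    if not tokens:
        return
    expected = VERB_SUBJECT_CATEGORY.get(tokens[-1].surface)
    if expected is None:
        return
    for tok in tokens[:-1]:
        is_subject = tok.stem != tok.surface and tok.surface.endswith(SUBJECT_PARTICLES)
        if tok.category is None and is_subject:
            tok.category = expected


def stage4_merge(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Stage 4: merge adjacent entities that share a category (an assumed rule)."""
    entities: List[Tuple[str, str]] = []
    prev_index = None
    for i, tok in enumerate(tokens):
        if tok.category is None:
            continue
        if entities and prev_index == i - 1 and entities[-1][1] == tok.category:
            entities[-1] = (entities[-1][0] + " " + tok.stem, tok.category)
        else:
            entities.append((tok.stem, tok.category))
        prev_index = i
    return entities


def recognize(sentence: str) -> List[Tuple[str, str]]:
    """Run the four stages in order over whitespace-separated ejeols."""
    tokens = [Token(e, strip_particle(e)) for e in sentence.split()]
    stage1_inside_ejeol(tokens)
    stage2_context(tokens)
    stage3_verb_subcat(tokens)
    return stage4_merge(tokens)
```

With these toy resources, `recognize("김철수 교수는 서울대학교에서 강연했다")` labels 김철수 as a person through the following title (stage 2) and 서울대학교 as an organization through its suffix (stage 1). Running the stages in a fixed order means later stages only fill in what earlier, more local evidence left undecided, which is one way to measure each stage's contribution separately.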

Additional bibliographic information

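Call number: MCS 00040
Physical description: iii, 37 p. : illustrations ; 26 cm
Language: Korean
General notes: Author's name in English: Kyung-Hee Lee
Advisor's name in Korean: 김길창
Advisor's name in English: Gil-Chang Kim
Thesis note: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Computer Science
Bibliography note: References: p. 35-37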