Main Bibliographic Information
Adaptive information retrieval for automatically indexed queries and documents = 자동으로 색인된 질의와 문서에 적응하는 정보 검색
Title / Author: Adaptive information retrieval for automatically indexed queries and documents = 자동으로 색인된 질의와 문서에 적응하는 정보 검색 / Won-Yong Kim.
Publication: [Daejeon : Korea Advanced Institute of Science and Technology, 1998].

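Holdings

Registration number: 8009246
Location / Call number: Academic Cultural Complex (Cultural Center), preservation stacks / DCS 98013
Item status: Available (not for loan)

Registration number: 9005072
Location / Call number: Seoul thesis shelves / DCS 98013 c.2
Item status: Available (not for loan)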

Abstract

There are three problems in searching for relevant documents: noisy descriptors, the vocabulary gap between documents and a given query, and the differing importance of query descriptors. Previous probabilistic retrieval models rank documents by considering only the differing importance of query descriptors. They ignore the other two problems because it is difficult to obtain knowledge appropriate to a particular application and to use that knowledge correctly to reduce the three problems. This thesis first proposes a general ranking function that can correctly handle all three problems. However, the function is too complex for a practical information retrieval system to use for effective and efficient document retrieval. The general ranking function is simplified substantially under the assumption of certainty indexing, i.e., binary indexing. The complexity of the simplified ranking function is reduced further by the Faithful User Assumption (FUA), which states that a relevant document contains all the concepts represented by a query. A learning method to reduce the three problems is derived formally from FUA. Each time retrieval results become available, it updates the knowledge about the importance of query descriptors and about the relationships between query descriptors and other descriptors. Noise descriptors are also defined in this thesis. Retrieval using the simplified ranking function and the proposed learning method is called Faithful User Retrieval (FUR) under certainty indexing.

The effect of the incrementally constructed knowledge and of noise query descriptors is investigated through experiments with FUR under certainty indexing and with the earlier probabilistic ranking model BIR. When the distributions of query descriptors in relevant documents for past queries cannot be obtained, the retrieval effectiveness of FUR is comparable to that of BIR. When the distributions become available, both models improve. The improvement of FUR is about 10% greater than that of BIR in most document collections. The experimental results also show that many of the query descriptors extracted automatically from natural language queries have no effect on retrieval quality. In other words, they are noise for retrieval purposes. They only increase query processing time without supporting effective document ranking.

To obtain further improvements in retrieval effectiveness, the vocabulary gap problem is examined. Query descriptors are expanded with their analogues to alleviate the problem. The analogues of a descriptor identified by the proposed learning method reflect the contexts in which the descriptor has appeared. Since a broad query descriptor relates to many contexts, expansion with its analogues may make the query cover many contexts that differ from the user's intention. Conversely, the analogues of narrow query descriptors may clarify the contexts of the query, because a narrow query descriptor occurs in highly correlated contexts. Hence, expanding only narrow query descriptors with their analogues can improve retrieval effectiveness further, which is confirmed experimentally.

Uncertainty indexing estimates, for each descriptor in a document, the probability of correct indexing, i.e., the probability that a human indexer would attach this descriptor to the document. A ranking function and a learning method suitable for uncertainty indexing have been developed. The ranking function is another simplified version of the general ranking function, and the learning method developed for certainty indexing is modified for uncertainty indexing. Retrieval based on both of them is called FUR under uncertainty indexing. The experimental results show that FUR under uncertainty indexing is superior to both BIR and FUR under certainty indexing.
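For orientation, the baseline named in the abstract, BIR (binary independence retrieval), can be sketched as follows. This is a minimal illustration of the standard textbook BIR model with Robertson/Sparck Jones relevance weights and 0.5 smoothing, not the thesis's FUR ranking function, whose formula the abstract does not give. The function names `rsj_weight` and `bir_rank`, the toy collection, and the choice of smoothing are assumptions made for this example.

```python
import math
from typing import Dict, List, Set


def rsj_weight(n_rel_with: int, n_rel: int, n_docs_with: int, n_docs: int) -> float:
    """Robertson/Sparck Jones weight of one query descriptor (0.5 smoothing).

    n_rel_with  : known relevant documents containing the descriptor
    n_rel       : known relevant documents
    n_docs_with : documents in the collection containing the descriptor
    n_docs      : documents in the collection
    """
    odds_relevant = (n_rel_with + 0.5) / (n_rel - n_rel_with + 0.5)
    odds_other = (n_docs_with - n_rel_with + 0.5) / (
        n_docs - n_docs_with - n_rel + n_rel_with + 0.5
    )
    return math.log(odds_relevant / odds_other)


def bir_rank(query: Set[str], docs: Dict[str, Set[str]], relevant: Set[str]) -> List[str]:
    """Rank documents under binary (certainty) indexing.

    Each document is a set of descriptors. `relevant` holds the names of
    documents already judged relevant (hypothetical feedback data).
    """
    n_docs = len(docs)
    n_rel = len(relevant)
    weights = {}
    for term in query:
        n_docs_with = sum(1 for descriptors in docs.values() if term in descriptors)
        n_rel_with = sum(1 for name in relevant if term in docs[name])
        weights[term] = rsj_weight(n_rel_with, n_rel, n_docs_with, n_docs)
    # A document's score is the sum of the weights of the query descriptors it contains.
    scores = {
        name: sum(weights[t] for t in query if t in descriptors)
        for name, descriptors in docs.items()
    }
    # Highest score first; ties are broken by document name for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Toy collection (invented for illustration only).
docs = {
    "d1": {"retrieval", "ranking"},
    "d2": {"retrieval"},
    "d3": {"cooking"},
    "d4": {"ranking", "cooking"},
}
print(bir_rank({"retrieval", "ranking"}, docs, relevant={"d1"}))
```

Note that the score depends only on the weights of the query descriptors a document contains, which corresponds to the abstract's remark that earlier probabilistic models consider only the differing importance of query descriptors.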

Other Bibliographic Information
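Call number: DCS 98013
Physical description: [108] p. : ill. ; 26 cm
Language: English
General note: Author's name in Korean: 김원용
Advisor's name in English: Yoon-Joon Lee
Advisor's name in Korean: 이윤준
Published in: "Probabilistic Retrieval Incorporating the Relationships of Descriptors Incrementally". Information Processing and Management
Thesis: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Computer Science
Bibliography note: References: p. 101-108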