Word sense disambiguation is the task of determining which sense of an ambiguous word is invoked in a particular use of that word; it is useful for improving the accuracy of machine translation and information retrieval. Various language resources have been applied to train word sense disambiguation systems.
This thesis studies word sense disambiguation for Korean nouns using contextual information. The contextual information extracted from the context (a bag of nouns, verbs, and adjectives) is converted into a sense vector with SVD (Singular Value Decomposition), where the axes of the vector are the words appearing in the training data. The sense of a test instance is selected by comparing the cosine similarity between the vectors built from the training data and the vector built from the test data.
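The vector-based selection step can be illustrated with a minimal sketch. It assumes a raw-count bag-of-words matrix, a rank-k truncation of the SVD, and one averaged vector per sense; the thesis may weight, truncate, or aggregate differently. The function names, the value k=2, and the toy contexts (English glosses for the two senses "pear" and "ship" of the Korean noun 배) are hypothetical and used only for illustration.

```python
import numpy as np

def bag_vector(context, vocab):
    """Count vector over the training vocabulary; unseen words are ignored."""
    v = np.zeros(len(vocab))
    for w in context:
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def train(contexts, senses, k=2):
    """Build a rank-k SVD basis and one averaged vector per sense (assumed design)."""
    words = sorted({w for ctx in contexts for w in ctx})
    vocab = {w: i for i, w in enumerate(words)}
    X = np.array([bag_vector(c, vocab) for c in contexts])  # instances x words
    # X = U S Vt; the top-k rows of Vt span the reduced word space.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T                                        # words x k
    reduced = X @ basis                                     # instances x k
    sense_vecs = {}
    for s in set(senses):
        rows = [i for i, t in enumerate(senses) if t == s]
        sense_vecs[s] = reduced[rows].mean(axis=0)
    return vocab, basis, sense_vecs

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def disambiguate(context, vocab, basis, sense_vecs):
    """Project the test context and return the sense with the highest cosine."""
    q = bag_vector(context, vocab) @ basis
    return max(sense_vecs, key=lambda s: cosine(q, sense_vecs[s]))

# Toy training data (hypothetical), as bags of content words.
train_contexts = [
    ["fruit", "eat", "sweet"],
    ["fruit", "tree", "ripe"],
    ["sea", "sail", "harbor"],
    ["sea", "captain", "sail"],
]
train_senses = ["pear", "pear", "ship", "ship"]

vocab, basis, sense_vecs = train(train_contexts, train_senses, k=2)
print(disambiguate(["eat", "ripe", "fruit"], vocab, basis, sense_vecs))
print(disambiguate(["captain", "harbor"], vocab, basis, sense_vecs))
```

Averaging the reduced vectors per sense is only one way to form a sense vector; comparing the test vector against every individual training vector and taking the nearest one would be an equally plausible reading of the description above.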
This thesis also proposes methods of using hypernym information extracted from a dictionary and a thesaurus. Using simple patterns in sense definitions and the morphological features of a Korean dictionary, we extract hypernym information for Korean nouns. In applying hypernym information from the thesaurus, we address two problems: resolving the ambiguity among the readings of a hypernym, and deciding how the hypernym information should be applied.
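As a rough illustration of pattern-based hypernym extraction from definitions, the sketch below makes two assumptions that are not taken from the thesis: that a definition of the form "X의 하나" ("one of X") or "X의 일종" ("a kind of X") names X as the hypernym, and that otherwise the final word of the definition serves as the hypernym candidate, since Korean definitions are typically head-final. The function name, the patterns, and the example definitions are hypothetical; a real system would use a morphological analyzer rather than whitespace tokens.

```python
import re

# Assumed patterns: "X의 하나" (one of X) and "X의 일종" (a kind of X).
KIND_OF = re.compile(r"(\S+)의\s*(?:하나|일종)\.?$")

def extract_hypernym(definition):
    """Return a hypernym candidate for a dictionary sense definition, or None."""
    text = definition.strip()
    match = KIND_OF.search(text)
    if match:
        return match.group(1)
    # Fallback (assumption): the last word of a head-final definition.
    tokens = text.rstrip(".").split()
    return tokens[-1] if tokens else None

# Illustrative definitions written for this sketch.
print(extract_hypernym("배나무의 열매."))             # "fruit of the pear tree"
print(extract_hypernym("악기의 하나."))               # "one of the musical instruments"
print(extract_hypernym("물 위를 떠다니는 탈것의 일종."))  # "a kind of vehicle that floats on water"
```

The fallback returns the surface form of the last word, so any attached particle or ending would have to be removed by morphological analysis before the candidate is looked up in a thesaurus.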
Through this process, contextual information is transformed into more useful and more structured information, and our method reaches up to 60% accuracy.