The purpose of an automatic language identification (LID) system is to identify the language of a speech utterance among many candidate languages. As the world becomes more globalized, the need for multilingual processing systems is growing in many areas, and the importance of LID as a key technology is growing with it. Since LID research began, many researchers have tried to build better LID systems. The central concern of this research has been to extract useful and efficient information from spoken language that discriminates one language from another, and to model that information. At present, the state-of-the-art approach to LID uses phone recognizers together with phonotactic language models. Many studies have reported good results with this approach, but it also has problems. One of the most serious is the poor performance of the phone recognizer, which creates many obstacles to improving the performance of the LID system.
In this thesis, two methods to compensate for this problem are proposed. The first is to build a phonotactic language model for vowels only and to combine it with the original phonotactic language model. The second is to build tables of weighting values using information theory and to apply these tables when the log-likelihood is computed under the language models. The vowel phonotactic language model is built only from the vowels extracted from the phone strings produced by the phone recognizer. It is built in the same way as the other language models, except that only vowels are used. The information-theoretic weighting values are applied solely to increase the discriminability between language models. Experiments show that both methods are suitable for improving the performance of the overall LID system. This research also indicates that one of the most important factors in improving system performance is raising the reliability of the phone recognition results.
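
As an illustration of the first method, the following is a minimal sketch rather than the implementation used in this thesis. It assumes bigram models with add-one smoothing, a hypothetical five-symbol vowel inventory, and a linear combination of the two log-likelihoods controlled by a hypothetical weight `alpha`; none of these choices is stated above. The information-theoretic weighting tables are not sketched, since their construction is not specified here.

```python
import math
from collections import Counter

# Hypothetical vowel inventory; the real set depends on the phone recognizer.
VOWELS = {"a", "e", "i", "o", "u"}


def extract_vowels(phones):
    """Keep only the vowel symbols of a recognized phone string."""
    return [p for p in phones if p in VOWELS]


class BigramModel:
    """Bigram phonotactic model with add-one smoothing (an assumption)."""

    def __init__(self, sequences, vocab):
        self.vocab_size = len(vocab)
        self.bigrams = Counter()
        self.contexts = Counter()
        for seq in sequences:
            padded = ["<s>"] + list(seq)  # "<s>" marks the utterance start
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_likelihood(self, seq):
        """Sum of smoothed log bigram probabilities over the sequence."""
        padded = ["<s>"] + list(seq)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
        return total


def train(training_data, phone_vocab):
    """training_data maps a language name to a list of phone strings."""
    models = {}
    for lang, seqs in training_data.items():
        full = BigramModel(seqs, phone_vocab)
        # Same procedure as the full model, but on vowels only.
        vowel = BigramModel([extract_vowels(s) for s in seqs], VOWELS)
        models[lang] = (full, vowel)
    return models


def identify(phones, models, alpha=0.5):
    """Return the best-scoring language and the combined scores."""
    vowels = extract_vowels(phones)
    scores = {
        lang: (1 - alpha) * full.log_likelihood(phones)
        + alpha * vowel.log_likelihood(vowels)
        for lang, (full, vowel) in models.items()
    }
    return max(scores, key=scores.get), scores
```

In this sketch, `train` would be called once with recognizer output for each language, and `identify` would then score the phone string of a test utterance under every language's pair of models. Setting `alpha` to 0 reduces it to the original phonotactic model alone.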