Topic evolution analyzes how topics change across sequentially organized documents, and it relies on topic modeling methods to extract topics from the document collection. Traditional approaches built on commonly observed word profiles cannot distinguish a change in the representative terms of a topic from a change in the topic itself, because the same word similarity measures both. As a result, topic evolution based on them is ineffective at analyzing correlations and interconnections between different topics. The author proposes a novel approach to topic evolution that introduces alternative topic models. These models define topics as the shared research topics of common interest author (CIA) groups, which are generated from multitudes of shared research activities, giving the topic models adaptability to different research patterns. Because of the CIA groups, the proposed topic models associate an author group with each topic, so topics can be compared through their authors instead of their words. Capturing the proportional transitions of authors between CIA groups over time amounts to identifying topic flow over time, which lets topic evolution based on the proposed topic models incorporate topic correlation analysis in the form of merge and split detection. Bibliographic records from the Microsoft Academic Graph dataset are used to demonstrate the applicability of the proposed topic evolution approach. The results indicate that the proposed alternative topic models can model coherent topics using only the metadata of the document set. Connecting topics through CIA groups captures complex evolutionary events, such as merges and splits between topics that are non-adjacent in the timeline, which represent the gradual evolution of topics. The summation of such events charts a map of interconnected topical evolutions distinctive to the topic models, allowing long-term topic evolution analysis of the given research field from multiple perspectives.
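Topic evolution aims to observe how topics extracted by topic modeling methods from a time-ordered document collection evolve. Topic evolution based on existing topic modeling methodologies measures both changes in a topic itself and changes in a topic's contents with the same word similarity, which makes the two kinds of change hard to distinguish and limits the analysis to how the contents of a given topic change. In contrast, this dissertation presents topic modeling based on common interest authors and, using it, demonstrates through examples that associations between different topics can be observed in addition to the content-change analysis that was already possible.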
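The idea of reading merges and splits from proportional author transitions between CIA groups can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `transition_proportions` and `detect_events`, the representation of a CIA group as a plain set of author identifiers, the choice to normalize by the size of the earlier group, and the `threshold` value of 0.3 are all assumptions made for illustration.

```python
from collections import defaultdict


def transition_proportions(earlier, later):
    """Map (source, target) to the fraction of the source group's authors
    that also appear in the target group (hypothetical helper)."""
    flows = {}
    for src, src_authors in earlier.items():
        if not src_authors:
            continue  # an empty group has no authors to transition
        for dst, dst_authors in later.items():
            shared = src_authors & dst_authors
            if shared:
                flows[(src, dst)] = len(shared) / len(src_authors)
    return flows


def detect_events(flows, threshold=0.3):
    """Label merges and splits from author flows.

    The threshold is an assumed cut-off, not a value from the thesis.
    """
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for (src, dst), proportion in flows.items():
        if proportion >= threshold:
            outgoing[src].append(dst)
            incoming[dst].append(src)
    # A split: one earlier group feeds several later groups.
    splits = {s: sorted(d) for s, d in outgoing.items() if len(d) > 1}
    # A merge: one later group is fed by several earlier groups.
    merges = {d: sorted(s) for d, s in incoming.items() if len(s) > 1}
    return merges, splits


# Toy CIA groups for two time slices (made-up author identifiers).
earlier = {"A": {"a1", "a2", "a3", "a4"}, "B": {"b1", "b2"}}
later = {"C": {"a1", "a2", "b1", "b2"}, "D": {"a3", "a4", "x"}}

flows = transition_proportions(earlier, later)
merges, splits = detect_events(flows)
print(merges)  # {'C': ['A', 'B']}
print(splits)  # {'A': ['C', 'D']}
```

Because the comparison uses only author sets, the two time slices passed in do not need to be adjacent, which is one way to picture how events between topics that are non-adjacent in the timeline could be captured.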