Korea Advanced Institute of Science and Technology (KAIST) Library

Main bibliographic information
Development of dimension reduction algorithms for collaborative filtering = 협업적 필터링을 위한 차원감소 알고리즘의 개발
Title / Author	Development of dimension reduction algorithms for collaborative filtering = 협업적 필터링을 위한 차원감소 알고리즘의 개발 / Do-Hyun Kim.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2007].

Holdings
Registration number	8017985
Location	Academic Cultural Complex (Cultural Hall), preservation stacks
Call number	DIE 07001
Status	Available (not for loan)

Abstract

Collaborative filtering (CF) is one of the most popular recommender system technologies. It uses the known preferences of a group of users to predict the unknown preferences of a new user. However, existing CF techniques have the drawback that they require the entire existing data set to be maintained and reanalyzed whenever new user ratings are added. To avoid this problem and improve computational efficiency, this thesis develops three dimensionality reduction algorithms for CF. An earlier approach to overcoming this drawback, Eigentaste, is based on principal component analysis (PCA). However, Eigentaste requires that each user rate every item in a so-called gauge set before PCA can be executed, which may not always be feasible in practice. The first study in this thesis develops an iterative PCA approach in which no gauge set is required. The developed approach simultaneously estimates the missing values and determines the principal components (PCs) using singular value decomposition (SVD). The PC values of users in the reduced dimension are then used for clustering users. The developed approach and Eigentaste, each combined with two clustering methods, are compared in terms of the mean absolute error (MAE) of prediction using three real data sets. Computational results indicate that the prediction accuracy of the proposed approach does not deteriorate even without a gauge set. The proposed approach may therefore be considered a useful alternative when it is neither possible nor practical to define a gauge set. The iterative PCA approach using SVD, however, takes a considerable amount of time and space to estimate a new user's missing ratings. To alleviate this problem, the second study considers two SVD update methods as possible alternatives: the method of Zha and Simon and the folding-in method. These alternatives are compared in terms of both the MAE and computational time using a real data set.
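The iterative PCA idea described above (estimating missing ratings and principal components together with SVD) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function name `iterative_svd_impute`, the column-mean starting values, the fixed rank, and the stopping rule are all assumptions chosen for illustration.

```python
import numpy as np

def iterative_svd_impute(ratings, rank=2, max_iter=100, tol=1e-6):
    """Fill NaN entries of a user-by-item matrix by repeated low-rank SVD.

    Hypothetical helper: name, defaults, and stopping rule are assumptions.
    """
    R = np.asarray(ratings, dtype=float)
    missing = np.isnan(R)
    # Assumed starting point: item (column) means of the observed ratings.
    col_means = np.nanmean(R, axis=0)
    filled = np.where(missing, col_means, R)
    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        # Rank-limited reconstruction of the centered matrix, plus the mean.
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mean
        # Only missing cells are overwritten; observed ratings stay fixed.
        updated = np.where(missing, approx, R)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    # PC scores of each user in the reduced space, from the final fill.
    mean = filled.mean(axis=0)
    U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
    scores = U[:, :rank] * s[:rank]
    return filled, scores, Vt[:rank]

# Toy ratings matrix (made-up values); NaN marks an unrated item.
toy = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, 4.0, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [1.0, np.nan, 1.0, 4.0],
    [np.nan, 1.0, 5.0, 4.0],
])
filled, scores, components = iterative_svd_impute(toy, rank=2)
print(filled.round(2))
print(scores.round(2))
```

The returned `scores` play the role of the users' PC values, which could then be passed to any clustering routine.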
The experimental results show that the SVD update method of Zha and Simon performs better than the folding-in method. Since the ratings data for CF reflect the many-sided interests of many users and may therefore have nonlinear dependencies, the third study develops a dimension reduction technique based on nonlinear PCA. The proposed method also updates the local PCs and the local mean in each cluster whenever a new user enters the system, and the updated local PCs and local mean are then used for subsequent new users. The experimental results reveal that the MAE of this LLPCA approach decreases as the number of new users increases, owing to the updating of the local PCs and the local mean. This is a desirable result, since it implies that the developed approach can be applied even if a large number of new users enter the system continuously. Finally, the three approaches developed are compared in terms of the MAE and computational time. From experiments using prediction accuracy and computational time as the comparative criteria, it is concluded that the method of Zha and Simon is the best dimension reduction technique for CF. In addition, the LLPCA method would be reasonable to apply to systems in which new users enter continuously. The CF techniques developed in this thesis enable one to predict the missing ratings of new users in real time. As such, the approaches can be applied to a variety of e-commerce sites.
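For readers unfamiliar with folding-in, the following is a minimal sketch of the general technique: a new user's rating row is projected onto the item factors of an existing SVD, with no recomputation of the decomposition. It is not the thesis's implementation. The function name `fold_in_user`, the item-mean centering, and the treatment of unrated items as zero deviation from the item mean are assumptions. The coordinates returned are the user's PC scores, that is, the classical folding-in coordinates multiplied by the singular values.

```python
import numpy as np

def fold_in_user(new_ratings, Vt_k, item_means):
    """Project one new user's rating row onto existing item factors.

    Hypothetical helper. Unrated items (NaN) are treated as equal to the
    item mean before projection; that fill rule is an assumption.
    """
    r = np.asarray(new_ratings, dtype=float)
    missing = np.isnan(r)
    # Center observed ratings; unrated items contribute zero deviation.
    centered = np.where(missing, 0.0, r - item_means)
    # Coordinates of the new user in the existing k-dimensional space.
    coords = centered @ Vt_k.T
    # Map back to the rating scale to predict the unrated items.
    predicted = coords @ Vt_k + item_means
    return coords, np.where(missing, predicted, r)

# Toy data (made-up values): four existing users, four items, fully rated.
existing = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])
item_means = existing.mean(axis=0)
_, _, Vt = np.linalg.svd(existing - item_means, full_matrices=False)
Vt_k = Vt[:2]  # keep two components; the rank is an arbitrary choice here

coords, full_row = fold_in_user([5.0, np.nan, 1.0, np.nan], Vt_k, item_means)
print(coords.round(2))
print(full_row.round(2))
```

Because the existing factors are left untouched, folding-in is cheap, but the factors never absorb information from new users. Update methods such as that of Zha and Simon instead revise the decomposition itself.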

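This thesis addresses dimension reduction techniques for collaborative filtering, a type of recommender system. Collaborative filtering makes recommendations on the assumption that users who showed similar tendencies in selecting items in the past will show similar tendencies for other items as well. However, existing collaborative filtering has the problem that, whenever a new user appears, all preference scores of the existing customers must be used to predict the new user's preference scores. To overcome this problem, Goldberg proposed a method called Eigentaste, based on principal component analysis. Eigentaste, however, presupposes that every customer is required to rate a fixed set of items, the gauge items, which is difficult in practice. The first study therefore develops a recommender system based on iterative principal component analysis (iterative PCA), in which customers are not forced to rate particular items. The developed method uses singular value decomposition (SVD) to predict missing values and derive principal components simultaneously, and it predicts preferences by clustering customers on their derived principal component values. Experiments on three real data sets showed that the developed method gives better results than the existing Eigentaste method in terms of prediction accuracy. Because it achieves good prediction accuracy without requiring gauge items, the developed method can be a suitable alternative to Eigentaste. The recommender system based on iterative PCA, however, needs a large amount of time to predict a new customer's preferences. To solve this problem, SVD update methods were introduced. Two representative SVD update methods, the method developed by Zha and Simon and the folding-in method, were adopted so that recommendations can be made in real time. When the two were compared in terms of prediction accuracy and computation time, the method of Zha and Simon gave better results than folding-in. In addition, the data handled in collaborative filtering may be nonlinear, since they reflect the tastes of diverse customers. The third study therefore introduces to collaborative filtering a dimension reduction method based on nonlinear principal component analysis. The proposed method was also developed so that the condensed information on existing customers can be updated whenever a new customer arrives. Experiments with real data show that the prediction accuracy of the developed method increases as new customers are added, confirming that it can be a useful recommender system when new customers arrive continuously. Finally, the three methods developed in this research were compared in terms of prediction accuracy and computation time. The experimental results confirmed that the SVD update method developed by Zha and Simon is the best dimension reduction method for collaborative filtering, and that the method based on nonlinear principal component analysis is the most suitable when new customers arrive continuously. The collaborative filtering developed in this thesis can predict new customers' preferences in real time and can be applied and extended to a variety of environments. For example, it can be applied not only to existing e-commerce sites but also to auction sites in which customers participate as both sellers and buyers.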

Other bibliographic information
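Call number	DIE 07001
Physical description	vii, 85 p. : illustrations ; 26 cm
Language	English
General note	Author's name in Korean: 김도현. Advisor's name in English: Bong-Jin Yum. Advisor's name in Korean: 염봉진. Published in: "Collaborative filtering based on iterative principal component analysis". Expert Systems with Applications, 28(4), 823-830 (2005)
Thesis	Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Industrial Engineering,
Bibliography note	References : p. 82-85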
