With the rapid development of World Wide Web (WWW) technology, the amount of available information is growing at an unprecedented rate. Collaborative filtering is one of the most successful recommender system technologies for reducing this information overload. It uses the known preferences of a group of users to predict the unknown preferences of a new user, on the assumption that if two users rate certain items similarly, they share similar tastes and hence will rate other items similarly. An advantage of collaborative filtering is that it can filter items such as movies and pictures, which are hard to analyze by automated processes. On the other hand, when a new user or item is added, existing collaborative filtering techniques need too much time to update user-related information. To avoid this problem, the Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) methods have been introduced. Both methods are used to reduce the dimensionality of the recommender system database.
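The similarity assumption behind collaborative filtering can be illustrated with a minimal sketch. It assumes a common textbook formulation (Pearson correlation over co-rated items, followed by a similarity-weighted average of neighbours' deviations from their own means); this is not necessarily the scheme used in the thesis, and the users, items, and ratings below are invented purely for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two users over the items both have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Predict a rating as the user's mean plus weighted neighbour deviations."""
    target = ratings[user]
    user_mean = sum(target.values()) / len(target)
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = pearson(target, other_ratings)
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += weight * (other_ratings[item] - other_mean)
        den += abs(weight)
    # Fall back to the user's own mean when no neighbour is informative.
    return user_mean + num / den if den else user_mean


# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 5, "m2": 3, "m3": 4, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
prediction = predict(ratings, "alice", "m4")
```

In this toy data "alice" agrees with "bob" and disagrees with "carol", so the prediction for "m4" lands near the high rating that "bob" gave. Every prediction scans all other users, which hints at why dimensionality reduction becomes attractive as the database grows.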
In this thesis, the prediction abilities of collaborative filtering using PCA and SVD are compared in terms of Mean Absolute Error (MAE) and computational complexity on the MovieLens data. For collaborative filtering using PCA, the K-means clustering method is newly adopted and Recursive Rectangular Clustering using averages is developed. Computational results indicate that PCA with the K-means clustering algorithm yields better results than the other PCA methods. Furthermore, PCA with K-means clustering shows about the same prediction accuracy as SVD. In terms of computational complexity, the PCA methods generally require less computation than the SVD method. In summary, PCA with K-means clustering may be recommended if a user can be guided to evaluate all the items in the gauge set. Otherwise, the SVD method can be used as an alternative.
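For reference, MAE is the average absolute difference between predicted and actual ratings, so lower values mean better prediction accuracy. The following minimal sketch uses the standard definition; the function name and the sample ratings are illustrative assumptions, not values from the thesis.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions and held-out ratings on a 1-5 scale.
mae = mean_absolute_error([4.0, 3.5, 2.0, 5.0], [5, 3, 2, 4])
```

Here the absolute errors are 1, 0.5, 0, and 1, giving an MAE of 0.625.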