Collaborative filtering (CF) is the process of predicting a user’s interest in various items, such as books or movies, based on taste information from many other users, typically expressed as item ratings. A key issue in collaborative filtering is data sparsity: most users rate only a small number of items. This paper’s first contribution is a probability-based distance measure adapted for use with sparse data; it can be used, for instance, with a nearest-neighbor method, or in graph-based methods to label the edges of the graph. Our second contribution is a novel probabilistic graph-based collaborative filtering algorithm, PGBCF, that employs this distance. By propagating probabilistic predictions through the user graph, PGBCF uses not only the ratings of direct neighbors but also the information available from indirect neighbors. Experiments show that both the adapted distance measure and the graph-based collaborative filtering algorithm lead to more accurate predictions.