One approach to web personalization is to mine typical user profiles from the large volume of historical data stored in access logs. Clustering techniques have recently been used to discover such profiles automatically, but designing an effective similarity measure between session vectors is difficult because these vectors are usually high-dimensional and sparse. We present a new approach based on non-negative matrix factorization (NMF). NMF is applied to reduce the dimensionality of the session-URL matrix, and the projected user session vectors are then clustered into typical user session profiles with the spherical k-means algorithm. Experimental results show that the algorithm mines interesting user profiles effectively.
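The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy session-URL matrix, the rank of 2, the use of Lee-Seung multiplicative updates for NMF, the farthest-first seeding of spherical k-means, and the function names `nmf` and `spherical_kmeans` are all assumptions made for illustration. The rows of the factor `W` are taken here as the reduced (projected) session vectors, which is one plausible reading of the abstract.

```python
import numpy as np

def nmf(V, rank, n_iter=300, seed=0, eps=1e-9):
    """Approximate V (sessions x URLs) as W @ H with non-negative factors.

    Uses multiplicative updates for the Frobenius-norm objective
    (an assumed variant; the abstract does not name one).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def spherical_kmeans(X, k, n_iter=50, eps=1e-12):
    """Cluster the rows of X by cosine similarity with unit-length centers."""
    # Project every vector onto the unit sphere.
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)
    # Farthest-first seeding (an assumed, deterministic choice): start from
    # row 0, then repeatedly add the row least similar to the chosen seeds.
    idx = [0]
    for _ in range(1, k):
        closest = (X @ X[idx].T).max(axis=1)
        idx.append(int(np.argmin(closest)))
    centers = X[idx].copy()
    for _ in range(n_iter):
        # Assign each vector to the center with the highest cosine similarity.
        labels = np.argmax(X @ centers.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                centers[j] = c / (np.linalg.norm(c) + eps)
    labels = np.argmax(X @ centers.T, axis=1)
    return labels, centers

# Hypothetical toy data: 6 sessions x 6 URLs, entries are visit counts.
# Sessions 0-2 browse URLs 0-2; sessions 3-5 browse URLs 3-5.
V = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [3, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
    [0, 0, 0, 3, 1, 2],
], dtype=float)

W, H = nmf(V, rank=2)                      # W: reduced session vectors
labels, centers = spherical_kmeans(W, k=2)  # cluster in the reduced space
profiles = centers @ H                      # cluster centers mapped back to URL space
print(labels)
```

In this sketch each row of `profiles` gives a URL weighting for one cluster, which can be read as a rough user profile. Clustering is done on cosine similarity because the length of a session vector mostly reflects how many pages were visited, not which ones.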