Journal of Organizational Computing and Electronic Commerce

Application of the Information Bottleneck method to discover user profiles in a Web store



Abstract

The paper addresses the problem of discovering groups of Web users with similar behavioral patterns on an e-commerce site. We introduce a novel approach to the unsupervised classification of user sessions, based on session attributes related to the user's click-stream behavior, to gain insight into the characteristics of various user profiles. The approach uses the agglomerative Information Bottleneck (IB) algorithm. The efficiency of the approach, in terms of its ability to differentiate between buying and non-buying sessions, was validated on log data from a real online store, indicating some possible practical applications of our method. Experiments performed on a number of session samples showed that the method is capable of separating the two types of sessions to a large extent. A detailed analysis was performed for numbers of clusters ranging from two to seven, and the results were compared with those achieved by applying the most common clustering algorithm, k-means. Increasing the number of clusters generally leads to better results for both algorithms. However, IB demonstrated much higher average efficiency than k-means for the corresponding number of clusters, and this superiority was especially clear for lower numbers of clusters. The IB-based division of user sessions into seven clusters gives a mean entropy value of 0.28, which corresponds to a 95% separation of the two session types. Furthermore, a major advantage of our approach is that it makes it possible to analyze the probability distribution of session attributes in individual clusters, which allows one to discover hidden knowledge about common characteristics of various user profiles and to use this knowledge to support managerial decisions.
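The link between a mean entropy of 0.28 and a 95% separation can be checked with a short calculation. The sketch below is illustrative only: the abstract does not define how the mean entropy is computed, so the size-weighted average, the function names, and the session counts are all assumptions. It shows that a cluster in which 95% of sessions belong to one class (buying or non-buying) has a binary entropy of roughly 0.29 bits, close to the reported figure.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a two-class split where one class has share p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a pure cluster carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_cluster_entropy(clusters):
    """Size-weighted mean entropy over clusters.

    `clusters` is a list of (buying, non_buying) session counts.
    Weighting by cluster size is an assumption; the abstract does not
    say whether the paper uses a weighted or a plain average.
    """
    total = sum(b + n for b, n in clusters)
    return sum(
        (b + n) / total * binary_entropy(b / (b + n))
        for b, n in clusters
        if b + n > 0
    )


# Hypothetical counts: two clusters, each 95% dominated by one session type.
example = [(95, 5), (5, 95)]
print(round(binary_entropy(0.95), 3))
print(round(mean_cluster_entropy(example), 3))
```

Lower values mean purer clusters: 0 bits for a cluster containing only one session type, 1 bit for an even mix of buying and non-buying sessions.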
