Journal: International journal of multidisciplina

REFINEMENT OF CLUSTERS BASED ON DISSIMILARITY MEASURES


Abstract

Session clustering is one way to improve web site structure, provide recommendations, and support web personalization; it works by creating a pool of web pages from similar sessions within a cluster. Session clustering depends on the similarity/dissimilarity measures used to compare the sessions, and it is currently one of the most sought-after research problems in Web Usage Mining. In this paper we propose a refinement technique for session clusters, with the purpose of solving the scalability problem of log data and reducing the domain of recommendations offered to the end user. The technique considers page access, access time, and session weight dissimilarity within a cluster for refinement. We also present an analysis of different available dissimilarity measures, such as Simple Difference, Jaccard, Variance, Pattern Difference, and BLWMN, based on the features defined above. The results show that clusters refined using a combination of the web page access feature (Jaccard dissimilarity) and the time feature (Cosine dissimilarity) are of better quality than those obtained from other feature combinations. The results are also compared with the available dissimilarity measure based on the number of hits on a page in a session; in the experiments, the proposed approach produced better-quality clusters. Quality analysis of the refined clusters is done using internal cluster quality measures.
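As a rough illustration of the two measures the abstract singles out, the sketch below computes a Jaccard dissimilarity over the sets of pages visited and a Cosine dissimilarity over per-page access times. The session representation (a dict mapping page to seconds spent), the function names, and the weighted combination in `combined_dissimilarity` are all assumptions made for illustration; the abstract does not say how the paper encodes sessions or combines the two features.

```python
import math


def jaccard_dissimilarity(pages_a, pages_b):
    """1 - |A ∩ B| / |A ∪ B| over the sets of pages visited."""
    a, b = set(pages_a), set(pages_b)
    union = a | b
    if not union:
        return 0.0  # two empty sessions are treated as identical
    return 1.0 - len(a & b) / len(union)


def cosine_dissimilarity(time_a, time_b):
    """1 - cosine similarity of per-page access-time vectors.

    time_a and time_b map page -> seconds spent (hypothetical encoding).
    Pages missing from a session count as zero time.
    """
    pages = set(time_a) | set(time_b)
    dot = sum(time_a.get(p, 0.0) * time_b.get(p, 0.0) for p in pages)
    norm_a = math.sqrt(sum(v * v for v in time_a.values()))
    norm_b = math.sqrt(sum(v * v for v in time_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # no time information: treat as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def combined_dissimilarity(session_a, session_b, page_weight=0.5):
    """Weighted mix of the two measures.

    The linear weighting is an assumption, not the paper's stated method.
    """
    d_page = jaccard_dissimilarity(session_a.keys(), session_b.keys())
    d_time = cosine_dissimilarity(session_a, session_b)
    return page_weight * d_page + (1.0 - page_weight) * d_time


# Two made-up sessions: page -> seconds spent on that page.
s1 = {"/home": 30, "/products": 120, "/cart": 45}
s2 = {"/products": 90, "/cart": 60, "/checkout": 20}

d_page = jaccard_dissimilarity(s1, s2)  # 2 shared pages out of 4 distinct
d_time = cosine_dissimilarity(s1, s2)
d_both = combined_dissimilarity(s1, s2)
```

Under this encoding, a refinement step could drop or reassign any session whose combined dissimilarity to the rest of its cluster exceeds a chosen threshold, though the abstract does not describe the refinement rule itself.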
