Journal: Procedia Computer Science

Clustering of Web User Sessions to Maintain Occurrence of Sequence in Navigation Pattern

Abstract

Web log data available on the server side helps identify the most appropriate pages for a user request. Analyzing web log data is challenging because it contains abundant information about each web page. This paper proposes a novel technique that pre-processes web log data to extract the sequence of occurrence and the navigation patterns useful for prediction. Each URL in the web log data is parsed into tokens based on the web structure, and the tokens are uniquely identified so that URLs can be classified. The sequence of URLs a user navigates within a 30-minute period is treated as a session, which represents that user's navigation pattern. Sessions from multiple users are clustered using hierarchical agglomerative clustering to analyze the occurrence of sequences in the navigation patterns. In each cluster, one session is identified as the representative because it holds the most pages in the sequence; the other sessions in the cluster are subsets of the representative session. These representative navigation patterns are useful for predicting the most appropriate pages for a user request. The proposed model is tested on web log files from NASA and enggresources.
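The pipeline described in the abstract (tokenize URLs, cut 30-minute sessions, cluster them, pick a representative per cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the Jaccard distance, the single-linkage rule, and the `threshold` value are all assumptions made for illustration, since the abstract does not specify them. The sketch also compares sessions as unordered page sets, which simplifies the sequence-aware analysis the paper describes.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse

SESSION_WINDOW = timedelta(minutes=30)  # session length stated in the abstract


def tokenize(url):
    """Split a URL path into tokens along the site's directory structure."""
    return [part for part in urlparse(url).path.split("/") if part]


def sessionize(requests):
    """Group (user, timestamp, url) records into per-user sessions.

    Assumption: a session holds the requests made within 30 minutes of
    its first request; a later request from that user opens a new session.
    """
    sessions = []
    open_session = {}  # user -> (session start time, list of URLs)
    for user, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        current = open_session.get(user)
        if current is None or ts - current[0] > SESSION_WINDOW:
            current = (ts, [])
            open_session[user] = current
            sessions.append((user, current[1]))
        current[1].append(url)
    return sessions


def distance(a, b):
    """Jaccard distance between the page sets of two sessions (assumed metric)."""
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / len(a | b)


def cluster_sessions(sessions, threshold=0.6):
    """Single-linkage agglomerative clustering (assumed linkage rule).

    Starts with one cluster per session and repeatedly merges the closest
    pair until no pair is within `threshold` of each other.
    """
    clusters = [[s] for s in sessions]

    def linkage(pair):
        i, j = pair
        return min(distance(x, y) for x in clusters[i] for y in clusters[j])

    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=linkage)
        if linkage((i, j)) > threshold:
            break
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def representative(cluster):
    """Pick the session covering the most distinct pages (first one on ties)."""
    return max(cluster, key=lambda urls: len(set(urls)))
```

A typical use would pass parsed log records to `sessionize`, feed the resulting URL lists to `cluster_sessions`, and call `representative` on each cluster. An order-aware distance, such as one based on the longest common subsequence, could replace `distance` to stay closer to the sequence analysis the abstract emphasizes.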