Journal: Intelligent Data Analysis

Identifying user sessions from web server logs with integer programming


Abstract

Web usage mining has proven to be an important advance for e-business systems, both by finding web user buying patterns and by suggesting ways to improve web user navigation. A primary input for web usage mining is web user sessions, which must be constructed from web server logs (a process called sessionization) when such sessions are not otherwise identified. We use bipartite cardinality matching and a more general integer program to construct sessions. We also propose several variations of our integer program to provide additional insights into session characteristics. For testing, we retrieve 15 months of web server logs and the corresponding real sessions from an academic web site. We compare real sessions, results obtained by our optimization models, and results from a commonly used timeout heuristic. We find that our optimization models dominate the timeout heuristic on several comparison measures. Solution time for a typical month is seven hours for our integer program, 30 minutes for our bipartite cardinality matching, and about one minute for the heuristic. Although solution time is significantly greater for the integer program, its variations contribute additional analysis of web user behavior.
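
For readers unfamiliar with the baseline, the following is a minimal sketch of one common form of timeout heuristic: requests from the same IP address are grouped into one session until the gap between consecutive requests exceeds a threshold. The abstract does not specify which variant or threshold the authors used, so the function name `sessionize`, the `(ip, timestamp, url)` record layout, and the 30-minute default are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize(requests, timeout=timedelta(minutes=30)):
    """Group (ip, timestamp, url) records into sessions.

    Assumed rule (not taken from the paper): a new session starts
    whenever the gap since the previous request from the same IP
    exceeds `timeout`.
    """
    # Bucket requests by IP address.
    by_ip = defaultdict(list)
    for ip, ts, url in requests:
        by_ip[ip].append((ts, url))

    sessions = []
    for ip, hits in by_ip.items():
        hits.sort(key=lambda h: h[0])  # chronological order per IP
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Inactivity gap too long: close the current session.
                sessions.append((ip, current))
                current = []
            current.append(hit)
        sessions.append((ip, current))
    return sessions


if __name__ == "__main__":
    # Hypothetical log records, for illustration only.
    log = [
        ("10.0.0.1", datetime(2024, 1, 1, 9, 0), "/"),
        ("10.0.0.1", datetime(2024, 1, 1, 9, 10), "/courses"),
        ("10.0.0.1", datetime(2024, 1, 1, 10, 0), "/contact"),
    ]
    for ip, hits in sessionize(log):
        print(ip, [url for _, url in hits])
```

A rule like this looks only at timestamps, which is one reason it runs quickly; the matching and integer programming models described in the abstract instead choose sessions by optimization, at a much higher computational cost.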
