Journal: 计算机研究与发展 (Journal of Computer Research and Development)

An Efficient Multi-Purpose Mining Algorithm for Web Logs (Web日志的高效多能挖掘算法)

     

Abstract

Similar customer groups, relevant Web pages, and frequent access paths can be discovered by analyzing Web server log files and customer transaction data. This paper presents a novel Web log mining algorithm. First, based on a directed-graph definition of the Web site, a URL-UserID association matrix is built with URLs as rows and UserIDs as columns; each element holds the number of hits by that user on that URL. Second, similar customer groups are discovered by measuring the similarity between column vectors, and relevant Web pages are obtained by measuring the similarity between row vectors. Frequent access paths can then be discovered by further processing the relevant pages. Experimental results demonstrate the effectiveness of the algorithm.
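The matrix construction and the two similarity passes described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so cosine similarity is an assumption here, as are the function names (`build_matrix`, `cosine`, `similar_pairs`), the `(user_id, url)` record format, and the threshold value. The frequent-access-path step is not covered, since the abstract gives no detail on it.

```python
import math

def build_matrix(records):
    """Build a URL-UserID hit-count matrix from (user_id, url) pairs.

    Rows are URLs, columns are UserIDs, and each cell counts hits.
    """
    urls = sorted({url for _, url in records})
    users = sorted({user for user, _ in records})
    row = {u: i for i, u in enumerate(urls)}
    col = {u: j for j, u in enumerate(users)}
    matrix = [[0] * len(users) for _ in urls]
    for user, url in records:
        matrix[row[url]][col[user]] += 1
    return urls, users, matrix

def cosine(a, b):
    """Cosine similarity (an assumed measure, not taken from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similar_pairs(labels, vectors, threshold):
    """Return label pairs whose vectors reach the similarity threshold."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((labels[i], labels[j]))
    return pairs

# Hypothetical log records, invented for illustration only.
records = [
    ("u1", "/a"), ("u1", "/a"), ("u1", "/b"),
    ("u2", "/a"), ("u2", "/a"), ("u2", "/b"),
    ("u3", "/c"),
]
urls, users, matrix = build_matrix(records)

# Column vectors describe users; row vectors describe pages.
columns = [list(c) for c in zip(*matrix)]
customer_groups = similar_pairs(users, columns, threshold=0.9)
related_pages = similar_pairs(urls, matrix, threshold=0.9)
print(customer_groups, related_pages)
```

In this toy log, users `u1` and `u2` have identical hit vectors and are paired as similar customers, while `/a` and `/b` are visited in the same proportions by the same users and are paired as related pages. A real log would need sessionization and user identification before this step.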
