Source: 2015 International Conference on Advances in Computer Engineering and Applications

Preprocessing web logs: A critical phase in web usage mining

Abstract

Web usage mining is the discovery of user access patterns from a website's web logs. Raw web logs are highly unstructured, which makes them unsuitable for direct mining. They therefore pass through a preprocessing stage that both makes them suitable for analysis and significantly reduces file size. This paper examines the preprocessing phase in detail and proposes a "total and absolute" tool for it, which removes irrelevant and noisy data and transforms the remainder into a form that can be readily analyzed. The tool is called total and absolute because, after cleaning the data, it displays summary statistics for the preprocessed records: the number of records supplied as input, the number of elements remaining after preprocessing, and the time taken to complete the task. Finally, it exports the preprocessed data to a .log file that can be easily imported into any data mining utility. The summary statistics and export features distinguish this tool from those proposed earlier.
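
The pipeline the abstract describes (clean the records, report summary statistics, export a .log file) can be illustrated with a minimal Python sketch. This is not the authors' tool. The abstract does not state the log format or the cleaning rules, so the sketch assumes Common Log Format input and a hypothetical rule that keeps only successful GET requests for pages, dropping images, stylesheets, scripts, and robots.txt hits. The names `CLF_PATTERN`, `STATIC_SUFFIXES`, `is_relevant`, and `preprocess` are illustrative.

```python
import re
import time
from pathlib import Path

# Assumption: input follows the Common Log Format; the abstract names no format.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)(?: \S+)?" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: these suffixes mark embedded resources rather than page views.
STATIC_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def is_relevant(record):
    """Hypothetical cleaning rule: keep successful GET page requests only."""
    url = record["url"].split("?", 1)[0].lower()
    return (
        record["method"] == "GET"
        and record["status"].startswith("2")
        and not url.endswith(STATIC_SUFFIXES)
        and url != "/robots.txt"
    )


def preprocess(lines, out_path):
    """Clean raw log lines, export the kept ones, and return summary statistics."""
    start = time.perf_counter()
    total = 0
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are not counted as records
        total += 1
        match = CLF_PATTERN.match(line)
        # Unparseable lines count as input but are treated as noise.
        if match and is_relevant(match.groupdict()):
            kept.append(line)
    # Export the cleaned records, one per line, to the requested .log file.
    Path(out_path).write_text("".join(l + "\n" for l in kept), encoding="utf-8")
    return {
        "input_records": total,
        "output_records": len(kept),
        "elapsed_seconds": time.perf_counter() - start,
    }
```

Calling `preprocess(open("access.log"), "cleaned.log")` (both file names are placeholders) returns a dictionary holding the three figures the abstract lists: records read, records kept, and elapsed time. A real tool would likely add further steps such as user and session identification, which the abstract does not detail.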
