Journal of Information Science

Micro-mining and segmented log file analysis: a method for enriching the data yield from Internet log files

Abstract

The authors propose improved ways of analysing web server log files. Traditionally, web site statistics give a big-picture (and shallow) analysis based on all transaction log entries. These pictures are distorted, however, by two problems: the difficulty of resolving Internet protocol (IP) numbers to a single user, and cross-border IP registration. The authors argue that analysing extracted sub-groups and categories presents a more accurate picture of the data, and that analysing the online behaviour of selected individuals (rather than of very large groups) can add much to our understanding of how people use web sites and, indeed, any digital information source. The analysis is labelled 'micro' to distinguish it from traditional macro, big-picture transactional log analysis. The methods are illustrated using the logs of the SurgeryDoor (www.surgerydoor.co.uk) consumer health web site. Use attributed to academic users was found to give a better approximation of the geographical distribution of the site's users than an analysis based on all users. This is because academic institutions, unlike other user types, register in their host country. Selecting log entries where each user is allocated a unique IP number can be particularly beneficial, especially for analyses of returnees. Finally, as an example of the application of microanalysis, the paper tracks the online behaviour of a small number of IP numbers.
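The segmentation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample records, the function names, and the suffix-based test for academic hosts are all assumptions made for illustration, and the sketch presumes IP numbers have already been resolved to host names.

```python
from collections import Counter

# Invented sample records (host, timestamp, path); not data from the study.
SAMPLE_LOG = [
    ("pc1.lib.example.ac.uk", "2002-03-01T09:00:00", "/"),
    ("pc1.lib.example.ac.uk", "2002-03-01T09:02:10", "/health/asthma"),
    ("proxy.example.com", "2002-03-01T09:03:00", "/"),
    ("lab.example.edu.au", "2002-03-01T10:15:00", "/health/diet"),
    ("proxy.example.com", "2002-03-01T11:00:00", "/news"),
]


def is_academic(host):
    """Crude heuristic (an assumption): 'ac' or 'edu' in the last two labels."""
    labels = host.lower().split(".")
    return "ac" in labels[-2:] or "edu" in labels[-2:]


def country_of(host):
    """Return a two-letter country-code suffix, or 'unknown' for generic domains."""
    tld = host.lower().rsplit(".", 1)[-1]
    return tld if len(tld) == 2 and tld.isalpha() else "unknown"


def country_counts(records, academic_only=False):
    """Count requests per country, optionally restricted to academic hosts."""
    counts = Counter()
    for host, _timestamp, _path in records:
        if academic_only and not is_academic(host):
            continue
        counts[country_of(host)] += 1
    return counts


def trail_for(records, host):
    """Micro view: the time-ordered pages requested by one host."""
    return sorted((ts, path) for h, ts, path in records if h == host)


if __name__ == "__main__":
    print("all users:    ", dict(country_counts(SAMPLE_LOG)))
    print("academic only:", dict(country_counts(SAMPLE_LOG, academic_only=True)))
    print("one host:     ", trail_for(SAMPLE_LOG, "pc1.lib.example.ac.uk"))
```

In this toy data the all-users count contains hosts whose country cannot be inferred from the domain, while the academic-only segment does not. That mirrors the abstract's argument only qualitatively; a real analysis would need reverse DNS lookups and a more careful classification of host types.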