Journal: Data Mining and Knowledge Discovery

Web robot detection techniques: overview and limitations

Abstract

Most modern Web robots that crawl the Internet to support value-added services and technologies possess sophisticated data collection and analysis capabilities. Some of these robots, however, may be ill-behaved or malicious, and hence, may impose a significant strain on a Web server. It is thus necessary to detect Web robots in order to block undesirable ones from accessing the server. Such detection is also essential to ensure that the robot traffic is considered appropriately in the performance and capacity planning of Web servers. Despite a variety of Web robot detection techniques, there is no consensus regarding a single technique, or even a specific “type” of technique, that performs well in practice. Therefore, to aid in the development of a practically applicable robot detection technique, this survey presents a critical analysis and comparison of the prevalent detection approaches. We propose a framework to classify the existing detection techniques into four categories based on their underlying detection philosophy. We compare the different classes to gain insights into those characteristics that make up an effective robot detection scheme. Finally, we discuss why the contemporary techniques fail to offer a general solution to the robot detection problem and propose a set of key ingredients necessary for strong Web robot detection.
