Realistic Traffic Generation for Web Robots

Abstract

Realistic web traffic generators are critical to evaluating the capacity, scalability, and availability of web systems. Although web traffic generation is a classic research problem, no existing generator accounts for the characteristics of web robots or crawlers, which are now the dominant source of traffic to a web server. Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever-increasing levels of web robot traffic. To resolve this problem, this paper introduces a novel approach to generating synthetic web robot traffic with high fidelity. The approach accounts for both the temporal and behavioral qualities of robot traffic by using statistical and Bayesian models fitted to the properties of robot traffic seen in web logs from North America and Europe. We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data, specifically session arrival rates, inter-arrival times, and session lengths. Finally, we show that our generated traffic affects cache performance similarly to actual traffic under the common LRU and LFU eviction policies.
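The cache comparison mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the list-of-URLs trace format, and the tie-breaking rule in the LFU variant are all assumptions made for illustration. The idea is to replay a real trace and a generated trace through the same simulated cache and compare the resulting hit ratios.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Fraction of requests served by an LRU cache of the given capacity.

    `trace` is assumed to be a list of requested resource identifiers.
    """
    cache = OrderedDict()
    hits = 0
    for url in trace:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used entry
            cache[url] = True
    return hits / len(trace) if trace else 0.0


def lfu_hit_ratio(trace, capacity):
    """Fraction of requests served by a simple LFU cache.

    Assumption: frequency counts are kept only while an item is cached,
    and ties are broken in favor of evicting the oldest inserted item.
    """
    counts = {}  # cached url -> number of accesses since insertion
    hits = 0
    for url in trace:
        if url in counts:
            hits += 1
            counts[url] += 1
        else:
            if len(counts) >= capacity:
                victim = min(counts, key=counts.get)  # least frequently used
                del counts[victim]
            counts[url] = 1
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    # Hypothetical toy traces standing in for real and generated robot sessions.
    real_trace = ["a", "b", "a", "c", "b", "a"]
    generated_trace = ["a", "b", "a", "c", "a", "b"]
    for name, trace in [("real", real_trace), ("generated", generated_trace)]:
        print(name, lru_hit_ratio(trace, 2), lfu_hit_ratio(trace, 2))
```

If the generated traffic is faithful in the sense the abstract describes, the hit ratios for the two traces should be close across a range of cache capacities.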