Foreign-language conference: Web engineering

A Strategy for Efficient Crawling of Rich Internet Applications

Abstract

New web application development technologies such as Ajax, Flex or Silverlight produce so-called Rich Internet Applications (RIAs), which offer enhanced responsiveness but introduce crawling challenges that traditional crawlers cannot address. This paper describes a novel crawling technique for RIAs. The technique first generates an optimal crawling strategy for an anticipated model of the RIA being crawled, with the aim of discovering new states as quickly as possible. As the strategy is executed, if the discovered portion of the application's actual model deviates from the anticipated model, both the anticipated model and the strategy are updated to conform to the actual model. We compare the performance of our technique with that of a number of existing techniques, as well as with depth-first and breadth-first crawling, on several Ajax test applications. The results show that our technique performs better, often with a faster rate of state discovery.
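The abstract does not give the strategy itself, so the following is only a minimal sketch of the general idea of re-planning as the discovered model grows. It is not the authors' algorithm. It assumes a toy application represented as a dictionary mapping each state to its events and their target states (`TOY_APP`, a hypothetical name), and it uses a simple greedy rule as a stand-in: always execute the nearest event that has not been tried yet, and reload the start page when no such event is reachable. The function name `crawl` and the cost model (one unit per event execution or reset) are likewise assumptions made for illustration.

```python
from collections import deque


def crawl(app, start):
    """Greedy crawl of a toy event-driven app (illustrative stand-in only).

    `app` maps state -> {event: next_state}; a real crawler would instead
    read the available events from the page and observe the resulting state.
    Returns (discovered_model, cost), where cost counts events and resets.
    """
    known = {start: {}}  # transitions discovered so far
    current, cost = start, 0

    def nearest_unexplored(src):
        # Breadth-first search over known transitions for the closest state
        # that still has an unexecuted event; returns the events to replay.
        queue, seen = deque([(src, [])]), {src}
        while queue:
            state, path = queue.popleft()
            if len(known[state]) < len(app[state]):
                return path
            for event, nxt in known[state].items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [event]))
        return None

    while True:
        path = nearest_unexplored(current)
        if path is None:
            # Nothing reachable from here: try again from the start state.
            path = nearest_unexplored(start)
            if path is None:
                break  # every event of every discovered state was executed
            current, cost = start, cost + 1  # a reset (page reload)
        for event in path:  # replay known events to reach the target state
            current = known[current][event]
            cost += 1
        # Execute one event that has not been tried in this state yet.
        event = next(e for e in app[current] if e not in known[current])
        nxt = app[current][event]
        known[current][event] = nxt
        known.setdefault(nxt, {})  # register a newly discovered state
        current, cost = nxt, cost + 1
    return known, cost


# Hypothetical four-state application used only for this example.
TOY_APP = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"a": "s3"},
    "s2": {"a": "s3"},
    "s3": {},
}

model, cost = crawl(TOY_APP, "s0")
print(len(model), "states discovered at cost", cost)
```

The plan is recomputed after every newly observed transition, which loosely mirrors the abstract's point that the strategy is updated whenever the actual model deviates from what was anticipated. A model-based strategy as described in the abstract would additionally use the anticipated model to choose which events to try first.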