Crawling AJAX-Based Web Applications through Dynamic Analysis of User Interface State Changes


Abstract

Using JavaScript and dynamic DOM manipulation on the client side is becoming a widespread approach for achieving rich interactivity and responsiveness in modern Web applications. At the same time, such techniques, collectively known as AJAX, shatter the concept of webpages with unique URLs, on which traditional Web crawlers are based. This article describes a novel technique for crawling AJAX-based applications through automatic dynamic analysis of user-interface state changes in Web browsers. Our algorithm scans the DOM tree, spots candidate elements that are capable of changing the state, fires events on those candidate elements, and incrementally infers a state machine that models the various navigational paths and states within an AJAX application. This inferred model can be used, for instance, in program comprehension, in analysis and testing of dynamic Web states, or for generating a static version of the application. In this article, we discuss our sequential and concurrent AJAX crawling algorithms. We present our open-source tool called Crawljax, which implements the concepts and algorithms discussed in this article. Additionally, we report several empirical studies in which we apply our approach to a number of open-source and industrial Web applications and elaborate on the obtained results.
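
The crawl loop summarized in the abstract (scan the DOM, pick candidate elements, fire events, grow a state machine) can be illustrated with a minimal sketch. This is not Crawljax's code or API: the toy application `TOY_APP`, the helpers `state_id`, `candidate_elements`, and `fire_event`, and the breadth-first exploration order are all assumptions made for illustration. A browser is replaced by an in-memory dictionary, and states are compared by hashing the DOM string, which is only one of several possible ways to decide whether two states are the same.

```python
import hashlib
from collections import deque

# Hypothetical toy application standing in for a browser: each DOM snapshot
# (a plain string) maps the ids of its clickable elements to the DOM that
# results from clicking them.
TOY_APP = {
    "<div id='home'>": {"news": "<div id='news'>", "about": "<div id='about'>"},
    "<div id='news'>": {"home": "<div id='home'>", "more": "<div id='news-more'>"},
    "<div id='about'>": {"home": "<div id='home'>"},
    "<div id='news-more'>": {"news": "<div id='news'>"},
}


def state_id(dom):
    """Identify a UI state by a short hash of its DOM (an assumed comparison rule)."""
    return hashlib.sha1(dom.encode("utf-8")).hexdigest()[:8]


def candidate_elements(dom):
    """Return the elements that may change the state (here: all clickable ids)."""
    return sorted(TOY_APP.get(dom, {}))


def fire_event(dom, element):
    """Simulate a click on `element` and return the resulting DOM."""
    return TOY_APP[dom][element]


def crawl(index_dom):
    """Explore the toy application and return the inferred state machine.

    Returns (states, edges): states maps state id -> DOM, and edges is a list
    of (source id, clicked element, target id) transitions.
    """
    states = {state_id(index_dom): index_dom}
    edges = []
    queue = deque([index_dom])
    while queue:
        dom = queue.popleft()
        source = state_id(dom)
        for element in candidate_elements(dom):
            new_dom = fire_event(dom, element)
            if new_dom == dom:
                continue  # the event did not change the UI state
            target = state_id(new_dom)
            if target not in states:
                states[target] = new_dom  # newly discovered state
                queue.append(new_dom)
            edges.append((source, element, target))
    return states, edges


if __name__ == "__main__":
    states, edges = crawl("<div id='home'>")
    for source, element, target in edges:
        print(f"{source} --click {element}--> {target}")
```

The sketch sidesteps one problem a browser-based crawler has to solve: after firing an event, the browser is in the new state, so exploring the remaining candidates of the earlier state requires some way of getting back to it. Here the lookup table makes every state directly reachable, which a real application does not.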

Bibliographic details

  • Source
    ACM Transactions on the Web | 2012, Issue 1 | pp. 3.1-3.30 | 30 pages
  • Author affiliations

    Department of Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, V6T1Z4 Vancouver, BC, Canada;

    Delft University of Technology Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628CD Delft, The Netherlands;

    Delft University of Technology Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628CD Delft, The Netherlands;

  • Original format: PDF
  • Language: English
  • Keywords

    crawling; ajax; web 2.0; hidden web; dynamic analysis; DOM crawling;

