AJAX Crawl: Making AJAX Applications Searchable

Abstract

Search engines such as Google and Yahoo! are the prevalent way to search the Web. Search over dynamic client-side Web pages, however, is either nonexistent or far from perfect, and it is not addressed by existing work, for example on the Deep Web. This is a real impediment, since AJAX and Rich Internet Applications are already very common on the Web. AJAX applications are composed of states that the user can see but the search engine cannot, and that the user changes by triggering client-side events. Current search engines either ignore AJAX applications or produce false negatives. The reason is that crawling client-side code is a difficult problem that cannot be solved naively by invoking user events. The challenges are the lack of caching, duplicate state detection, very granular events, reducing the number of AJAX calls, and infinite event invocation. This paper sets the stage for this new search challenge and proposes a solution: it shows how an AJAX Web application can be crawled at the granularity of application states. A model of AJAX Web sites is presented. An AJAX crawler and optimizations for caching and duplicate elimination are defined. Finally, the gain in search result quality and the corresponding performance price are evaluated on YouTube, a real AJAX application.
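
To make the idea of crawling at the granularity of application states concrete, here is a minimal Python sketch. It is not the crawler defined in the paper. It assumes a toy model in which a state is just the visible text of a page and an event is a function from one state to the next. The names `CachingFetcher`, `crawl`, `fake_server`, and the `/comments?page=N` URL are hypothetical and exist only for illustration. The sketch touches three of the challenges listed above: duplicate state detection (content hashing), reducing the number of AJAX calls (a response cache), and infinite event invocation (a cap on discovered states).

```python
import hashlib
from collections import deque
from typing import Callable, Dict, List, Tuple

# Assumption: an event is modelled as a function from state content to the
# content of the state reached after firing it.
Event = Callable[[str], str]


class CachingFetcher:
    """Wraps a fetch function so each URL reaches the server at most once."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}
        self.network_calls = 0

    def __call__(self, url: str) -> str:
        if url not in self._cache:
            self.network_calls += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def state_id(content: str) -> str:
    """Fingerprint a state by hashing its content (duplicate detection)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def crawl(
    initial: str, events: Dict[str, Event], max_states: int = 100
) -> Tuple[Dict[str, str], List[Tuple[str, str, str]]]:
    """Breadth-first exploration of states reachable by firing events.

    Returns the discovered states (fingerprint -> content) and the list of
    transitions as (source_id, event_name, target_id) triples.
    """
    start = state_id(initial)
    states = {start: initial}
    transitions: List[Tuple[str, str, str]] = []
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for name, fire in events.items():
            content = fire(states[src])
            dst = state_id(content)
            if dst not in states:
                if len(states) >= max_states:
                    continue  # bound exploration instead of looping forever
                states[dst] = content
                queue.append(dst)
            transitions.append((src, name, dst))
    return states, transitions


# Hypothetical toy application: three pages of comments with next/prev links.
def fake_server(url: str) -> str:
    return "comments page " + url.split("=")[1]


def page_of(content: str) -> int:
    return int(content.rsplit(" ", 1)[1])


fetch = CachingFetcher(fake_server)
events: Dict[str, Event] = {
    "next": lambda s: fetch(f"/comments?page={min(page_of(s) + 1, 3)}"),
    "prev": lambda s: fetch(f"/comments?page={max(page_of(s) - 1, 1)}"),
}

states, transitions = crawl("comments page 1", events)
print(len(states), len(transitions), fetch.network_calls)
```

In this toy run the six event invocations lead to only three distinct states, and the cache keeps the number of simulated server requests below the number of invocations. A real crawler would have to execute JavaScript in a browser engine and compare DOM content, which this sketch deliberately leaves out.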

Bibliographic details

Source: 2009 IEEE 25th International Conference on Data Engineering (ICDE), 2009, pp. 78-89 (12 pages)
Authors: Cristian Duda, Gianni Frey, Donald Kossmann, Reto Matter, Chong Zhou
Original format: PDF
Chinese Library Classification: Industrial technology

