Many web applications use AJAX to enhance the user experience, but several properties of AJAX make these applications difficult for traditional search engines to crawl. Google's AJAX crawling scheme requires webmasters to change their website architecture and add extra code, and as a result it is supported only by Google. To address this, the paper presents an AJAX crawling scheme based on the Document Object Model (DOM) and a breadth-first crawling algorithm. By tracking the changes that AJAX events make to the DOM tree, the scheme builds a state transition graph of an AJAX web application and then generates a static mirror site of the original application. Experimental results show that the scheme can crawl AJAX applications.
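The idea of breadth-first exploration over DOM states can be illustrated with a minimal sketch. This is not the paper's implementation: the names `crawl_bfs`, `dom_fingerprint`, `mirror_pages`, and the dictionary-based `TOY_APP` are hypothetical, and a real crawler would drive a browser to fire events rather than look transitions up in a table. The sketch assumes that two states are equal when their serialized DOMs hash to the same value.

```python
from collections import deque
import hashlib

def dom_fingerprint(dom: str) -> str:
    """Hash a serialized DOM so states can be compared cheaply (assumed equality test)."""
    return hashlib.sha256(dom.encode("utf-8")).hexdigest()[:12]

def crawl_bfs(initial_dom, list_events, fire_event, max_states=100):
    """Explore DOM states breadth-first.

    Returns (states, edges): states maps fingerprint -> DOM,
    edges is a list of (source, event, target) transitions.
    """
    start = dom_fingerprint(initial_dom)
    states = {start: initial_dom}
    edges = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        dom = states[current]
        for event in list_events(dom):
            new_dom = fire_event(dom, event)
            target = dom_fingerprint(new_dom)
            if target == current:
                continue  # the event did not change the DOM
            if target not in states:
                if len(states) >= max_states:
                    continue  # state budget exhausted, skip this transition
                states[target] = new_dom
                queue.append(target)
            edges.append((current, event, target))
    return states, edges

def mirror_pages(states, edges):
    """Render each state as a static page, replacing events with plain links."""
    pages = {}
    for fp, dom in states.items():
        links = "".join(
            f'<a href="{t}.html">{e}</a>' for s, e, t in edges if s == fp
        )
        pages[f"{fp}.html"] = dom + links
    return pages

# Hypothetical stand-in for an AJAX application: DOM -> {event: resulting DOM}.
TOY_APP = {
    "<div>home</div>": {
        "click #news": "<div>news</div>",
        "click #about": "<div>about</div>",
    },
    "<div>news</div>": {
        "click #home": "<div>home</div>",
        "click #item1": "<div>news item 1</div>",
    },
    "<div>about</div>": {"click #home": "<div>home</div>"},
    "<div>news item 1</div>": {},
}

def toy_events(dom):
    return sorted(TOY_APP[dom])

def toy_fire(dom, event):
    return TOY_APP[dom][event]

states, edges = crawl_bfs("<div>home</div>", toy_events, toy_fire)
pages = mirror_pages(states, edges)
```

Here `states` and `edges` together form the state transition graph, and `pages` plays the role of the static mirror: each DOM state becomes one page whose links stand in for the events that led to other states.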