With the rapid development of the Internet, network resources have become increasingly abundant, and how to obtain information from the Web, especially the Deep Web, has become a focus of attention. Extracting information from the new generation of Ajax-based web pages has likewise become an active research topic. By analyzing the key technologies of a Deep Web crawler that supports Ajax, this paper proposes an architecture for such a crawler and describes an algorithm for automatically crawling Ajax websites, which lays the foundation for the design of the crawler's overall framework.