A SYSTEM FOR CRAWLING THE WEB AND EXTRACTING DESIGNATED DATA AND THE METHOD THEREFOR I.E. WEBHARVESTER
Abstract
The present invention discloses a system for crawling the Web and extracting designated data, and a method therefor, named WebHarvester. The system comprises: a computer system; a database configured in the computer system; templates residing in the computer system, each mapping the information in the target pages of one web site; fetch means for fetching web pages from the web sites and transferring the fetched pages to the computer system; filter means for scanning the fetched pages and extracting the necessary information from each page according to the template corresponding to its web site; and format and post means for converting the extracted information into a standard format and storing the formatted information in the database. The computer system is a server connected to the Internet.
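The components named in the abstract (templates, fetch means, filter means, format and post means, database) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how templates are represented, what the standard format is, or which database is used. The sketch assumes regex-based templates, a fixed four-column row as the standard format, and SQLite as the database. Every name in it (`TEMPLATES`, `fetch`, `filter_page`, `to_standard`, `post`, `harvest`, the `items` table, the `example.com` site key) is hypothetical.

```python
import re
import sqlite3
from urllib.request import urlopen

# Assumption: one template per web site, mapping a field name to a regex
# with a single capture group. The abstract does not specify this layout.
TEMPLATES = {
    "example.com": {
        "title": r"<h1>(.*?)</h1>",
        "price": r'<span class="price">(.*?)</span>',
    },
}


def open_database(path=":memory:"):
    """Database: create the table that holds the formatted records."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(site TEXT, url TEXT, title TEXT, price TEXT)"
    )
    return db


def fetch(url):
    """Fetch means: download one page and return it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def filter_page(html, template):
    """Filter means: extract the fields designated by the site's template."""
    record = {}
    for field, pattern in template.items():
        match = re.search(pattern, html, re.DOTALL)
        record[field] = match.group(1) if match else None
    return record


def to_standard(site, url, record):
    """Format means: collapse whitespace and emit a fixed row layout."""
    def clean(value):
        return " ".join(value.split()) if value is not None else None

    return (site, url, clean(record.get("title")), clean(record.get("price")))


def post(db, row):
    """Post means: store one formatted row in the database."""
    db.execute(
        "INSERT INTO items (site, url, title, price) VALUES (?, ?, ?, ?)", row
    )
    db.commit()


def harvest(db, site, urls, fetcher=fetch):
    """Run fetch, filter, format and post for each URL of one site."""
    template = TEMPLATES[site]
    for url in urls:
        html = fetcher(url)
        post(db, to_standard(site, url, filter_page(html, template)))
```

Calling `harvest(open_database(), "example.com", urls)` with a list of page URLs would run the whole pipeline. The `fetcher` parameter lets a caller substitute a stub for the network call, which keeps the extraction logic testable offline. Regular expressions are used here only to keep the sketch short; a real template would more likely address the parsed document structure.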