System and a method for focused re-crawling of Web sites

Abstract

A method (100) of crawling the Web (620) is disclosed. The method (100) crawls (120) Web pages on the Web starting from a given (110) set of seed Universal Resource Locators (URLs). Crawled Web pages are partitioned (140) into sets of relevant and irrelevant pages. A set of exclusion and/or inclusion patterns are discovered (150) from the sets of relevant and irrelevant pages, and subsequent crawling of the Web is restricted through the set of exclusion and/or inclusion patterns.
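The steps named in the abstract (crawl from seeds, partition the pages, discover patterns, restrict the next crawl) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the in-memory `TOY_WEB`, the keyword-based relevance test, and the rule that an exclusion pattern is a first-level directory prefix seen only among irrelevant pages are all hypothetical choices made for this example. The abstract does not say how relevance is judged or how patterns are discovered.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://example.com/": ("home", ["http://example.com/docs/a",
                                     "http://example.com/ads/1"]),
    "http://example.com/docs/a": ("crawler tutorial", ["http://example.com/docs/b",
                                                       "http://example.com/ads/2"]),
    "http://example.com/docs/b": ("crawler reference", []),
    "http://example.com/ads/1": ("buy now", ["http://example.com/ads/2"]),
    "http://example.com/ads/2": ("sale", []),
}


def crawl(seeds, web, exclusions=()):
    """Breadth-first crawl from the seeds, skipping URLs that match an exclusion prefix."""
    seen, pages = set(), {}
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if url in seen or url not in web:
            continue
        seen.add(url)
        if any(url.startswith(prefix) for prefix in exclusions):
            continue  # restricted by a discovered pattern: neither fetched nor expanded
        text, links = web[url]
        pages[url] = text
        queue.extend(links)
    return pages


def partition(pages, is_relevant):
    """Split crawled URLs into (relevant, irrelevant) sets using a caller-supplied test."""
    relevant = {url for url, text in pages.items() if is_relevant(text)}
    return relevant, set(pages) - relevant


def directory_prefix(url):
    """Return scheme://host/first-directory/ for a URL, or None if it has no directory."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        return None
    return f"{parts.scheme}://{parts.netloc}/{segments[0]}/"


def discover_exclusions(relevant, irrelevant):
    """Assumed rule: exclude directory prefixes that occur only among irrelevant pages."""
    bad = {directory_prefix(url) for url in irrelevant} - {None}
    good = {directory_prefix(url) for url in relevant} - {None}
    return bad - good


seeds = ["http://example.com/"]
first_pass = crawl(seeds, TOY_WEB)
relevant, irrelevant = partition(first_pass, lambda text: "crawler" in text)
exclusions = discover_exclusions(relevant, irrelevant)
second_pass = crawl(seeds, TOY_WEB, exclusions)  # the focused re-crawl
print(sorted(exclusions))
print(sorted(second_pass))
```

In this toy setup the first pass fetches every page, while the re-crawl skips anything under the excluded directory. Inclusion patterns could be handled symmetrically by allowing only URLs that match a discovered prefix.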

Bibliographic details

Publication number: US2007143263A1

Publication date: 2007-06-21

Original format: PDF

Applicant/Patentee: NEERAJ AGRAWAL; SREERAM VISWANATH BALAKRISHNAN; SACHINDRA JOSHI

Application number: US20050314432

Inventors: NEERAJ AGRAWAL; SACHINDRA JOSHI; SREERAM VISWANATH BALAKRISHNAN

Filing date: 2005-12-21

Classification: G06F17/30

Country: US

Indexed: 2022-08-21 21:05:15

Similar documents

Patents

1. A climbing-vine net that is convenient to prune, and a method for preparing the climbing-vine net [P]. Chinese patent: CN105724110B8. 2019-01-04

2. A climbing-vine net that is convenient to prune, and a method for preparing the climbing-vine net [P]. Chinese patent: CN105724110B. 2018-08-24

3. System and method for focused re-crawling of web sites [P]. US patent: US7882099B2. 2011-02-01

4. System and a method for focused re-crawling of Web sites [P]. US patent: US7379932B2. 2008-05-27

5. SYSTEM AND METHOD FOR FOCUSED RE-CRAWLING OF WEB SITES [P]. US patent: US2008168041A1. 2008-07-10

Foreign-language literature

1. ON ANALYSIS METHOD FOR INFRARED MULTI-SITES SYSTEM PERFORMANCE [J]. Cao Zhengwen, Luo Rui, Peng Jinye. Journal of Electronics (English edition). 2006, No. 1

2. Measuring and Improving Web Site Quality: A Consumer Focused System [C]. John M. Ryan. International Internet Software Quality Week. 2002

3. Agile and context-adaptable methodology for the holistic and systematic evaluation of Web site quality [D]. Perallos Ruiz, Asier. 2007

4. Barriers and opportunities for breast cancer organizations to focus on environmental health and disease prevention: a mixed-methods approach using website analyses, interviews and focus groups [O]. Jennifer Liss Ohayon, Eric Nost, Kami Silk. 2020

5. E-commerce web site evaluation: developing a framework and method for the systematic evaluation of e-commerce web sites and using correspondence analysis to represent the results graphically per industry [O]. Van der Merwe, Rian. 2001

Chinese-language literature

1. Dynamics of Structural-Inhomogeneous Laminate and Shell Mechanical Systems with Point Constraints and Focused Masses. Part 2. Statement of the Problem of Forced Oscillations, Methods of Solution, Computational Algorithm and Numerical Results [J]. Mirsaidov Mirziyod, Safarov Ismoil Ibrokhimovich, Teshaev Mukhsin Khudoyberdievich. Journal of Applied Mathematics and Physics (English). 2019, No. 11

2. A New Analysis Method for Locating the Focus and for Estimating the Size of the Focus of the Backscatter Light of a LIDAR System [J]. Nianwen Cao, Weiyuan Wang, Yonghua Wu. Journal of Electromagnetic Analysis and Applications (English). 2010, No. 3

3. Depth Estimates of Buried Utility Systems Using the GPR Method: Studies at the IAG/USP Geophysics Test Site [J]. Bruno Poluha, Jorge Luís Porsani, Emerson Rodrigo Almeida. International Journal of Geosciences (English). 2017, No. 5

4. Evaluation Method of Web Site Structure Based on Web Structure Mining [J]. Li June (Computer Center, Wuhan University, Wuhan 430072, Hubei, China), Zhou Dongu (School of Computer, Wuhan University, Wuhan 430072, Hubei, China). Wuhan University Journal of Natural Sciences (English edition). 2003, No. 03A

5. Exploratory application of an intelligent heating-network dispatch system in the transformation and upgrading of urban district heating systems [C]. Li Zhengguang. 19th Annual Meeting of the China Association for Science and Technology (2017). 2017

6. A website accessibility testing system based on incremental crawling and evaluation of non-text content [A]. Xu Feng. 2014