TASK-CRAWLING SYSTEM AND TASK-CRAWLING METHOD FOR DISTRIBUTED CRAWLER SYSTEM

Abstract

A task-crawling system for a distributed crawler system includes a controlling end, a crawling end, and a task monitoring module. The crawling end acquires a task and sends the task's data to the controlling end. The controlling end assigns a number to the task, defines a timeout period for it, generates a task-distribution event, and stores the timestamp of the task's distribution. The controlling end then distributes the task-distribution event to the task monitoring module and the crawling end. The crawling end executes the corresponding crawling logic for the crawl task and reports completion of the task to the controlling end. If an abnormality prevents the crawl task from being performed properly, the task monitoring module re-pushes the task to the controlling end, thereby avoiding task failures otherwise caused by web-related problems.
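
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`ControllingEnd`, `TaskMonitor`, `TaskRecord`), the in-memory `pending` table, the listener callbacks standing in for task-distribution events, and the choice to give a re-pushed task a fresh number are all assumptions made for illustration. A crawling end would register as one more listener and call `complete` when it finishes.

```python
import itertools
import time
from dataclasses import dataclass


@dataclass
class TaskRecord:
    task_id: int           # number assigned by the controlling end
    payload: str           # task data, e.g. a URL (hypothetical shape)
    timeout: float         # seconds allowed before the task counts as stalled
    distributed_at: float  # timestamp stored when the task was distributed


class ControllingEnd:
    """Numbers tasks, sets a timeout, stores the distribution timestamp."""

    def __init__(self, default_timeout=30.0, clock=time.monotonic):
        self._ids = itertools.count(1)
        self._clock = clock
        self.default_timeout = default_timeout
        self.pending = {}    # task_id -> TaskRecord, tasks not yet completed
        self.listeners = []  # callables that receive each distribution event

    def submit(self, payload):
        record = TaskRecord(next(self._ids), payload,
                            self.default_timeout, self._clock())
        self.pending[record.task_id] = record
        for listener in self.listeners:  # monitor and crawling end
            listener(record)
        return record

    def complete(self, task_id):
        # Called by the crawling end when crawling finished normally.
        self.pending.pop(task_id, None)


class TaskMonitor:
    """Watches distributed tasks and re-pushes the ones that time out."""

    def __init__(self, controller, clock=time.monotonic):
        self.controller = controller
        self._clock = clock
        self.watched = {}
        controller.listeners.append(self.on_distributed)

    def on_distributed(self, record):
        self.watched[record.task_id] = record

    def check(self):
        now = self._clock()
        repushed = []
        for task_id, record in list(self.watched.items()):
            if task_id not in self.controller.pending:
                del self.watched[task_id]  # completed, stop watching
            elif now - record.distributed_at > record.timeout:
                del self.watched[task_id]
                self.controller.complete(task_id)  # drop the stalled entry
                # Assumption: a re-pushed task is numbered again from scratch.
                repushed.append(self.controller.submit(record.payload))
        return repushed
```

Calling `check()` periodically (for example from a timer thread) is what lets a task lost to a network fault be distributed again instead of silently failing. The injectable `clock` argument exists only to make the timeout logic testable without sleeping.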

Bibliographic data

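  • Publication number: US2017068735A1

  • Patent type:

  • Publication date: 2017-03-09

  • Original format: PDF

  • Applicant/Assignee: MOLBASE (SHANGHAI) BIOTECHNOLOGY CO. LTD.

  • Application number: US201615171488

  • Inventor: GUO QIANG ZHANG

  • Filing date: 2016-06-02

  • Classification: G06F17/30; H04L29/08

  • Country: US

  • Database entry time: 2022-08-21 13:46:36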
