
WEB BROWSER-BASED SCRAPING SYSTEM AND METHOD


Abstract

The present invention relates to a web browser-based scraping system and method. The system includes: a web browser, installed on a client device, which receives a scraping request from a user and stores a user certificate that enables a connection to a target server; a scraping engine which receives the scraping request from the web browser; a relay server which generates a telegram meeting the target server's requirements for scraping; a plurality of security gateways, each with a different IP, which receive the scraping request and the telegram from the relay server, connect to the target server from an IP different from that of the relay server, perform the scraping, and deliver the scraped data to the relay server; and a scraping management server (SMS) which manages the IPs and operating states of the security gateways. When operating, each security gateway transmits its own IP information and ready-for-use state to the scraping management server. Upon receiving the scraping request from the web browser, the relay server requests the IP of a security gateway from the scraping management server, which selects one of the security gateways and delivers its IP to the relay server. The scraping engine delivers to the relay server scraping request information that includes a service script and the client's certificate information.
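The gateway registration and selection flow described in the abstract can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the `report` and `select_gateway` methods, the dictionary-shaped telegram, and the example IP addresses are all hypothetical. The selection policy (first ready gateway in registration order) is also an assumption, since the abstract only states that the scraping management server selects one gateway.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ScrapingManagementServer:
    """Hypothetical model of the SMS: tracks each gateway's IP and ready state."""
    gateways: Dict[str, bool] = field(default_factory=dict)  # ip -> ready?

    def report(self, ip: str, ready: bool) -> None:
        # A security gateway announces its own IP and ready-for-use state.
        self.gateways[ip] = ready

    def select_gateway(self) -> Optional[str]:
        # Assumed policy: return the first ready gateway in registration order.
        for ip, ready in self.gateways.items():
            if ready:
                return ip
        return None


@dataclass
class RelayServer:
    """Hypothetical relay server that asks the SMS for a gateway IP."""
    ip: str
    sms: ScrapingManagementServer

    def handle_request(self, service_script: str, certificate_info: str) -> dict:
        gateway_ip = self.sms.select_gateway()
        if gateway_ip is None:
            raise RuntimeError("no security gateway is ready")
        # Placeholder telegram; a real one must meet the target server's format.
        telegram = {"script": service_script, "certificate": certificate_info}
        return {"gateway_ip": gateway_ip, "telegram": telegram}


# Usage: two gateways report their state, then the relay server plans a request.
sms = ScrapingManagementServer()
sms.report("203.0.113.10", ready=False)
sms.report("203.0.113.11", ready=True)
relay = RelayServer(ip="198.51.100.5", sms=sms)
plan = relay.handle_request("balance_inquiry", "cert-info")
print(plan["gateway_ip"])  # 203.0.113.11
```

In this sketch the target server would see the selected gateway's IP rather than the relay server's, which is the separation the abstract describes.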

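Bibliographic data

  • Publication number: WO2020040556A1
  • Publication date: 2020-02-27
  • Original format: PDF
  • Applicant/patentee: FINGER INC.
  • Application number: WO2019KR10664
  • Inventor: PARK YOUNG JUN
  • Filing date: 2019-08-22
  • Classification: G06F16/951; G06F21/33; H04L29/08; H04L29/06; H04L12/66
  • Country: WO
  • Database entry time: 2022-08-21 11:13:19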
