METHOD, APPARATUS AND SYSTEM FOR EXTRACTING WEBPAGE CONTENT

Abstract

The present disclosure relates to a method, an apparatus and a system for extracting webpage content. The method includes: obtaining a webpage in response to a webpage browsing instruction triggered on a browser by a mobile client; parsing the webpage to obtain a DOM node of a tag in a webpage script; obtaining a plug-in tag node from the DOM node; and, when the plug-in tag corresponding to the plug-in tag node is a tag of a predetermined type, extracting the plug-in resource that corresponds to the plug-in tag. The disclosure can extract content that complies with a specific protocol specification before the webpage has actually been rendered, which speeds up both the extraction of predetermined webpage content and the display of the webpage. In addition, because the plug-in resource is extracted on the browser terminal side without relying on a background server, the solution is technically easy to implement and can reduce development costs.
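
The steps in the abstract (parse the page, find plug-in tag nodes, check the tag type, extract the resource) can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that "plug-in tags" are `embed` and `object` elements, that the "predetermined type" is decided by the `type` attribute, and that the resource is the URL in `src` or `data`. The names `PLUGIN_TAGS`, `PREDETERMINED_TYPES`, `PluginResourceExtractor` and `extract_plugin_resources` are hypothetical. Parsing uses Python's standard `html.parser`, which tokenizes markup without rendering anything.

```python
from html.parser import HTMLParser

# Assumption: the plug-in tags and the "predetermined" MIME types below are
# illustrative choices; the abstract does not name them.
PLUGIN_TAGS = {"embed": "src", "object": "data"}  # tag -> attribute holding the resource URL
PREDETERMINED_TYPES = {"application/x-shockwave-flash", "video/mp4"}


class PluginResourceExtractor(HTMLParser):
    """Collects plug-in resources while the markup is parsed, before any rendering."""

    def __init__(self):
        super().__init__()
        self.resources = []  # list of (tag, mime_type, url) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        url_attr = PLUGIN_TAGS.get(tag)
        if url_attr is None:
            return  # not a plug-in tag node
        attr_map = dict(attrs)
        mime = (attr_map.get("type") or "").lower()
        url = attr_map.get(url_attr)
        # Extract only when the plug-in tag is of a predetermined type.
        if mime in PREDETERMINED_TYPES and url:
            self.resources.append((tag, mime, url))


def extract_plugin_resources(html):
    """Parse an HTML string and return the matching plug-in resources in document order."""
    parser = PluginResourceExtractor()
    parser.feed(html)
    parser.close()
    return parser.resources
```

Calling `extract_plugin_resources(page_source)` on the fetched markup returns the matching resources in document order, so a client could start loading them before layout or script execution begins.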

Bibliographic data

Publication number: WO2015127882A1

Patent type:
Publication date: 2015-09-03

Original format: PDF
Applicant/Patentee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Application number: WO2015CN73167
Inventors: GUO XINHUA; SU KE; MA NING; WANG JINGYAO

Filing date: 2015-02-16
Classification: G06F17/30
Country: WO
Database entry time: 2022-08-21 15:04:38
