Method and apparatus for extracting web page content

Abstract

Methods and apparatus for extracting web page content are provided herein. An exemplary method can be implemented by a mobile terminal. A request command to open a first web page is received, and it is determined whether the source code corresponding to the first web page contains text content tags. When it does, a reader extracts the text content of the first web page enclosed within the text content tags. When it does not, a start position and an end position indicating the text content of the first web page are identified in the source code, and the text content tags are added after the start position and before the end position, respectively. The text content enclosed within the text content tags is then extracted.
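
The two branches described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tag names `<text-content>` and `</text-content>` are hypothetical (the abstract does not say what the text content tags look like), and the rule used to locate the start and end positions (first `<p>` to last `</p>`) is an assumed stand-in for whatever the patent actually uses.

```python
import re

# Hypothetical tag names; the abstract does not specify the real ones.
OPEN_TAG = "<text-content>"
CLOSE_TAG = "</text-content>"


def find_text_bounds(source):
    """Assumed heuristic: the text runs from the first <p> to the last </p>.

    Returns (start, end) offsets into source, or None if no paragraph is found.
    """
    first = re.search(r"<p[\s>]", source, flags=re.IGNORECASE)
    last_end = source.lower().rfind("</p>")
    if first is None or last_end < first.start():
        return None
    return first.start(), last_end + len("</p>")


def add_text_tags(source):
    """Insert the opening tag at the start offset and the closing tag at the end offset."""
    bounds = find_text_bounds(source)
    if bounds is None:
        return source  # nothing recognisable as text content; leave unchanged
    start, end = bounds
    return source[:start] + OPEN_TAG + source[start:end] + CLOSE_TAG + source[end:]


def extract_tagged_text(source):
    """Return the text enclosed by the tags with inner markup stripped, or None."""
    begin = source.find(OPEN_TAG)
    if begin == -1:
        return None
    finish = source.find(CLOSE_TAG, begin + len(OPEN_TAG))
    if finish == -1:
        return None
    inner = source[begin + len(OPEN_TAG):finish]
    text = re.sub(r"<[^>]+>", " ", inner)  # drop any remaining markup
    return " ".join(text.split())          # collapse whitespace


def extract_page_text(source):
    """If the tags are missing, add them first; then extract what they enclose."""
    if OPEN_TAG not in source or CLOSE_TAG not in source:
        source = add_text_tags(source)
    return extract_tagged_text(source)
```

Under these assumptions, a page that already carries the tags is read directly, while an untagged page is tagged first and then passed through the same extraction path, so the reader only ever needs to handle one case.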

Bibliographic data

Publication number: US9934206B2

Publication date: 2018-04-03

Original format: PDF
Applicant/Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Application number: US201414341446
Inventors: TINGYONG TANG; YULEI LIU; WEI LI; XI WANG; BO HU; KAI ZHANG; BOSEN HE; YING HUANG; HUIJIAO YANG; ZHENGKAI XIE; ZHIPEI WANG; CHENG FENG; SIRUI LIU

Filing date: 2014-07-25
Classification: G06F17/22; G06F3/0484; G06F3/0488; G06F17/30
Country: US
Date indexed: 2022-08-21 12:55:30
