
Method, Server and Device for Extracting a Body and a Title of Content of a Web Page



Abstract

Technologies are provided for extracting the generic body and the title of an article displayed on a web page. A web page may display the article together with additional content such as advertisements, images, and links. A user may choose to view the article in a reading view application without the additional content, and the reading view application may extract the body and the title from the web page. Candidate titles may be selected by identifying meta tags associated with the title and removing website names from those meta tags. Candidate bodies may be selected by identifying clusters of nodes based on text font size and depth in the Document Object Model hierarchy of the web page. The cluster most likely to be the body may then be selected as the best body candidate.
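The two selection steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `TextNode` record, the separator list, and the scoring rule (total text length per cluster) are assumptions made for illustration, and the font size and depth values are assumed to have been computed already by a rendering engine or DOM parser.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TextNode:
    """Hypothetical record for one text-bearing DOM node."""
    text: str
    font_size: float  # rendered font size, assumed to be precomputed
    depth: int        # depth of the node in the DOM hierarchy


def clean_title(meta_title: str, site_name: str) -> str:
    """Remove a website-name segment from a title taken from a meta tag.

    The separators below are an assumption; real pages use many styles.
    """
    for sep in (" | ", " - ", " :: "):
        parts = [p.strip() for p in meta_title.split(sep)]
        kept = [p for p in parts if p.lower() != site_name.lower()]
        # Only accept the split if it actually removed the site name.
        if kept and len(kept) < len(parts):
            return sep.join(kept)
    return meta_title.strip()


def best_body_cluster(nodes: list[TextNode]) -> list[TextNode]:
    """Group nodes by (font size, DOM depth) and return the largest cluster.

    Scoring by total text length is an illustrative assumption; the
    abstract does not state how the best cluster is chosen.
    """
    clusters: dict[tuple[float, int], list[TextNode]] = defaultdict(list)
    for node in nodes:
        clusters[(node.font_size, node.depth)].append(node)
    if not clusters:
        return []
    return max(clusters.values(), key=lambda c: sum(len(n.text) for n in c))
```

For example, `clean_title("How to Bake Bread | Example News", "Example News")` drops the site name and keeps the article title, and `best_body_cluster` returns the paragraphs that share a font size and depth and together carry the most text, which in a typical article layout would be the body rather than navigation links or advertisements.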


Bibliographic data

Publication No.: AR097694A1
Publication date: 2016-04-06
Original format: PDF
Applicant/Patentee: MICROSOFT TECHNOLOGY LICENSING LLC
Application No.: AR2014P103468
Inventors: YANTI ARUSWATI GOUW; RAMAN NARAYANAN; RUIHUA SONG; GUANGPING GAO; SHELLEY SUMMER GU; QIAN ZHANG; MING LIU
Filing date: 2014-09-18
Classification: G06F19; G06F17/30
Country: AR
Database entry time: 2022-08-21 14:27:37
