Method, Server and Device for Extracting the Body and Title of Web Page Content
Abstract
Technologies are provided for extracting the main body and the title of an article displayed on a Web page. Besides the article itself, a Web page can display content such as advertisements, images, and links to other parts of the website. A user can choose to view the article in a reading application without this additional content, and the reading implementation can extract the body and the title from the Web page. Candidate titles can be selected by identifying meta tags associated with the title and removing website names from those meta tags. Candidate bodies can be selected by identifying clusters of nodes on the basis of text font size and depth in the Document Object Model hierarchy of the Web page. The cluster most likely to be the body can then be selected as the best body candidate.
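The two selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `TextNode` record, the helper names `clean_title` and `pick_body`, the title separators (`|` and ` - `), and the choice to score a cluster by its total text length are all assumptions made for illustration. The abstract only states that nodes are clustered by font size and DOM depth and that website names are removed from title meta tags.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TextNode:
    """Hypothetical record for one text-bearing DOM node."""
    text: str
    font_size: int  # rendered font size in px, assumed to be known already
    depth: int      # depth of the node in the DOM hierarchy


def clean_title(meta_title: str, site_name: str) -> str:
    """Remove the website name from a title taken from a meta tag.

    Assumes the site name is set off by "|" or " - ".
    """
    parts = [p.strip() for p in meta_title.replace(" - ", " | ").split("|")]
    kept = [p for p in parts if p and p.lower() != site_name.lower()]
    return " | ".join(kept)


def pick_body(nodes: list[TextNode]) -> str:
    """Cluster nodes by (font size, depth) and return the largest cluster.

    "Largest" means most total characters, which is an assumed scoring rule.
    """
    if not nodes:
        return ""
    clusters = defaultdict(list)
    for node in nodes:
        clusters[(node.font_size, node.depth)].append(node.text)
    best = max(clusters.values(), key=lambda texts: sum(len(t) for t in texts))
    return "\n".join(best)
```

For example, given two 16 px paragraphs at depth 4 and a few 12 px navigation or advertisement strings at other depths, `pick_body` returns the two paragraphs joined by a newline, and `clean_title("Example News | How to Extract Text", "Example News")` returns the title without the site name.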