Informative Content Extraction By Using Eifce [Effective Informative Content Extractor]

Chaw Su Win; Mie Mie Su Thwin




Abstract: Web pages contain many items that cannot be classified as informative content, e.g., search and filtering panels, navigation links, and advertisements. Most clients and end users are looking for the informative content and have little interest in the rest, so the need for Informative Content Extraction from web pages is evident. Web Informative Content Extraction requires two steps: Web Page Segmentation and Informative Content Extraction. DOM-based Segmentation Approaches often fail to provide satisfactory results, and Vision-based Segmentation Approaches also have drawbacks. This paper therefore proposes the Effective Visual Block Extractor (EVBE) Algorithm to overcome the problems of DOM-based Approaches and to reduce the drawbacks of previous work in Web Page Segmentation. It also proposes the Effective Informative Content Extractor (EIFCE) Algorithm to reduce the drawbacks of previous work in Web Informative Content Extraction. Web page indexing systems, web page classification and clustering systems, and web information extraction systems can achieve significant savings and satisfactory results by applying the proposed algorithms.


Bibliographic record

Source: International Journal of Scientific & Technology Research, 2013, Issue 6, 9 pages
Authors: Chaw Su Win; Mie Mie Su Thwin
Original format: PDF
Chinese Library Classification: Basic engineering sciences
Similar literature

1. Effective Web Page Segmentation Techniques for Informative Content Extraction: Review [J]. Sekhar Babu Boddu, Kurra Rajasekhara Rao, Lakshmi Prasanna Pasupuleti. International Journal of Applied Engineering Research. 2014, Issue 5
2. Optimal Web Page Classification Technique Based on Informative Content Extraction and FA-NBC [J]. A. M. James Raj, F. Sagayraj Francis, P. Julian Benadit. Computer Science and Engineering. 2016, Issue 1
3. A hybrid approach for extracting informative content from web pages [J]. Erdinc Uzun, Hayri Volkan Agun, Tarik Yerlikaya. Information Processing & Management. 2013, Issue 4
4. Entropy based informative content density approach for efficient web content extraction [C]. Manjusha Annam, G P Sajeev. International conference on advances in computing, communications and informatics. 2016
5. #Gatlinburg: Examining Affective and Informative Social Media Content during the 2016 Gatlinburg Wildfires [D]. Staggs, Kathryn Baker. 2018
6. Screening of Six Medicinal Plant Extracts Obtained by Two Conventional Methods and Supercritical CO2 Extraction Targeted on Coumarin Content, 2,2-Diphenyl-1-picrylhydrazyl Radical Scavenging Capacity and Total Phenols Content [O]. Maja Molnar, Igor Jerković, Dragica Suknović. 2017
7. Extracting Informative Textual Parts from Web Pages Containing User-Generated Content [O]. Georgios Katsimpras, Efstathios Stamatatos. 2013
