International Journal of Scientific & Technology Research

Informative Content Extraction By Using Eifce [Effective Informative Content Extractor]

Abstract: Web pages contain many items that are not 'informative content', e.g., search and filtering panels, navigation links, and advertisements. Most clients and end users are looking for the informative content and largely ignore the rest, so there is a clear need for Informative Content Extraction from web pages. Web Informative Content Extraction requires two steps: Web Page Segmentation and Informative Content Extraction. DOM-based segmentation approaches often fail to provide satisfactory results, and vision-based segmentation approaches also have drawbacks. This paper therefore proposes the Effective Visual Block Extractor (EVBE) algorithm to overcome the problems of DOM-based approaches and reduce the drawbacks of previous work in Web Page Segmentation. It also proposes the Effective Informative Content Extractor (EIFCE) algorithm to reduce the drawbacks of previous work in Web Informative Content Extraction. Web page indexing systems, web page classification and clustering systems, and web information extraction systems can achieve significant savings and satisfactory results by applying the proposed algorithms.
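
The two-step pipeline named in the abstract (segment the page into blocks, then keep the informative ones) can be illustrated with a minimal sketch. This is not the EVBE or EIFCE algorithm, whose details the abstract does not give. It is a simple DOM-based baseline of the kind the paper sets out to improve on. The block boundary tags, the link-density score, the 0.5 threshold, and every name in the code are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: these tags mark block boundaries. The paper's EVBE rules are not given.
BLOCK_TAGS = {"div", "section", "article", "nav", "aside", "header", "footer", "main"}


class BlockSegmenter(HTMLParser):
    """Step 1 (segmentation): split a page into flat text blocks at block-level tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # finished blocks, in document order
        self._text = []        # text pieces of the block being built
        self._link_chars = 0   # characters of the current block that sit inside <a>
        self._link_depth = 0   # > 0 while the parser is inside an <a> element

    def flush(self):
        """Close the current block and store it if it contains any text."""
        content = " ".join(self._text).strip()
        if content:
            self.blocks.append(
                {"content": content, "chars": len(content), "link_chars": self._link_chars}
            )
        self._text = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a":
            self._link_depth += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()
        elif tag == "a" and self._link_depth > 0:
            self._link_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if not text:
            return
        self._text.append(text)
        if self._link_depth:
            self._link_chars += len(text)


def segment(html):
    """Return the list of text blocks found in an HTML string."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    parser.flush()  # store any trailing text after the last block tag
    return parser.blocks


def informative_blocks(blocks, max_link_density=0.5):
    """Step 2 (extraction): keep blocks whose text is mostly not link text."""
    return [b["content"] for b in blocks
            if b["link_chars"] / b["chars"] <= max_link_density]


# Hypothetical sample page: a navigation bar, one content block, one advertisement.
PAGE = """
<nav><a href="/">Home</a> <a href="/news">News</a> <a href="/about">About</a></nav>
<div><p>Web pages mix the main article with navigation links and advertisements.</p></div>
<aside><a href="/ad">Buy now</a></aside>
"""
```

Calling `informative_blocks(segment(PAGE))` keeps only the paragraph inside the `div`, because the `nav` and `aside` blocks consist almost entirely of link text. A heuristic this simple misjudges pages whose layout is not reflected in the tag structure, which is the kind of weakness in DOM-based approaches that the abstract refers to.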
