Conference: International Conference on Tools with Artificial Intelligence

A New Vision-Based Method for Extracting Academic Information from Conference Web Pages


Abstract

This paper proposes a new vision-based method for extracting academic information from conference Web pages. The main contributions are: (1) A new vision-based page segmentation algorithm is proposed to improve on the results of the classical VIPS algorithm. This algorithm divides pages into text blocks. (2) All text blocks are classified into 10 categories according to vision features, keyword features, and text content features. The initial classification results have 75% precision and 67% recall. (3) The context information of the text blocks is employed to repair and refine the initial classification results, which improve to 96% precision and 98% recall. Finally, academic information is extracted from the classified text blocks. Our experimental results on real-world datasets show that the proposed method is effective and efficient for extracting academic information from conference Web pages.
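The abstract does not say how context is used to repair the initial labels, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each text block carries an initial label and a classifier confidence, and it relabels a low-confidence block when both of its neighbours agree. The `TextBlock` type, the label names, the `threshold` value, and the neighbour-agreement rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    label: str         # initial category from a per-block classifier (hypothetical names)
    confidence: float  # classifier confidence in [0, 1] (assumed to be available)


def refine_with_context(blocks, threshold=0.6):
    """Relabel low-confidence blocks whose two neighbours share a label.

    This is an assumed, simplified rule; the paper's actual repair
    procedure is not described in the abstract.
    """
    refined = [b.label for b in blocks]
    # The first and last blocks have only one neighbour and are left as-is.
    for i in range(1, len(blocks) - 1):
        prev_label = blocks[i - 1].label
        next_label = blocks[i + 1].label
        if blocks[i].confidence < threshold and prev_label == next_label:
            refined[i] = prev_label
    return refined


blocks = [
    TextBlock("Important Dates", "date", 0.95),
    TextBlock("Submission: March 1", "date", 0.90),
    TextBlock("Notification: to be announced", "other", 0.40),
    TextBlock("Camera-ready: May 15", "date", 0.88),
    TextBlock("Program Committee", "committee", 0.97),
]
print(refine_with_context(blocks))
```

In this toy input, the third block is initially labelled "other" with low confidence, and because the blocks on either side are both labelled "date", the sketch reassigns it to "date". Confident blocks are never changed.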
