Text, photo, and line extraction in scanned documents

Abstract

We propose a page layout analysis algorithm to classify a scanned document into different regions such as text, photo, or strong lines. The proposed scheme consists of five modules. The first module performs several image preprocessing techniques such as image scaling, filtering, color space conversion, and gamma correction to enhance the scanned image quality and reduce the computation time in later stages. Text detection is applied in the second module wherein wavelet transform and run-length encoding are employed to generate and validate text regions, respectively. The third module uses a Markov random field based block-wise segmentation that employs a basis vector projection technique with maximum a posteriori probability optimization to detect photo regions. In the fourth module, methods for edge detection, edge linking, line-segment fitting, and Hough transform are utilized to detect strong edges and lines. In the last module, the resultant text, photo, and edge maps are combined to generate a page layout map using K-Means clustering. The proposed algorithm has been tested on several hundred documents that contain simple and complex page layout structures and contents such as articles, magazines, business cards, dictionaries, and newsletters, and compared against state-of-the-art page-segmentation techniques with benchmark performance. The results indicate that our methodology achieves an average of ~89% classification accuracy in text, photo, and background regions.
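As a rough illustration of two steps named in the abstract, the sketch below builds a gamma-correction lookup table (first module) and run-length encodes a binarized scan line (second module). The abstract does not say how run lengths are used to validate text regions, so the `looks_like_text` rule, its thresholds, and all function names here are hypothetical assumptions rather than the authors' method.

```python
from itertools import groupby


def gamma_lut(gamma, levels=256):
    """Lookup table mapping each 8-bit intensity to its gamma-corrected value."""
    top = levels - 1
    return [round(top * (v / top) ** (1.0 / gamma)) for v in range(levels)]


def run_lengths(row):
    """Run-length encode a binary row as (value, length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


def looks_like_text(row, max_ink_run=6, min_ink_runs=3):
    """Hypothetical rule (an assumption, not from the paper):
    a text-like row contains several ink runs, all of them short."""
    ink = [length for value, length in run_lengths(row) if value == 1]
    return len(ink) >= min_ink_runs and max(ink) <= max_ink_run


# 1 = ink, 0 = background. A text-like row alternates in short runs,
# while a ruled line produces one long ink run.
text_row = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0]
rule_row = [0] + [1] * 11 + [0]

lut = gamma_lut(2.2)
print(run_lengths(text_row))
print(looks_like_text(text_row), looks_like_text(rule_row))
```

With these toy thresholds the alternating row is accepted and the solid ruled line is rejected; real scans would need thresholds tuned to the scaled resolution produced by the preprocessing stage.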

Bibliographic record

  • Source
    Journal of Electronic Imaging | 2012, Issue 3 | 033006.1-033006.18 | 18 pages
  • Author affiliations

    University College London Department of Electronic and Electrical Engineering Optical Networks Group Torrington Place London WC1E 7JE, United Kingdom;

    IPPLEX Holdings Corporation Santa Monica California 90025;

    Rochester Institute of Technology Department of Electrical and Microelectronic Engineering Rochester, New York 14623;

    Hewlett-Packard Corporation Imaging Asset Team Boise, Idaho 83714;

    Hewlett-Packard Corporation Imaging Asset Team Boise, Idaho 83714;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification:
  • Keywords:

  • Date added: 2022-08-18 01:17:45
