Venue: International Workshop on Document Analysis Systems

Automatic Indexing of Newspaper Microfilm Images

Abstract

This paper describes a proposed document analysis system that aims at automatic indexing of digitized images of old newspaper microfilms. This is done by extracting news headlines from microfilm images. The headlines are then converted to machine-readable text by OCR to serve as indices to the respective news articles. A major challenge is the poor image quality of the microfilm, as most images are inadequately illuminated and considerably dirty. To overcome the problem, we propose a new method for separating characters from the noisy background, since conventional threshold selection techniques are inadequate for these kinds of images. A Run Length Smearing Algorithm (RLSA) is then applied to the headline extraction. Experimental results confirm the validity of the approach.
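
The abstract names RLSA but gives no parameters or implementation details, so the following is only a minimal sketch of the textbook form of run-length smearing, not the authors' method. It assumes a binary image stored as nested lists with 1 for foreground (ink) and 0 for background, and it assumes that only background runs bounded by foreground on both sides are filled. The function names and the gap thresholds `h_gap` and `v_gap` are hypothetical.

```python
def smear_row(row, max_gap):
    """Fill background runs (0s) of length <= max_gap that lie
    between two foreground pixels (1s). Border runs are left alone
    (an assumption; some RLSA variants also smear at the borders)."""
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= max_gap:
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


def rlsa_horizontal(image, max_gap):
    """Smear every row independently."""
    return [smear_row(row, max_gap) for row in image]


def rlsa_vertical(image, max_gap):
    """Smear every column by transposing, smearing rows, and transposing back."""
    columns = [list(col) for col in zip(*image)]
    smeared = [smear_row(col, max_gap) for col in columns]
    return [list(row) for row in zip(*smeared)]


def rlsa(image, h_gap, v_gap):
    """Combine the horizontal and vertical smears with a pixel-wise AND,
    so that only pixels filled in both directions remain foreground."""
    horizontal = rlsa_horizontal(image, h_gap)
    vertical = rlsa_vertical(image, v_gap)
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]
```

In a pipeline like the one described, smearing would merge the characters of a line into solid blocks, and large blocks could then be treated as headline candidates. How the blocks are classified as headlines is not stated in the abstract, so that step is left out here.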