
Degraded gray-scale document recognition using pseudo two-dimensional hidden Markov models and N-best hypotheses


Abstract

The present invention provides a method for recognizing connected and degraded text embedded in a gray-scale image. In accordance with the invention, pseudo two-dimensional hidden Markov models (PHMMs) are used to represent characters. Observation vectors for the gray-scale image are produced from pixel maps obtained by gray-scale optical scanning. Three components are employed to characterize a pixel: a convolved, quantized gray-level component, a pixel relative position component, and a pixel major stroke direction component. These components are organized as an observation vector, which is continuous in nature, invariant across font sizes, and flexible for use in various quantization processes. In this manner, information loss or distortion due to binarization is eliminated; moreover, where documents are binary in nature (e.g., faxed documents), the bi-level image may be compressed by subsampling into a multi-level (gray-level) image without losing information, thereby enabling recognition of the compressed images in a much shorter time. Furthermore, gray-level documents may be scanned and processed at much lower resolution than binary documents without sacrificing recognition performance, which can also significantly increase processing speed.
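
As a rough illustration of the ideas in the abstract, the following Python sketch subsamples a bi-level image into gray levels by block averaging and then assembles a three-component observation vector for one pixel. It is only a sketch under stated assumptions, not the patented procedure: the function names, the block-averaging scheme, the use of the normalized row index as "relative position", and the gradient-based estimate of stroke direction are all hypothetical choices made for illustration, and the convolution (smoothing) step mentioned in the abstract is omitted.

```python
import math

def subsample_to_gray(binary, factor):
    """Average each factor x factor block of a 0/1 image into a gray value in [0, 1].
    Assumption: block averaging stands in for the subsampling the abstract mentions."""
    rows, cols = len(binary), len(binary[0])
    gray = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block_sum = sum(binary[r + dr][c + dc]
                            for dr in range(factor) for dc in range(factor))
            out_row.append(block_sum / (factor * factor))
        gray.append(out_row)
    return gray

def quantize(value, levels):
    """Map a gray value in [0, 1] to an integer level in 0..levels-1."""
    return min(int(value * levels), levels - 1)

def stroke_direction(gray, r, c):
    """Hypothetical stand-in for 'major stroke direction': the orientation
    perpendicular to the local gradient (central differences, clamped at edges),
    folded into [0, pi)."""
    rows, cols = len(gray), len(gray[0])
    gx = gray[r][min(c + 1, cols - 1)] - gray[r][max(c - 1, 0)]
    gy = gray[min(r + 1, rows - 1)][c] - gray[max(r - 1, 0)][c]
    if gx == 0 and gy == 0:
        return 0.0  # flat region: no preferred direction
    return (math.atan2(gy, gx) + math.pi / 2) % math.pi

def observation_vector(gray, r, c, levels):
    """Build (quantized gray level, relative row position, stroke direction).
    Assumption: relative position is the row index normalized by image height,
    which does not depend on the absolute character size."""
    rows = len(gray)
    rel_pos = r / (rows - 1) if rows > 1 else 0.0
    return (quantize(gray[r][c], levels), rel_pos, stroke_direction(gray, r, c))

if __name__ == "__main__":
    binary = [[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1]]
    gray = subsample_to_gray(binary, 2)
    print(gray)
    print(observation_vector(gray, 0, 0, levels=4))
```

Block averaging keeps the fraction of dark pixels in each block as a gray value, which is one simple way a smaller multi-level image can retain information that plain decimation of a binary image would discard.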
