Degraded gray-scale document recognition using pseudo two-dimensional hidden markov models and N-best hypotheses
Abstract
Methods are disclosed for recognizing connected and degraded text embedded in a gray-scale image. Gray-scale pseudo two-dimensional hidden Markov models (HMMs) are used to represent images containing text elements such as characters or words. Observation vectors for the image are produced from pixel maps obtained by gray-scale optical scanning. Three components characterize each pixel: a convolved, quantized gray-level component, a pixel relative position component, and a pixel major stroke direction component. These components are organized as an observation vector that is continuous in nature, invariant across font sizes, and flexible enough for use with various quantization processes. In this manner, the information loss or distortion caused by binarization is eliminated. Moreover, where documents are binary in nature (e.g., faxed documents), the image may be compressed by subsampling into a multi-level (gray-scale) representation without losing information, so that the compressed images can be recognized in a much shorter time. Furthermore, documents may be scanned and processed in gray scale at a much lower resolution than binary processing requires, without sacrificing recognition performance, which can significantly increase processing speed.
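As a rough illustration of the three-component observation vector described above, the following Python sketch builds one vector per pixel. It is not the patented method. The abstract does not specify the convolution kernel, the quantizer, the definition of relative position, or the stroke-direction estimator, so everything here is an assumption chosen for simplicity: a 3x3 box filter, uniform quantization, the row index divided by image height, and a direction perpendicular to the local gradient. All function names are hypothetical. The `subsample_to_gray` helper shows one plausible reading of "subsampling into multi(gray)-level", namely block averaging of a binary image.

```python
import math

def subsample_to_gray(binary, factor):
    """Average each factor x factor block of a 0/1 image into a gray level in [0, 1].
    Assumption: block averaging stands in for the subsampling step."""
    rows = len(binary) // factor
    cols = len(binary[0]) // factor
    area = factor * factor
    return [
        [sum(binary[r * factor + i][c * factor + j]
             for i in range(factor) for j in range(factor)) / area
         for c in range(cols)]
        for r in range(rows)
    ]

def _at(img, r, c):
    """Read a pixel, clamping coordinates to the image border."""
    r = min(max(r, 0), len(img) - 1)
    c = min(max(c, 0), len(img[0]) - 1)
    return img[r][c]

def smooth(img):
    """Convolve with a 3x3 box filter (assumed kernel), replicating edge pixels."""
    return [
        [sum(_at(img, r + dr, c + dc)
             for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]

def quantize(value, levels):
    """Map a gray value in [0, 1] to an integer level in 0..levels-1."""
    return min(int(value * levels), levels - 1)

def stroke_direction(img, r, c):
    """Estimate stroke orientation as the direction perpendicular to the
    central-difference gradient, folded into [0, pi). Assumed estimator."""
    gx = _at(img, r, c + 1) - _at(img, r, c - 1)
    gy = _at(img, r + 1, c) - _at(img, r - 1, c)
    if gx == 0 and gy == 0:
        return 0.0  # flat region: no direction defined, default to 0
    return (math.atan2(gy, gx) + math.pi / 2) % math.pi

def observation_vectors(gray, levels=4):
    """Return, per pixel, (quantized smoothed gray level,
    relative vertical position in [0, 1], stroke direction in radians)."""
    smoothed = smooth(gray)
    height = len(gray)
    denom = max(height - 1, 1)  # avoid division by zero for one-row images
    return [
        [(quantize(smoothed[r][c], levels),
          r / denom,
          stroke_direction(smoothed, r, c))
         for c in range(len(gray[0]))]
        for r in range(height)
    ]
```

Because the position component is a ratio rather than an absolute pixel index, the same character rendered at two sizes yields comparable values, which is the kind of font-size invariance the abstract refers to. A typical call under these assumptions would be `observation_vectors(subsample_to_gray(binary_image, 2), levels=4)`.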