Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Hidden tree Markov models for document image classification

Abstract

Classification is an important problem in image document processing and is often a preliminary step toward recognition, understanding, and information extraction. In this paper, the problem is formulated in the framework of concept learning and each category corresponds to the set of image documents with similar physical structure. We propose a solution based on two algorithmic ideas. First, we obtain a structured representation of images based on labeled XY-trees (this representation informs the learner about important relationships between image subconstituents). Second, we propose a probabilistic architecture that extends hidden Markov models for learning probability distributions defined on spaces of labeled trees. Finally, a successful application of this method to the categorization of commercial invoices is presented.
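
To make the second idea more concrete, the following is a minimal sketch of how a likelihood over a labeled tree could be computed and used for classification. It is not the authors' implementation. It assumes one common, simplified formulation of a hidden tree Markov model in which each node has a discrete hidden state, a child's state depends only on its parent's state, and a node's label depends only on its own state. The names `Node`, `TreeModel`, `upward`, `likelihood`, and `classify` are hypothetical, and the sketch uses one transition table for all children regardless of their position. Parameter learning is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A node of a labeled tree, e.g. one region of an XY-tree (assumed encoding)."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class TreeModel:
    """Toy hidden tree Markov model with a top-down state dependency (assumption)."""
    prior: List[float]            # prior[i]    = P(root state = i)
    trans: List[List[float]]      # trans[i][j] = P(child state = j | parent state = i)
    emit: List[Dict[str, float]]  # emit[i][l]  = P(label = l | state = i)


def upward(node: Node, model: TreeModel) -> List[float]:
    """Return beta, where beta[i] = P(labels in node's subtree | node state = i)."""
    n = len(model.prior)
    # Start with the probability of this node's own label under each state.
    beta = [model.emit[i].get(node.label, 0.0) for i in range(n)]
    # Each child subtree contributes one factor, summed over the child's state.
    for child in node.children:
        child_beta = upward(child, model)
        for i in range(n):
            beta[i] *= sum(model.trans[i][j] * child_beta[j] for j in range(n))
    return beta


def likelihood(root: Node, model: TreeModel) -> float:
    """Probability of the whole labeled tree, summed over the root's hidden state."""
    beta = upward(root, model)
    return sum(p * b for p, b in zip(model.prior, beta))


def classify(root: Node, models: Dict[str, TreeModel]) -> str:
    """Pick the class whose model gives the tree the highest likelihood."""
    return max(models, key=lambda name: likelihood(root, models[name]))
```

In this sketch, classification trains (or is given) one model per document category and assigns a new tree to the category with the highest likelihood. Plain probabilities are multiplied here for readability, so large trees would underflow; a practical version would work in log space or rescale the `beta` values.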
