Pattern Recognition: The Journal of the Pattern Recognition Society

Handwritten word-spotting using hidden Markov models and universal vocabularies

Abstract

Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce (as low as one sample per keyword), thanks to the prior information which can be incorporated in the shared set of Gaussians.
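
To make the DTW baseline mentioned in the abstract concrete, the following is a minimal sketch of a dynamic time warping distance between two word-images, each represented as a sequence of per-column feature vectors. The function name `dtw_distance`, the Euclidean local cost, and the unconstrained step pattern are assumptions made for illustration; the abstract does not specify the features or the DTW variant used in the paper.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature vectors.

    Illustrative assumptions (not taken from the paper): Euclidean local
    cost, the three standard steps (insert, delete, match), and no
    normalization by path length.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b[j-1]
                cost[i][j - 1],      # frame of seq_b repeated against seq_a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical 1-D features: the second sequence repeats a frame,
# which the warping absorbs at no cost.
query = [(0.0,), (1.0,), (2.0,)]
candidate = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(query, candidate))
```

In a query-by-example system of the kind the abstract describes, candidate word-images would be ranked by increasing distance to the query. The statistical framework in the abstract replaces this distance with a keyword HMM score normalized by a GMM.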
