Pattern Recognition: The Journal of the Pattern Recognition Society
A synthesised word approach to word retrieval in handwritten documents


Abstract

Recent technological advances have enhanced the computer-based indexing and searching of digitised printed books. The performance now achievable in this domain, however, does not at present extend to handwritten texts which inherently contain more significant letter-based variation within their content. Furthermore, in most studies that address the handwritten text retrieval problem, a large training dataset is required which, very often, influences the context and search lexicon. In this paper a novel method is described to overcome the training data problem using a character-based modelling (termed grapheme spectrum) approach and a word modelling technique (termed synthesised word) enabling the retrieval of keywords that have not explicitly been seen in the training set. When tested on an illustrative historical manuscript the performance of the proposed word retrieval technique shows a clear advantage over existing methods.
