Venue: International Conference on Document Analysis and Recognition

Keyword Spotting in Online Chinese Handwritten Documents with Candidate Scoring Based on Semi-CRF Model



Abstract

For text-query-based keyword spotting in handwritten Chinese documents, the index is usually organized as a candidate lattice to overcome the ambiguity of character segmentation. Each edge in the lattice denotes a candidate character associated with a candidate class. A character similarity score (between the candidate character and its class) is calculated on each edge, and the similarity between a query word and the handwriting is obtained by combining these edge scores. In this paper, we propose a document indexing method using semi-Markov conditional random fields (semi-CRFs), which provide a principled framework for fusing information from different contexts. To speed up retrieval and save storage space, the lattice is first pruned by a forward-backward pruning approach. On the reduced lattice, we estimate the character similarity scores based on the semi-CRF model. Experimental results on the large handwriting database CASIA-OLHWDB demonstrate the effectiveness of the proposed method.
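As a rough illustration of the pipeline the abstract outlines, the sketch below runs a forward-backward pass over a toy candidate lattice, prunes low-posterior edges, and scores a query word by combining the surviving edge scores. The abstract does not specify the semi-CRF features, the pruning criterion, or the combination rule, so everything here is an assumption made for illustration: the edge log-scores are made-up numbers, the posterior threshold is arbitrary, the combination rule is a plain product of edge posteriors, and the function names (`edge_posteriors`, `prune`, `spot`) are hypothetical.

```python
import math
from collections import defaultdict

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def edge_posteriors(n_nodes, edges):
    """Forward-backward pass over a lattice.

    edges: list of (start, end, label, log_score) with start < end.
    Nodes 0..n_nodes-1 are candidate segmentation points; at least one
    complete path from node 0 to the last node is assumed to exist.
    """
    neg_inf = float("-inf")
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for e in edges:
        outgoing[e[0]].append(e)
        incoming[e[1]].append(e)
    # alpha[v]: log-sum of scores of all partial paths from node 0 to v.
    alpha = [neg_inf] * n_nodes
    alpha[0] = 0.0
    for node in range(1, n_nodes):
        terms = [alpha[s] + w for s, _, _, w in incoming[node] if alpha[s] > neg_inf]
        if terms:
            alpha[node] = logsumexp(terms)
    # beta[v]: log-sum of scores of all partial paths from v to the last node.
    beta = [neg_inf] * n_nodes
    beta[-1] = 0.0
    for node in range(n_nodes - 2, -1, -1):
        terms = [w + beta[t] for _, t, _, w in outgoing[node] if beta[t] > neg_inf]
        if terms:
            beta[node] = logsumexp(terms)
    log_z = alpha[-1]
    # Posterior of an edge: share of total path mass passing through it.
    return {e: math.exp(alpha[e[0]] + e[3] + beta[e[1]] - log_z) for e in edges}

def prune(edges, posteriors, threshold):
    """Keep only edges whose posterior reaches the (assumed) threshold."""
    return [e for e in edges if posteriors[e] >= threshold]

def spot(query, edges, posteriors):
    """Find chains of adjacent edges whose labels spell the query.

    Returns (start_node, end_node, score) tuples; the score is the product
    of edge posteriors, an assumed combination rule.
    """
    by_start = defaultdict(list)
    for e in edges:
        by_start[e[0]].append(e)
    hits = []

    def extend(node, idx, first, score):
        if idx == len(query):
            hits.append((first, node, score))
            return
        for e in by_start[node]:
            if e[2] == query[idx]:
                extend(e[1], idx + 1, first, score * posteriors[e])

    for start in list(by_start):
        extend(start, 0, start, 1.0)
    return hits

# Toy lattice: two segmentations of the first two strokes groups.
toy_edges = [
    (0, 1, "中", 0.0),
    (1, 2, "国", 0.0),
    (0, 2, "回", -2.0),  # competing under-segmented candidate
    (2, 3, "人", 0.0),
]
toy_post = edge_posteriors(4, toy_edges)
toy_pruned = prune(toy_edges, toy_post, threshold=0.2)
toy_hits = spot("中国", toy_pruned, toy_post)
```

In this toy setup the under-segmented candidate carries little posterior mass and is dropped by pruning, while the query is matched on the remaining two-edge chain. A real index would store the posteriors on the reduced lattice so that queries can be answered without re-running inference.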
