International Journal of Pattern Recognition and Artificial Intelligence

ON THE INFLUENCE OF WORD REPRESENTATIONS FOR HANDWRITTEN WORD SPOTTING IN HISTORICAL DOCUMENTS

Abstract

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper, we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using DTW as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach in which words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally perform better. However, large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing data with millions of word images.
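
As a rough illustration of the DTW baseline named in the abstract, the sketch below aligns two word images that are assumed to be already encoded as sequences of per-column feature vectors, then ranks candidate words by their distance to a query. The feature extraction step, the Euclidean local cost, and the names dtw_distance and rank_by_query are assumptions made for this example. They are not taken from the paper, which may use different features and constraints.

```python
from math import inf, sqrt


def euclidean(a, b):
    """Local cost between two feature vectors (assumed choice, not from the paper)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] holds the best alignment cost of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_by_query(query, candidates):
    """Query-by-example retrieval: order candidate word ids by DTW distance to the query."""
    scored = [(dtw_distance(query, feats), name) for name, feats in candidates.items()]
    return [name for _, name in sorted(scored)]


if __name__ == "__main__":
    # Toy one-dimensional "column features" standing in for real word-image descriptors.
    query = [(0.0,), (1.0,), (2.0,)]
    candidates = {
        "stretched_copy": [(0.0,), (1.0,), (1.0,), (2.0,)],
        "different_word": [(5.0,), (5.0,), (5.0,)],
    }
    print(rank_by_query(query, candidates))
```

This plain version computes the full cost table for every query-candidate pair, so its cost grows with the product of the two sequence lengths. That is one reason alignment-based matching is usually treated as a baseline rather than as an indexing strategy for very large collections.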
