Published in: Pattern Recognition: The Journal of the Pattern Recognition Society

Comparative study of conventional time series matching techniques for word spotting


Abstract

In the word spotting literature, many approaches treat word images as temporal signals that can be matched by the classical Dynamic Time Warping (DTW) algorithm. Consequently, DTW has been widely used as an off-the-shelf tool. However, there exist many improved versions of DTW, along with other robust sequence matching techniques. Very few of them have been studied extensively in the context of word spotting, whereas they have been well explored in other application domains such as speech processing and data mining. The motivation of this paper is to investigate this area in order to extract significant and useful information for users of such techniques. More precisely, this paper presents a comparative study of the classical DTW technique, many of its improved modifications, and other sequence matching techniques in the context of word spotting, considering both their theoretical properties and their experimental behavior. The experimental study is performed on historical documents, both handwritten and printed, at the word or line segmentation level and with a limited or extended set of queries. The comparative analysis shows that classical DTW remains a good choice when there are no segmentation problems in word extraction. Its constrained version (e.g. the Itakura Parallelogram) seems better on handwritten data, and the Hilbert transform also shows promising performance on both handwritten and printed datasets. For printed data with low-level features (pixel-column based), the aggregation of features (e.g. Piecewise-DTW) also seems very important. Finally, when there are significant word segmentation errors, or when matching is done at the line segmentation level, Continuous Dynamic Programming (CDP) seems to be the best choice. (C) 2017 Published by Elsevier Ltd.
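As a rough illustration of the classical DTW matching the abstract refers to, the following is a minimal sketch, not the paper's implementation. It assumes each word image has already been turned into a sequence of per-column feature vectors, and it uses Euclidean distance as the local cost. The function name `dtw_distance` and the `window` parameter are hypothetical. The optional `window` applies a Sakoe-Chiba band, which is a simpler global constraint than the Itakura Parallelogram named in the abstract and is swapped in here only to show the idea of a constrained warping path.

```python
import math


def dtw_distance(a, b, window=None):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b   : non-empty sequences of equal-dimension numeric vectors
             (assumed here to be per-column features of a word image).
    window : optional Sakoe-Chiba band half-width (an assumption of this
             sketch; None means an unconstrained warping path).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # The band must be at least |n - m| wide, otherwise no path reaches (n, m).
    w = max(n, m) if window is None else max(window, abs(n - m))

    # D[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # local Euclidean cost
            D[i][j] = cost + min(
                D[i - 1][j],      # advance in a only
                D[i][j - 1],      # advance in b only
                D[i - 1][j - 1],  # advance in both
            )
    return D[n][m]
```

In a word spotting setting, a query word would be ranked against candidate words by this accumulated cost, with lower values meaning a closer match. The score is left unnormalized here; dividing by path length or by `n + m` is a common choice when comparing sequences of different lengths, but which normalization the paper uses is not stated in the abstract.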
