Conference: Industrial Applications of AI (Artificial Intelligence)

Features for word spotting in historical manuscripts


Abstract

The large number of handwritten manuscripts in existence poses a great challenge for the transition from traditional to digital libraries. Easy access to such collections requires an index, which is currently created manually at great cost. Because automatic handwriting recognizers fail on historical manuscripts, the word spotting technique has been developed: the words in a collection are matched as images and grouped into clusters, each containing all instances of the same word. By annotating "interesting" clusters, an index that links words to the locations where they occur can be built automatically. Because historical documents are noisy, selecting the right features for matching words is crucial. We analyzed a range of features suitable for matching words using dynamic time warping (DTW), which aligns and compares sets of features extracted from two images. Each feature's individual performance was measured on a test set. With an average precision of 72%, a combination of features outperforms competing techniques in speed and precision.
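The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three per-column features (projection profile, upper and lower word profile), the squared Euclidean local cost, the path-length normalization, and the names `column_features` and `dtw_distance` are all assumptions chosen for illustration, since the abstract does not list the features that were evaluated.

```python
import numpy as np


def column_features(img):
    """Turn a binary word image (1 = ink, 0 = background) into one
    feature vector per pixel column. The three features are illustrative
    assumptions, not the feature set reported in the paper."""
    img = np.asarray(img, dtype=float)
    height, width = img.shape
    rows = np.arange(height)
    feats = np.zeros((width, 3))
    for x in range(width):
        col = img[:, x]
        ink_rows = rows[col > 0]
        feats[x, 0] = col.sum() / height            # projection profile
        if ink_rows.size:
            feats[x, 1] = ink_rows.min() / height   # upper profile
            feats[x, 2] = ink_rows.max() / height   # lower profile
        else:
            # Assumed convention for empty columns: place both profiles mid-height.
            feats[x, 1] = feats[x, 2] = 0.5
    return feats


def dtw_distance(a, b):
    """Classic dynamic time warping between two feature sequences
    of shape (n, d) and (m, d), with squared Euclidean local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((a[i - 1] - b[j - 1]) ** 2)
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Assumed normalization so words of different widths stay comparable.
    return cost[n, m] / (n + m)


if __name__ == "__main__":
    word = np.array([[0, 1, 0],
                     [1, 1, 0],
                     [0, 1, 1]])
    wider = np.repeat(word, 2, axis=1)  # same shape, stretched horizontally
    print(dtw_distance(column_features(word), column_features(wider)))
```

Because DTW may align one column with several columns of the other image, a horizontally stretched copy of a word still matches its original, which is the property that makes the technique attractive for handwriting with variable letter widths. In a word spotting pipeline, pairwise distances of this kind would feed the clustering step.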
