Conference: IAPR International Conference on Document Analysis and Recognition

Evaluating Word String Embeddings and Loss Functions for CNN-Based Word Spotting

Abstract

In recent years, CNNs have come to dominate the field of word spotting. Their dominance is driven by learning to predict a word string embedding for a given input image. While the PHOC (Pyramidal Histogram of Characters) is the most prominently used embedding, others such as the Discrete Cosine Transform of Words have been used as well. In this work, we investigate the use of different word string embeddings for word spotting. To this end, we take the recently proposed PHOCNet and modify it so that it can learn more than just binary representations. Our extensive evaluation shows that a large number of combinations of word string embeddings and loss functions achieve roughly the same results on different word spotting benchmarks. We conclude that no word string embedding is clearly superior to another, and that new embeddings should focus on incorporating more information than character counts and positions alone.
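To make the idea of a word string embedding concrete, the following is a minimal sketch of a PHOC-style binary vector, written from the commonly described construction: at each pyramid level the word is split into equal regions, and a character is marked in a region when at least half of its span falls inside it. The function name `phoc`, the alphabet (lowercase letters and digits), and the levels `(1, 2, 3)` are assumptions chosen for illustration, not the configuration evaluated in the paper.

```python
import string

# Assumed alphabet for illustration: lowercase letters followed by digits.
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3), alphabet=ALPHABET):
    """Return a simplified PHOC-style binary vector for `word`.

    For every level, the word is split into `level` equal regions.
    A character is set in a region's histogram if at least half of
    the character's span lies inside that region.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            for i, ch in enumerate(word):
                if ch not in alphabet:
                    continue  # characters outside the alphabet are ignored
                # Work in integer units of 1 / (n * level) to avoid
                # floating-point ties: the character spans
                # [i * level, (i + 1) * level] and the region spans
                # [region * n, (region + 1) * n].
                overlap = min((i + 1) * level, (region + 1) * n) - max(i * level, region * n)
                if 2 * overlap >= level:  # at least 50% of the character
                    bits[alphabet.index(ch)] = 1
            vec.extend(bits)
    return vec


if __name__ == "__main__":
    v = phoc("ab", levels=(1, 2))
    print(len(v), sum(v))
```

With this assumed alphabet of 36 symbols, each region contributes 36 bits, so the default levels `(1, 2, 3)` yield a 216-dimensional vector. The vector records only which characters occur and roughly where, which is the kind of information the abstract suggests new embeddings should go beyond.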