Conference: Document recognition and retrieval XVII

A New Pre-classification Method based on Associative Matching Method


Abstract

Reducing the time complexity of character matching is critical to the development of efficient Japanese Optical Character Recognition (OCR) systems. To shorten processing time, recognition is usually split into separate pre-classification and recognition stages. For high overall recognition performance, the pre-classification stage must both achieve very high classification accuracy and return only a small number of candidate character categories for further processing. For any practical system, the speed of the pre-classification stage is also critical. The associative matching (AM) method has often been used for fast pre-classification because it is highly efficient: it uses a hash table and relies solely on logical bit operations to select categories. However, the hash table contains a certain level of redundancy, because it is constructed using only the minimum and maximum values of the data on each axis and therefore does not take account of the distribution of the data. We propose a modified associative matching method that modifies the hash table to reflect the underlying distribution of training characters; it satisfies the performance criteria described above in a fraction of the time. Furthermore, we show that our approach outperforms pre-classification by clustering, ANN and conventional AM in terms of classification accuracy, discriminative power and speed. Compared to conventional associative matching, the proposed approach results in a 47% reduction in total processing time across an evaluation test set comprising 116,528 Japanese character images.
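
The abstract does not give the construction in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one table row per feature axis, each axis quantized into a fixed number of bins, and one bit per category in each cell. The names `bin_index`, `build_table` and `candidates`, the binning scheme, and in particular the "mark only occupied bins" rule used to stand in for a distribution-aware table are all illustrative assumptions.

```python
from typing import Dict, List, Sequence

Sample = Sequence[float]


def bin_index(value: float, lo: float, hi: float, n_bins: int) -> int:
    """Quantize a feature value in [lo, hi] to a bin index, clamping outliers."""
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return max(0, min(n_bins - 1, idx))


def build_table(train: Dict[int, List[Sample]], lo: float, hi: float,
                n_bins: int, use_distribution: bool) -> List[List[int]]:
    """Build table[axis][bin] = bitmask of categories associated with that bin.

    use_distribution=False: mark every bin between a category's minimum and
    maximum on the axis (the min/max construction the abstract describes).
    use_distribution=True: mark only bins that hold at least one training
    sample (an ASSUMED stand-in for the distribution-aware table).
    """
    n_axes = len(next(iter(train.values()))[0])
    table = [[0] * n_bins for _ in range(n_axes)]
    for category, samples in train.items():
        bit = 1 << category
        for axis in range(n_axes):
            bins = [bin_index(s[axis], lo, hi, n_bins) for s in samples]
            marked = set(bins) if use_distribution else range(min(bins), max(bins) + 1)
            for b in marked:
                table[axis][b] |= bit
    return table


def candidates(table: List[List[int]], x: Sample, lo: float, hi: float,
               n_bins: int, n_categories: int) -> List[int]:
    """Select candidate categories by AND-ing the per-axis bitmasks."""
    mask = (1 << n_categories) - 1
    for axis, row in enumerate(table):
        mask &= row[bin_index(x[axis], lo, hi, n_bins)]
    return [c for c in range(n_categories) if (mask >> c) & 1]


# Toy data (hypothetical): category 0 is spread out, category 1 is compact.
train = {0: [(0.15, 0.15), (0.85, 0.85)], 1: [(0.52, 0.52), (0.57, 0.47)]}
conventional = build_table(train, 0.0, 1.0, 10, use_distribution=False)
distribution_aware = build_table(train, 0.0, 1.0, 10, use_distribution=True)
query = (0.52, 0.52)
wide = candidates(conventional, query, 0.0, 1.0, 10, 2)
narrow = candidates(distribution_aware, query, 0.0, 1.0, 10, 2)
```

On this toy data the min/max table returns both categories for the query, because category 0's range spans the bins between its two distant samples, while the occupied-bin table returns only category 1. A real system would need some smoothing around occupied bins to keep accuracy on unseen samples; that detail is not stated in the abstract and is left out here.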
