International Conference on Document Analysis and Recognition

Using multiple sequence alignment and statistical language model to integrate multiple Chinese address recognition outputs


Abstract

Different recognizers tend to make different mistakes on the same Chinese address. In this paper, we present a method that combines the outputs of multiple Chinese address recognizers to improve recognition accuracy. The method first uses multiple sequence alignment to build a lattice of candidate hypotheses from the recognizer outputs, and then applies a statistical language model to select the maximum-likelihood candidate sequence. With that sequence taken as the final decision, our method outperforms both the individual recognizers and Miyao's method. Experiments on address images from real envelopes show that the proposed method raises the character recognition accuracy from 95.80% to 98.38%, a 61.30% reduction in errors. Furthermore, the correct sorting rate of an automatic mail sorting system increases from 84.11% to 93.72%.
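The two stages described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not say which alignment procedure or language model they used, so the sketch assumes a simple progressive edit-distance alignment and a character bigram model with add-one smoothing. All function names (`align_to_profile`, `build_lattice`, `train_bigram`, `best_path`, `combine`), the toy corpus, and the sample recognizer outputs are hypothetical.

```python
import math
from collections import Counter

GAP = ""  # marks "no character" in an alignment column


def align_to_profile(columns, seq):
    """Align seq to the existing columns by edit-distance DP; return new columns."""
    n, m = len(columns), len(seq)
    width = len(columns[0]) if columns else 0
    # cost[i][j]: cheapest alignment of the first i columns with the first j chars
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if seq[j - 1] in columns[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    # Trace back, rebuilding the columns from right to left.
    out, i, j = [], n, m
    while i > 0 or j > 0:
        sub = 0 if i > 0 and j > 0 and seq[j - 1] in columns[i - 1] else 1
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + sub:
            out.append(columns[i - 1] + [seq[j - 1]])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out.append(columns[i - 1] + [GAP])  # seq has no char for this column
            i -= 1
        else:
            out.append([GAP] * width + [seq[j - 1]])  # seq adds a new column
            j -= 1
    return out[::-1]


def build_lattice(outputs):
    """Progressively align all outputs; each column keeps its distinct candidates."""
    columns = []
    for seq in outputs:
        columns = align_to_profile(columns, seq)
    return [sorted(set(col)) for col in columns]


def train_bigram(corpus):
    """Return a function giving log P(cur | prev) with add-one smoothing."""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for line in corpus:
        chars = ["<s>"] + list(line)
        vocab.update(chars)
        for prev, cur in zip(chars, chars[1:]):
            bigrams[prev, cur] += 1
            unigrams[prev] += 1
    size = len(vocab) + 1  # one extra slot for unseen characters
    return lambda prev, cur: math.log((bigrams[prev, cur] + 1) / (unigrams[prev] + size))


def best_path(lattice, logp):
    """Viterbi search: pick one candidate per column to maximise the bigram score."""
    # best[last_char] = (score, text) of the best hypothesis ending in last_char
    best = {"<s>": (0.0, "")}
    for candidates in lattice:
        nxt = {}
        for cand in candidates:
            for last, (score, text) in best.items():
                if cand == GAP:  # skip this column, hypothesis unchanged
                    key, new = last, (score, text)
                else:
                    key, new = cand, (score + logp(last, cand), text + cand)
                if key not in nxt or new[0] > nxt[key][0]:
                    nxt[key] = new
        best = nxt
    return max(best.values())[1]


def combine(outputs, logp):
    return best_path(build_lattice(outputs), logp)


# Toy data, invented for illustration only.
logp = train_bigram(["北京市海淀区", "北京市朝阳区", "上海市浦东新区"])
print(combine(["北京币海淀区", "北东市海淀区", "北京市海定区"], logp))  # 北京市海淀区
```

Each sample output contains a different single-character error, and the language model picks the consistent reading at every column. Two simplifications are worth keeping in mind: skipping a gap column carries no penalty here, which biases the search toward shorter strings, and a real system would train the language model on a large address corpus rather than three lines.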
