Using multiple sequence alignment and statistical language model to integrate multiple Chinese address recognition outputs

Abstract

Different recognizers tend to make different mistakes when recognizing the same Chinese address. In this paper, we present a method that combines the outputs of multiple Chinese address recognizers to improve recognition accuracy. The method first uses multiple sequence alignment to build a lattice of candidate hypotheses from the outputs of the different recognizers, and then applies a statistical language model to select the maximum-likelihood candidate sequence, which is taken as the final decision. The method outperforms both the single recognizers and Miyao's method. Experiments on address images from real envelopes show that the proposed method increases the character recognition accuracy from 95.80% to 98.38%, a 61.30% error reduction. Furthermore, the correct sorting rate of an automatic mail sorting system increases from 84.11% to 93.72%.
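
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a progressive, edit-distance-based alignment, an add-one-smoothed character bigram model, and a toy training corpus, and every name in it (`build_lattice`, `train_bigram`, `decode`, `GAP`) is hypothetical.

```python
import math
from collections import Counter

GAP = ""  # assumed marker for "no character at this position"


def align_to_columns(columns, seq):
    """Merge one recognizer output into the lattice columns (list of sets)
    along a minimum-edit-distance alignment with unit costs."""
    n, m = len(columns), len(seq)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if seq[j - 1] in columns[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,  # align char with column
                             cost[i - 1][j] + 1,        # seq skips this column
                             cost[i][j - 1] + 1)        # seq adds a new column
    merged, i, j = [], n, m
    while i > 0 or j > 0:  # trace back, preferring the diagonal move
        if i > 0 and j > 0:
            sub = 0 if seq[j - 1] in columns[i - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                merged.append(columns[i - 1] | {seq[j - 1]})
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            merged.append(columns[i - 1] | {GAP})
            i -= 1
        else:
            merged.append({seq[j - 1], GAP})
            j -= 1
    merged.reverse()
    return merged


def build_lattice(outputs):
    """Progressively align all recognizer outputs into candidate columns."""
    columns = [{ch} for ch in outputs[0]]
    for seq in outputs[1:]:
        columns = align_to_columns(columns, seq)
    return columns


def train_bigram(corpus):
    """Character bigram model with add-one smoothing (an assumed choice)."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for line in corpus:
        tokens = ["<s>"] + list(line) + ["</s>"]
        vocab.update(tokens)
        for a, b in zip(tokens, tokens[1:]):
            unigrams[a] += 1
            bigrams[a, b] += 1
    size = len(vocab) + 1  # one extra slot for unseen characters

    def logp(prev, cur):
        return math.log((bigrams[prev, cur] + 1) / (unigrams[prev] + size))

    return logp


def decode(columns, logp):
    """Viterbi search: highest-scoring path through the lattice columns."""
    best = {"<s>": (0.0, "")}  # last emitted character -> (score, text)
    for col in columns:
        nxt = {}
        for prev, (score, text) in best.items():
            for cand in col:
                if cand == GAP:  # a gap emits nothing and keeps the state
                    state, s, t = prev, score, text
                else:
                    state, s, t = cand, score + logp(prev, cand), text + cand
                if state not in nxt or s > nxt[state][0]:
                    nxt[state] = (s, t)
        best = nxt
    return max((s + logp(p, "</s>"), t) for p, (s, t) in best.items())[1]


# Toy data (invented for illustration): each output has one wrong character.
corpus = ["上海市中山北路", "上海市中山南路", "北京市中山路"]
outputs = ["上诲市中山北路", "上海币中山北路", "上海市中山北跆"]
print(decode(build_lattice(outputs), train_bigram(corpus)))
```

In this toy case no single recognizer output is fully correct, yet the lattice contains the correct character in every column, so the language model can recover the full address. Gaps are scored as free here, which biases the search toward shorter strings when the outputs differ in length; a real system would likely need a gap penalty or length normalization.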

Bibliographic record

Source
International Conference on Document Analysis and Recognition | 2015 | pp. 151-155 | 5 pages
Authors
Shengchang Chen; Shujing Lu; Ying Wen; Yue Lu
Original format: PDF
Keywords
maximum likelihood estimation; natural language processing; statistical analysis; Miyao method; automatic mail sorting system; candidate hypotheses; integrate multiple Chinese address recognition outputs; maximum likelihood candidate sequence; multiple sequence alignment; single recognizers; statistical language model; Image segmentation; Optical character recognition software; Training; minimum edit distance; multiple Chinese address recognition outputs
