
System and method for improved string matching under noisy channel conditions


Abstract

Described is a system and method for improving string matching in a noisy channel environment. The invention provides a method for identifying string candidates and analyzing the probability that each candidate matches a user-defined string. In one implementation, a find engine receives a query string, converts an image file into a textual file, and identifies each instance of the query string in the textual file. The find engine also identifies candidates within the textual file that may match the query string. It refers to a confusion table to help determine whether a candidate that is a near match would be an exact match to the query string but for a common recognition error. Candidates meeting a probability threshold are identified as matches to the query string. The invention further provides analysis options including word heuristics, language models, and OCR confidences.
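
The abstract does not spell out how the confusion table and the probability threshold interact, so the following is only a minimal sketch of one way such a lookup could work. The table entries, the per-character probability `P_CORRECT`, the threshold value, and the function names `match_probability` and `find_matches` are all assumptions made for illustration and are not taken from the patent. The sketch scores each word of already-recognized text by the most likely sequence of correct recognitions and table-listed confusions that would turn the query into that word.

```python
from functools import lru_cache

# Hypothetical confusion table: (intended text, OCR output) -> assumed probability.
# These entries and values are illustrative, not taken from the patent.
CONFUSIONS = {
    ("m", "rn"): 0.05,
    ("l", "1"): 0.04,
    ("o", "0"): 0.04,
    ("cl", "d"): 0.02,
}
P_CORRECT = 0.9  # assumed probability that one character is recognized correctly


def match_probability(query: str, candidate: str) -> float:
    """Score how likely `candidate` is an OCR rendering of `query`."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> float:
        # Best score for explaining candidate[j:] as a rendering of query[i:].
        if i == len(query) and j == len(candidate):
            return 1.0
        if i == len(query) or j == len(candidate):
            return 0.0
        score = 0.0
        # Case 1: the character was recognized correctly.
        if query[i] == candidate[j]:
            score = P_CORRECT * best(i + 1, j + 1)
        # Case 2: a listed confusion explains the next characters.
        for (intended, seen), p in CONFUSIONS.items():
            if query.startswith(intended, i) and candidate.startswith(seen, j):
                score = max(score, p * best(i + len(intended), j + len(seen)))
        return score

    return best(0, 0)


def find_matches(query: str, text: str, threshold: float = 0.01) -> list[tuple[str, float]]:
    """Return (word, score) pairs for words whose score meets the threshold."""
    results = []
    for word in text.split():
        p = match_probability(query.lower(), word.lower())
        if p >= threshold:
            results.append((word, p))
    return results


if __name__ == "__main__":
    for word, p in find_matches("modern", "A rnodern and m0dern take on modem design"):
        print(f"{word}: {p:.4f}")
```

In this sketch, "rnodern" and "m0dern" pass the threshold because the table lists "m" read as "rn" and "o" read as "0". "modem" is rejected because the table has no entry for "rn" read as "m". A fuller treatment would fold in the word heuristics, language models, and OCR confidences that the abstract mentions, none of which are modeled here.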
