
METHODS AND DEVICES THAT CONVERT THE IMAGES OF DOCUMENTS TO ELECTRONIC DOCUMENTS USING A TRIE-STRUCTURE OF DATA CONTAINING UNPARAMETERIZED SYMBOLS FOR DETERMINING DEFINITIONS


Abstract

The current application is directed to methods and systems that convert document images, which contain Arabic text and text in other languages in which symbols are joined together to produce continuous words and portions of words, into corresponding electronic documents. In one implementation, a document-image-processing method and system to which the current application is directed employs numerous techniques and features that render efficiently computable an otherwise intractable or impractical document-image-to-electronic-document conversion. These techniques and features include transformation of text-image morphemes and words into feature symbols with associated parameters, efficiently identifying similar morphemes and words in an electronic store of standard-feature-symbol-encoded morphemes and words, and identifying candidate inter-character division points and corresponding traversal paths using the similar morphemes and words identified in the word store.
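The abstract's idea of looking up similar entries in a store of standard-feature-symbol-encoded words can be illustrated with a small sketch. The code below is not the patented method: it assumes a made-up feature alphabet (`loop`, `stroke`, `dot_above`, and so on), a hypothetical `to_standard` helper that simply drops each feature's parameters, and it swaps in a plain edit-distance-bounded trie search as the similarity measure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)  # standard symbol -> TrieNode
    entry: Optional[str] = None  # label of the word or morpheme ending here


def to_standard(features):
    """Hypothetical helper: drop parameters, keep only the symbol names."""
    return [name for name, _params in features]


class FeatureTrie:
    """Trie keyed by sequences of standard (parameter-free) feature symbols."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, symbols, label):
        node = self.root
        for sym in symbols:
            node = node.children.setdefault(sym, TrieNode())
        node.entry = label

    def similar(self, symbols, max_dist=1):
        """Return (label, distance) pairs within max_dist edits of symbols."""
        results = []
        first_row = list(range(len(symbols) + 1))
        for sym, child in self.root.children.items():
            self._walk(child, sym, symbols, first_row, max_dist, results)
        return sorted(results, key=lambda r: (r[1], r[0]))

    def _walk(self, node, sym, symbols, prev_row, max_dist, results):
        # Extend the Levenshtein table by one row for the symbol on this edge.
        row = [prev_row[0] + 1]
        for i in range(1, len(symbols) + 1):
            cost = 0 if symbols[i - 1] == sym else 1
            row.append(min(row[i - 1] + 1, prev_row[i] + 1, prev_row[i - 1] + cost))
        if node.entry is not None and row[-1] <= max_dist:
            results.append((node.entry, row[-1]))
        # Prune: descend only while some prefix alignment is still within budget.
        if min(row) <= max_dist:
            for next_sym, child in node.children.items():
                self._walk(child, next_sym, symbols, row, max_dist, results)


# Illustrative store; the labels and symbol names are invented for this sketch.
trie = FeatureTrie()
trie.insert(["loop", "stroke", "dot_above"], "word_A")
trie.insert(["loop", "stroke", "dot_below"], "word_B")
trie.insert(["hook", "stroke"], "word_C")

# A text-image word extracted as parameterized features, then standardized.
observed = [("loop", {"width": 4}), ("stroke", {"len": 7}), ("dot_above", {})]
matches = trie.similar(to_standard(observed), max_dist=1)
```

Sharing prefixes in the trie means one partial edit-distance row is reused for every stored entry below a node, which is why a lookup of this kind can stay cheap even for a large word store. In the abstract, the similar entries found this way are then used to propose candidate inter-character division points, a step this sketch does not attempt.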
