Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Natural Language Morphology Integration in Off-Line Arabic Optical Text Recognition


Abstract

In this paper, we propose a new linguistics-based approach, called the affixal approach, for Arabic word and text image recognition. Most existing work in the field integrates knowledge of the Arabic language into the recognition process in one of two ways: either after recognition, using a dictionary of words to validate the word hypotheses suggested by the OCR, or during the recognition process (lexicon-directed recognition), using a statistical model of the language (hidden Markov model or N-gram). The proposed approach uses the linguistic concepts of the vocabulary to direct and simplify the recognition process. Its principal contribution is the ability to categorize word hypotheses as either derived or not derived from roots, and to characterize each word hypothesis morphologically in order to prepare the text hypotheses for later analyses (for example, syntactic analysis to filter the sentence hypotheses).
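
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general idea of sorting word hypotheses into root-derived and non-derived words by stripping affixes and matching the remaining stem against a root-and-pattern template. Everything in it is an assumption made for illustration: the affix lists, the toy root lexicon, the pattern templates, the function names, and the simplified Latin transliteration used in place of Arabic script.

```python
# Illustrative sketch only. The affixes, roots, patterns, and transliteration
# below are hypothetical toy data, not the resources used in the paper.

PREFIXES = ["", "Al", "w", "b"]   # "" means no prefix; "Al" stands for the definite article
SUFFIXES = ["", "wn", "At", "p"]  # "" means no suffix; plural and feminine endings
ROOTS = {"ktb", "drs"}            # toy lexicon of triliteral roots

# In a pattern, the digits 1, 2, 3 mark the positions of the three root
# consonants; every other character must appear literally in the stem.
PATTERNS = ["123", "1A23", "m12w3"]


def match_pattern(stem, pattern):
    """Return the root extracted from stem if it fits pattern, else None."""
    if len(stem) != len(pattern):
        return None
    slots = {}
    for ch, p in zip(stem, pattern):
        if p in "123":
            slots[p] = ch          # root consonant position
        elif ch != p:
            return None            # literal pattern character does not match
    return slots["1"] + slots["2"] + slots["3"]


def analyze(word):
    """Classify a word hypothesis as derived from a known root or not."""
    for prefix in PREFIXES:
        for suffix in SUFFIXES:
            if len(prefix) + len(suffix) >= len(word):
                continue
            if not (word.startswith(prefix) and word.endswith(suffix)):
                continue
            stem = word[len(prefix):len(word) - len(suffix)]
            for pattern in PATTERNS:
                root = match_pattern(stem, pattern)
                if root in ROOTS:
                    return {"word": word, "derived": True, "prefix": prefix,
                            "root": root, "pattern": pattern, "suffix": suffix}
    return {"word": word, "derived": False}


if __name__ == "__main__":
    for hypothesis in ["AlkAtbwn", "mktwb", "lndn"]:
        print(analyze(hypothesis))
```

In this toy setting, a hypothesis that decomposes into prefix, root, pattern, and suffix carries a morphological description that a later stage (for example, a syntactic filter) could use, while a hypothesis with no such decomposition is flagged as not derived from a root.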
