Conference: International Conference on Computer Technology and Development

FARSI/ARABIC DOCUMENT IMAGE RETRIEVAL THROUGH SUB-LETTER SHAPE CODING


Abstract

In this paper, a novel method for recognition-free Farsi document retrieval is proposed. In this method, retrieval is done by recognizing sub-letters and other elements of letters, such as dots and signs like Sarkesh. First, in the preprocessing phase, lines and words are extracted using the blank space between them. In the next phase, each word is divided into its sub-words. A sub-word is a combination of joined letters. For each sub-word, the connectors between sub-letters are removed from its initial body, and the remaining parts are recognized as sub-letters using their extracted features. The recognized sub-letters are encoded using a dictionary defined in this system. Finally, the document content is encoded, and this code can be used to retrieve words that occur in the document. Experimental results show the advantages of this method in the retrieval of printed Persian documents.
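The abstract gives no implementation details, so the following Python sketch is only an illustration of two of the steps it names: extracting lines and words from the blank space between them, and retrieving words by comparing shape codes. It assumes a binarized page stored as a list of rows with 1 for ink and 0 for background, uses projection profiles for the segmentation (a common technique that the abstract does not confirm), and introduces a hypothetical `SHAPE_CODES` dictionary whose shape names and code letters are invented for the example. Connector removal and sub-letter feature recognition are not shown.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) index range


def split_on_blank_runs(profile: List[int], min_gap: int = 1) -> List[Span]:
    """Return spans of non-blank runs in a projection profile.

    Blank runs shorter than min_gap do not split a span.
    """
    spans: List[Span] = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the span at the first blank position of this gap.
                spans.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        spans.append((start, len(profile) - gap))
    return spans


def segment_lines(image: List[List[int]]) -> List[Span]:
    """Text lines are row ranges separated by fully blank rows."""
    return split_on_blank_runs([sum(row) for row in image])


def segment_words(image: List[List[int]], line: Span, min_gap: int = 2) -> List[Span]:
    """Words are column ranges within a line separated by wide blank gaps.

    min_gap is an assumed threshold. Spans are returned in left-to-right
    column order, so a right-to-left script needs them reversed for reading order.
    """
    top, bottom = line
    width = len(image[0])
    profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return split_on_blank_runs(profile, min_gap)


# Hypothetical code dictionary: these names and letters are NOT from the paper.
SHAPE_CODES: Dict[str, str] = {
    "bowl": "A", "stem": "B", "loop": "C", "dot_above": "d", "dot_below": "e",
}


def encode_word(shapes: List[str]) -> str:
    """Concatenate the codes of a word's recognized sub-letter shapes."""
    return "".join(SHAPE_CODES[s] for s in shapes)


def find_word(document_codes: List[str], query_code: str) -> List[int]:
    """Return the positions of words whose code equals the query code."""
    return [i for i, code in enumerate(document_codes) if code == query_code]
```

In this sketch a query word would be encoded with the same dictionary as the document, so retrieval reduces to string comparison on the codes and no character-level recognition of the text is needed.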
