Conference: World Multi-conference on Systemics, Cybernetics and Informatics

Greek Alphabet Recognition Technique for Biomedical Documents

Abstract

Most current commercial optical character recognition (OCR) systems can accurately recognize text in documents written in a single language. However, they perform poorly on Greek characters embedded in predominantly English text, and most do not recognize such characters as belonging to the Greek alphabet. As a result, a high degree of manual review is required to validate and correct OCR errors. To address this problem, we propose a new technique based on features calculated from the output of multiple OCR systems, combined with string pattern matching and document content analysis, to improve the recognition of both Greek characters and regular text. The technique passes each document page image twice through OCR systems that use different recognition languages. A preliminary evaluation on a sample of medical journal page images shows the feasibility of the technique and indicates that it improves the recognition of Greek characters embedded in predominantly English text.
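
The abstract does not give implementation details, so the following is only a minimal sketch of how the output of two OCR passes might be merged. It assumes that each pass yields one character and one confidence value per position and that the two passes are already aligned. The `Glyph` type, the `merge_passes` function, the `margin` threshold, and the sample confidence values are all hypothetical and are not taken from the paper. The string pattern matching and document content analysis steps are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """One recognized character with its OCR confidence (hypothetical format)."""
    char: str
    conf: float  # assumed to lie in [0, 1]


def is_greek(ch: str) -> bool:
    """True if ch is a single character in the Unicode Greek and Coptic block."""
    return len(ch) == 1 and "\u0370" <= ch <= "\u03ff"


def merge_passes(english, greek, margin=0.1):
    """Merge an English-language pass and a Greek-language pass.

    Assumption: a position is taken from the Greek pass only when that pass
    produced a Greek character and is more confident than the English pass
    by at least `margin`. Otherwise the English result is kept.
    """
    if len(english) != len(greek):
        raise ValueError("passes must be aligned character by character")
    merged = []
    for e, g in zip(english, greek):
        use_greek = is_greek(g.char) and g.conf >= e.conf + margin
        merged.append(g.char if use_greek else e.char)
    return "".join(merged)


def glyphs(text, confs):
    """Helper that pairs each character with a confidence value."""
    return [Glyph(c, p) for c, p in zip(text, confs)]


# Illustrative, made-up data: the English pass misreads alpha as "a",
# while the Greek pass misreads the Latin letters "TNF".
english_pass = glyphs("TNF-a", [0.98, 0.97, 0.98, 0.95, 0.40])
greek_pass = glyphs("\u03a4\u039d\u03a1-\u03b1", [0.30, 0.35, 0.20, 0.90, 0.93])
result = merge_passes(english_pass, greek_pass)
print(result)
```

With these made-up inputs, the Latin letters and the hyphen come from the English pass and only the final character comes from the Greek pass. A real system would also need an alignment step, since two OCR engines rarely segment a page identically.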
