Conference: IEEE International Conference on Progress in Informatics and Computing

An improved algorithm for identifying mathematical formulas in the images of PDF documents

Abstract

Identifying mathematical formulas is an important step in formula recognition and retrieval. Extracting formulas from document images embedded in PDF files is especially difficult because such images are acquired in many different ways. To address this problem, this paper designs a formula identification method for images of English PDF documents that consists of three steps: judging columns, extracting mathematical formula character blocks, and merging those blocks. By analyzing the characteristics of document images in PDF files and their effects on formula identification, the paper also designs a parameter adjustment algorithm that keeps identification performance from being affected by variation in resolution. Experimental results show that the adaptability of the formula identification algorithm is improved in some applications.
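The abstract does not say how columns are judged or how parameters are adjusted for resolution. As a rough illustration only, the sketch below assumes a common approach: pixel thresholds tuned at a reference resolution are scaled linearly with the image DPI, and columns are counted from a vertical projection profile of a binarized page. The names `REFERENCE_DPI`, `scale_threshold`, and `count_columns`, and the 300 DPI reference value, are hypothetical and are not taken from the paper.

```python
from typing import List

# Assumption: base thresholds were tuned on 300 DPI images (not from the paper).
REFERENCE_DPI = 300


def scale_threshold(base_px: int, dpi: int, reference_dpi: int = REFERENCE_DPI) -> int:
    """Scale a pixel threshold linearly with image resolution (at least 1 px)."""
    return max(1, round(base_px * dpi / reference_dpi))


def count_columns(page: List[List[int]], min_gap_px: int) -> int:
    """Count text columns in a binarized page (1 = ink, 0 = background).

    A column boundary is a run of at least `min_gap_px` consecutive
    ink-free pixel columns that has ink on both sides of it.
    """
    if not page or not page[0]:
        return 0
    width = len(page[0])
    # Vertical projection profile: number of ink pixels at each x position.
    profile = [sum(row[x] for row in page) for x in range(width)]

    columns = 0
    gap_run = 0      # length of the current ink-free run
    in_text = False  # whether the previous x position contained ink
    for ink in profile:
        if ink > 0:
            # Start a new column at the first ink, or after a wide enough gap.
            if not in_text and (columns == 0 or gap_run >= min_gap_px):
                columns += 1
            in_text = True
            gap_run = 0
        else:
            in_text = False
            gap_run += 1
    return columns
```

Under these assumptions, a gap threshold of 4 px at 300 DPI becomes `scale_threshold(4, 600)`, that is 8 px, for a 600 DPI scan of the same page, so the same physical gutter width separates columns at either resolution.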
