Conference: International Conference on Mechanical, Information and Industrial Engineering

Detecting and Normalizing Formulas in Electronic Literature Resources


Abstract

Formulas appear in many kinds of documents and in different formats. Extracting them and normalizing them into a single uniform representation is a precondition for mathematical retrieval. This paper presents a method for extracting and converting formulas in Word documents for mathematical expression retrieval. First, the mathematical expressions in Word documents are detected by processing OLE objects. Then, matching rules for formula format conversion are defined. Finally, the extracted mathematical expressions in OMML format are converted into LaTeX format according to the defined rules and stored in a txt file. In addition, formulas in MathType format are stored as bitmap documents and converted into LaTeX documents through a formula recognition and reconstruction module. Experiments show the effectiveness of the proposed approach.
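
The rule-based OMML-to-LaTeX step described in the abstract can be illustrated with a minimal sketch. This is not the paper's rule set, which the abstract does not list; it assumes a small hypothetical set of matching rules covering only fractions (`m:f`), superscripts (`m:sSup`), subscripts (`m:sSub`), and radicals (`m:rad`), and it assumes the OMML fragment has already been extracted from the Word document. The function names `omml_to_latex`, `_part`, and `_children_to_latex`, and the `SAMPLE` fragment, are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace of Office Math Markup Language (OMML) elements.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def _tag(name):
    """Return the namespace-qualified tag name that ElementTree uses."""
    return f"{{{M_NS}}}{name}"


def _children_to_latex(node):
    """Convert every child of `node` and concatenate the results."""
    return "".join(omml_to_latex(child) for child in node)


def _part(node, name):
    """Convert the named sub-part (e.g. num, den, e, sup) of a structure."""
    part = node.find(_tag(name))
    return _children_to_latex(part) if part is not None else ""


def omml_to_latex(node):
    """Apply one matching rule per OMML element (hypothetical rule set)."""
    local = node.tag.split("}")[-1]
    if local == "t":  # literal text inside a math run
        return node.text or ""
    if local.endswith("Pr"):  # property elements carry formatting only
        return ""
    if local == "f":  # fraction
        return r"\frac{%s}{%s}" % (_part(node, "num"), _part(node, "den"))
    if local == "sSup":  # superscript
        return "{%s}^{%s}" % (_part(node, "e"), _part(node, "sup"))
    if local == "sSub":  # subscript
        return "{%s}_{%s}" % (_part(node, "e"), _part(node, "sub"))
    if local == "rad":  # radical, with an optional degree
        deg, body = _part(node, "deg"), _part(node, "e")
        return r"\sqrt[%s]{%s}" % (deg, body) if deg else r"\sqrt{%s}" % body
    # Containers such as oMath and r: just descend into the children.
    return _children_to_latex(node)


# Hand-written OMML fragment for x = a / b^2 (illustrative input).
SAMPLE = f"""<m:oMath xmlns:m="{M_NS}">
  <m:r><m:t>x</m:t></m:r><m:r><m:t>=</m:t></m:r>
  <m:f>
    <m:num><m:r><m:t>a</m:t></m:r></m:num>
    <m:den><m:sSup>
      <m:e><m:r><m:t>b</m:t></m:r></m:e>
      <m:sup><m:r><m:t>2</m:t></m:r></m:sup>
    </m:sSup></m:den>
  </m:f>
</m:oMath>"""

if __name__ == "__main__":
    print(omml_to_latex(ET.fromstring(SAMPLE)))  # x=\frac{a}{{b}^{2}}
```

Elements without a rule fall through to the generic case, so unsupported structures lose their layout but keep their text. A fuller rule table would also need to cover n-ary operators, delimiters, matrices, and symbol mapping.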
