Journal: Sadhana

A string matching based algorithm for performance evaluation of mathematical expression recognition



Abstract

This paper addresses the problem of automated performance evaluation of Mathematical Expression (ME) recognition. Automated evaluation requires matching the recognition output against ground truth in an editable format such as LaTeX or MathML. However, these standard formats can contain extraneous symbols or tags: for example, MathML adds an <mo> tag for each operator, and LaTeX uses \begin{array} to encode matrices. Such extraneous symbols take part in the matching, which is not intuitive. To avoid this, we propose a novel structure encoded string representation that is independent of any editable format. Structure encoded strings retain the structure of the expression (spatial relationships such as superscript and subscript) and contain no extraneous symbols. Because structure encoded strings give a linear representation of MEs, Levenshtein edit distance is used as the measure for performance evaluation. In our approach, the recognition output and the ground truth, both in LaTeX form, are converted to their corresponding structure encoded strings, and the Levenshtein edit distance between them is computed.
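The final step of the approach, computing a Levenshtein edit distance between two linear strings, can be illustrated with a minimal sketch. The abstract does not specify the structure encoding itself, so the tokens used here (`SUP`, `SUB`, `END`) are hypothetical placeholders for spatial relationships, not the paper's actual encoding. The function name `levenshtein` is likewise an assumption for illustration.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical token sequences for the expression x^2 + y_1.
# SUP/SUB/END are illustrative markers only, not the paper's encoding.
ground_truth = ["x", "SUP", "2", "END", "+", "y", "SUB", "1", "END"]
recognized = ["x", "SUP", "2", "END", "+", "y", "1"]  # subscript structure lost

distance = levenshtein(recognized, ground_truth)
print(distance)  # 2: the two missing subscript markers
```

Comparing token sequences rather than raw characters means that a multi-character marker counts as a single edit, which is one plausible way to keep structural errors and symbol errors on the same scale.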
