Published in: International Conference on Document Analysis and Recognition

LPGA: Line-of-Sight Parsing with Graph-Based Attention for Math Formula Recognition

Abstract

We present a model for recognizing typeset math formula images from connected components or symbols. In our approach, connected components are used to construct a line-of-sight (LOS) graph. The graph is used both to reduce the search space for formula structure interpretations, and to guide a classification attention model using separate channels for inputs and their local visual context. For classification, we used visual densities with Random Forests for initial development, and then converted this to a Convolutional Neural Network (CNN) with a second branch to capture context for each input image. Formula structure is extracted as a directed spanning tree from a weighted LOS graph using Edmonds' algorithm. We obtain strong results for formulas without grids or matrices in the InftyCDB-2 dataset (90.89% from components, 93.5% from symbols). Using tools from the CROHME handwritten formula recognition competitions, we were able to compile all symbol and structure recognition errors for analysis. Our data and source code are publicly available.
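
The structure-extraction step described above (a directed spanning tree taken from a weighted LOS graph with Edmonds' algorithm) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the symbol labels, and the edge scores below are hypothetical, and the sketch assumes that a higher weight means a more likely parent-child relationship and that the root symbol is known in advance.

```python
def _find_cycle(best, root):
    """Return the set of nodes on a cycle of best-parent pointers, or None."""
    done = set()
    for start in best:
        path, on_path = [], set()
        v = start
        while v != root and v not in done and v not in on_path:
            path.append(v)
            on_path.add(v)
            v = best[v][0]  # follow the chosen parent
        if v in on_path:  # the walk closed a loop
            return set(path[path.index(v):])
        done.update(path)
    return None


def max_spanning_arborescence(nodes, edges, root):
    """Chu-Liu/Edmonds: maximum-weight directed spanning tree rooted at `root`.

    nodes: list of hashable labels; edges: list of (parent, child, weight).
    Returns a dict mapping each non-root node to its selected parent.
    """
    # 1. Greedily pick the heaviest incoming edge for every non-root node.
    best = {}
    for u, v, w in edges:
        if v == root or u == v:
            continue
        if v not in best or w > best[v][1]:
            best[v] = (u, w)
    missing = [n for n in nodes if n != root and n not in best]
    if missing:
        raise ValueError(f"no incoming edge for: {missing}")

    # 2. If the greedy choice has no cycle, it is already the answer.
    cycle = _find_cycle(best, root)
    if cycle is None:
        return {v: u for v, (u, _) in best.items()}

    # 3. Contract the cycle into one super-node and re-weight entering edges.
    c = ("cycle", frozenset(cycle))
    origin = {}  # contracted edge -> (weight, original parent, original child)
    for u, v, w in edges:
        if u in cycle and v in cycle:
            continue
        if v in cycle:
            key, nw = (u, c), w - best[v][1]  # gain over the cycle edge it replaces
        elif u in cycle:
            key, nw = (c, v), w
        else:
            key, nw = (u, v), w
        if key not in origin or nw > origin[key][0]:
            origin[key] = (nw, u, v)
    new_nodes = [n for n in nodes if n not in cycle] + [c]
    new_edges = [(a, b, nw) for (a, b), (nw, _, _) in origin.items()]
    sub = max_spanning_arborescence(new_nodes, new_edges, root)

    # 4. Expand: map contracted edges back, then keep the remaining cycle edges.
    parent = {}
    for child, par in sub.items():
        _, u, v = origin[(par, child)]
        parent[v] = u
    for v in cycle:
        parent.setdefault(v, best[v][0])
    return parent


# Hypothetical LOS graph for the symbols of "x^2 + 1" (scores are made up).
nodes = ["x", "2", "+", "1"]
edges = [
    ("x", "2", 0.9), ("x", "+", 0.8), ("2", "+", 0.4),
    ("1", "+", 0.95), ("+", "1", 0.9), ("2", "1", 0.3),
]
parent = max_spanning_arborescence(nodes, edges, root="x")
```

With these made-up scores the greedy step first creates a cycle between "+" and "1", which the contraction step resolves; the resulting tree attaches "2" and "+" to "x" and "1" to "+". In a full recognizer each selected edge would also carry a spatial-relationship label, which this sketch omits.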
