
Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition



Abstract

In this paper, we introduce Track, Attend, and Parse (TAP), an end-to-end approach based on neural networks for online handwritten mathematical expression recognition (OHMER). The architecture of TAP consists of a tracker and a parser. The tracker employs a stack of bidirectional recurrent neural networks with gated recurrent units (GRU) to model the input handwritten traces, which can fully utilize the dynamic trajectory information in OHMER. Following the tracker, the parser adopts a GRU equipped with guided hybrid attention (GHA) to generate LaTeX notations. The proposed GHA is composed of a coverage-based spatial attention, a temporal attention, and an attention guider. Moreover, we demonstrate the strong complementarity between offline information with static-image input and online information with ink-trajectory input by blending a fully convolutional network-based watcher into TAP. Inherently, unlike traditional methods, this end-to-end framework does not require explicit symbol segmentation or a predefined expression grammar for parsing. Validated on the benchmarks published by the CROHME competition, the proposed approach outperforms the state-of-the-art methods and achieves the best reported results, with an expression recognition accuracy of 61.16% on CROHME 2014 and 57.02% on CROHME 2016, using only the official training dataset.
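The tracker and the coverage-based spatial attention described above can be sketched in heavily simplified form. The toy below is an illustrative NumPy sketch, not the paper's implementation: a single-layer bidirectional GRU stands in for the paper's GRU stack, the weights are random and untrained, and the coverage term is a single scalar of accumulated attention per trace point rather than a learned convolutional feature. All function and parameter names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """One GRU step: x is an input trace point, h the previous hidden state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)            # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand

def init_params(d_in, d_h):
    """Random (untrained) GRU weights, for shape illustration only."""
    p = {n: rng.normal(0, 0.1, (d_h, d_in)) for n in ("Wz", "Wr", "Wh")}
    p.update({n: rng.normal(0, 0.1, (d_h, d_h)) for n in ("Uz", "Ur", "Uh")})
    return p

def bidirectional_encode(traces, d_h=8):
    """Encode a trajectory (T, d_in) into annotation vectors (T, 2*d_h)
    by running one GRU forward and one backward, then concatenating."""
    d_in = traces.shape[1]
    pf, pb = init_params(d_in, d_h), init_params(d_in, d_h)
    hf, hb = np.zeros(d_h), np.zeros(d_h)
    fwd, bwd = [], []
    for x in traces:            # forward pass over the pen trajectory
        hf = gru_step(x, hf, pf)
        fwd.append(hf)
    for x in traces[::-1]:      # backward pass
        hb = gru_step(x, hb, pb)
        bwd.append(hb)
    return np.concatenate([np.array(fwd), np.array(bwd)[::-1]], axis=1)

def coverage_attention(annotations, h_dec, coverage, d_att=8):
    """Coverage-based attention: accumulated past attention (`coverage`)
    is fed into the score so already-attended trace points are discounted."""
    T, d_ann = annotations.shape
    Wa = rng.normal(0, 0.1, (d_att, h_dec.shape[0]))
    Ua = rng.normal(0, 0.1, (d_att, d_ann))
    Uf = rng.normal(0, 0.1, (d_att, 1))
    v = rng.normal(0, 0.1, d_att)
    scores = np.array([
        v @ np.tanh(Wa @ h_dec + Ua @ annotations[i] + Uf @ coverage[i:i + 1])
        for i in range(T)
    ])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                  # attention weights over trace points
    context = alpha @ annotations         # context vector for the parser
    return alpha, context

# Toy trajectory: 5 points of (x, y, pen-down flag).
traces = rng.normal(size=(5, 3))
ann = bidirectional_encode(traces)                         # (5, 16)
alpha, ctx = coverage_attention(ann, np.zeros(4), np.zeros(5))
print(ann.shape, alpha.shape, ctx.shape)
```

At each decoding step the parser would update `coverage` with the new `alpha` and emit one LaTeX token from the context vector; that feedback loop is what lets the coverage term penalize over-attended regions of the trace.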

Bibliographic Information

  • Source
    IEEE Transactions on Multimedia | 2019, Issue 1 | pp. 221-233 | 13 pages
  • Author Affiliations

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui, China;

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui, China;

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui, China;

  • Indexing Information
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Grammar; Handwriting recognition; Trajectory; Data visualization; Recurrent neural networks; Logic gates;


