Conference: 2017 ACM/IEEE Joint Conference on Digital Libraries

Mathematical Document Categorization with Structure of Mathematical Expressions



Abstract

A mathematical document is a document used in mathematical communication, for example a math paper or a discussion in an online Q&A community. Mathematical document categorization (MDC) is the task of classifying mathematical documents into mathematical categories, e.g. probability theory and set theory. This task is important for supporting user search in today's widespread digital libraries and archiving services. Although the mathematical expressions (ME) in a document can carry essential information, since they are central to communication in mathematical fields, methods for using ME in MDC are not yet mature. In this paper, we propose a classification method based on text combined with the structures of ME, which are expected to reflect conventions and rules specific to a category. We also present document collections built for evaluating MDC systems, together with an investigation of their category settings and statistics. Our classification results show that the proposed method outperforms existing methods that use state-of-the-art ME modeling in terms of F-measure.
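
To make the idea of "text combined with structures of ME" concrete, the following is a minimal sketch and not the method from the paper. It assumes expressions are available as LaTeX strings and uses two crude structural cues (command names and maximum brace nesting depth) as stand-ins for real ME structure. The function names `text_features`, `me_structure_features`, and `document_features` are hypothetical, as is the `w:` / `cmd:` / `depth:` prefix scheme.

```python
import re
from collections import Counter


def text_features(text):
    """Word counts from the prose; the 'w:' prefix keeps them apart from ME features."""
    return Counter("w:" + w for w in re.findall(r"[a-z]+", text.lower()))


def me_structure_features(latex):
    """Crude structural cues from one LaTeX expression (an assumption, not the paper's model)."""
    # Command names such as \sum or \frac hint at the notation a category uses.
    feats = Counter("cmd:" + c for c in re.findall(r"\\([A-Za-z]+)", latex))
    # Maximum brace nesting depth is a rough proxy for expression-tree depth.
    depth = max_depth = 0
    for ch in latex:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    feats["depth:" + str(max_depth)] += 1
    return feats


def document_features(text, expressions):
    """Merge text features with structural features of every expression in the document."""
    feats = text_features(text)
    for expr in expressions:
        feats.update(me_structure_features(expr))
    return feats


doc = document_features(
    "Let X be a random variable with finite expectation",
    [r"\mathbb{E}[X] = \sum_{x} x \, P(X = x)"],
)
```

The merged counts in `doc` could then be passed to any standard text classifier; which classifier and which ME representation the authors actually use is not stated in the abstract.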
