Conference: 1st Workshop on Argumentation Mining, 2014

An automated method to build a corpus of rhetorically-classified sentences in biomedical texts


Abstract

The rhetorical classification of sentences in biomedical texts is an important task in recognizing the components of a scientific argument. Training supervised machine-learned models for this recognition requires corpora annotated with the rhetorical categories Introduction (or Background), Method, Result, and Discussion (or Conclusion). Currently, only a few small annotated corpora exist. We exploit a straightforward feature of co-referring text, the word "this", to build a self-annotating corpus extracted from a large biomedical research paper dataset. The corpus is annotated for all of the rhetorical categories except Introduction, without involving domain experts. In 10-fold cross-validation, we report an overall F-score of 97% with Naive Bayes and 98.7% with SVM, far above those previously reported.
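The self-annotation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual cue list or say whether the label goes to the sentence containing "this" or to the sentence it refers back to, so everything below is an assumption for illustration only: the cue nouns in `CUE_NOUNS`, the helper names `label_sentence` and `build_corpus`, and the choice to label the sentence that contains the cue.

```python
import re

# Hypothetical cue nouns per category. The abstract does not list the
# actual cues, so these are illustrative assumptions only.
CUE_NOUNS = {
    "Method": ("method", "procedure", "technique"),
    "Result": ("result", "finding", "observation"),
    "Discussion": ("conclusion", "interpretation"),
}

# Reverse lookup: cue noun -> rhetorical category.
NOUN_TO_LABEL = {
    noun: label for label, nouns in CUE_NOUNS.items() for noun in nouns
}

# Matches "this <cue noun>" as whole words, ignoring case.
CUE_PATTERN = re.compile(
    r"\bthis\s+(" + "|".join(NOUN_TO_LABEL) + r")\b", re.IGNORECASE
)


def label_sentence(sentence):
    """Return a category if the sentence contains 'this <cue noun>', else None."""
    match = CUE_PATTERN.search(sentence)
    if match is None:
        return None
    return NOUN_TO_LABEL[match.group(1).lower()]


def build_corpus(sentences):
    """Keep only sentences that receive a label, paired with that label."""
    corpus = []
    for sentence in sentences:
        label = label_sentence(sentence)
        if label is not None:
            corpus.append((sentence, label))
    return corpus
```

Sentences without a cue are simply dropped, which is why a corpus built this way needs no manual annotation; it also suggests why a category such as Introduction, which may lack an obvious "this" cue, could end up uncovered.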

Bibliographic details

  • Conference location: Baltimore, MD (US)
  • Author affiliations:

    Department of Computer Science, The University of Western Ontario;

    Department of Computer Science, The University of Western Ontario

  • Original format: PDF
  • Language: English
  • Date indexed: 2022-08-26 14:23:26
