Journal: Language Resources and Evaluation
Writer's uncertainty identification in scientific biomedical articles: a tool for automatic if-clause tagging

Abstract

In a previous study, we manually identified seven categories of Uncertainty Markers (UMs) (verbs, non-verbs, modal verbs in the simple present, modal verbs in the conditional mood, if, uncertain questions, and epistemic future) in a corpus of 80 articles from the British Medical Journal, randomly sampled from a 167-year period (1840-2007). The UMs, detected on the basis of an epistemic stance approach, were those referring only to the authors of the articles and only in the present. We also performed preliminary experiments to assess the manually annotated corpus and to establish a baseline for the automatic detection of UMs. The results showed that most UMs could be recognized with good accuracy, except for the if-category, which includes four subcategories: if-clauses in a narrow sense; if-less clauses; as if/as though; and if and whether introducing embedded questions. The unsatisfactory results for the if-category were probably due both to its complexity and to the inadequacy of the detection rules, which were only lexical, not grammatical. In the current article, we describe a different approach, which combines grammatical and syntactic rules. The experiments show that the identification of uncertainty in the if-category has improved markedly, roughly doubling, compared with our previous results. The complex overall process of uncertainty detection can greatly profit from a hybrid approach that combines supervised machine learning techniques with a knowledge-based approach, namely a rule-based inference engine devoted to the if-clause case and designed on the basis of the above-mentioned epistemic stance approach.
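To make the four if-subcategories concrete, the following is a minimal illustrative sketch of a purely lexical tagger of the kind the abstract describes as inadequate. It is not the authors' tool: the pattern names, the list of verbs assumed to introduce embedded questions, the label strings, and the function `tag_if_category` are all hypothetical, and the grammatical and syntactic rules of the actual system are not given in this abstract.

```python
import re
from typing import Optional

# Hypothetical surface patterns, one per subcategory named in the abstract.
# They are lexical only, so they share the weakness the abstract points out.
AS_IF = re.compile(r"\bas (?:if|though)\b", re.IGNORECASE)
# Assumed (illustrative) verbs that introduce an embedded "if" question.
EMBEDDED_IF = re.compile(
    r"\b(?:know|ask|wonder|doubt|determin)\w*\s+if\b", re.IGNORECASE
)
WHETHER = re.compile(r"\bwhether\b", re.IGNORECASE)
# If-less conditional: sentence-initial inverted auxiliary up to the first comma.
IF_LESS = re.compile(r"^(?:had|should|were)\b[^,]*,", re.IGNORECASE)
PLAIN_IF = re.compile(r"\bif\b", re.IGNORECASE)


def tag_if_category(sentence: str) -> Optional[str]:
    """Return a subcategory label for the sentence, or None if no pattern fires."""
    s = sentence.strip()
    # "as if" is checked first because it also contains the token "if".
    if AS_IF.search(s):
        return "as_if"
    if EMBEDDED_IF.search(s) or WHETHER.search(s):
        return "embedded_question"
    # Exclude direct questions such as "Should we operate, or wait?".
    if IF_LESS.match(s) and not s.endswith("?"):
        return "if_less"
    if PLAIN_IF.search(s):
        return "if_clause"
    return None


examples = [
    "If the fever persists, we would suspect infection.",
    "Had we seen the patient earlier, the outcome might differ.",
    "It looks as if the disease were contagious.",
    "We do not know whether the drug is effective.",
]
for sentence in examples:
    print(tag_if_category(sentence), "|", sentence)
```

A tagger like this cannot tell whether an if-clause expresses the writer's own present uncertainty or merely reports a condition, which is presumably where grammatical and syntactic information becomes necessary.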