NegAIT: A new parser for medical text simplification using morphological, sentential, and double negation

Abstract

Many different text features influence text readability and content comprehension. Negation is commonly suggested as one such feature, but few general-purpose tools exist to detect negation, and studies of the impact of negation on text readability are rare. In this paper, we introduce a new negation parser (NegAIT) for detecting morphological, sentential, and double negation. We evaluated the parser against a human-annotated gold standard containing 500 Wikipedia sentences and achieved 95%, 89%, and 67% precision with 100%, 80%, and 67% recall for morphological, sentential, and double negation, respectively. We also investigated two applications of the parser. First, we performed a corpus statistics study to show how negation usage differs between easy and difficult text. Negation usage was compared in six corpora: patient blogs (4 K sentences), Cochrane reviews (91 K sentences), PubMed abstracts (20 K sentences), clinical trial texts (48 K sentences), and English and Simple English Wikipedia articles on different medical topics (60 K and 6 K sentences). The most difficult text contained the least negation. However, when comparing negation types, difficult texts (i.e., Cochrane, PubMed, English Wikipedia, and clinical trials) contained significantly (p < 0.01) more morphological negations. Second, we conducted a predictive analytics study to show the importance of negation in distinguishing between easy and difficult text. Five binary classifiers (Naïve Bayes, SVM, decision tree, logistic regression, and linear regression) were trained using only negation information. All classifiers achieved better performance than the majority baseline. The Naïve Bayes classifier achieved the highest accuracy at 77% (9% higher than the majority baseline).
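As an illustration of how per-type precision and recall figures like those above are typically computed against a gold standard, here is a minimal sketch. It is not the NegAIT implementation: the function name `precision_recall`, the label strings, and the toy annotations are all assumptions made for this example, and the toy data is invented rather than taken from the paper.

```python
from collections import Counter

# Hypothetical label names for the three negation types in the abstract.
NEGATION_TYPES = ("morphological", "sentential", "double")


def precision_recall(gold, predicted):
    """Return {type: (precision, recall)} for sentence-level labels.

    gold and predicted are equal-length lists; each element is the set
    of negation types annotated (or detected) in one sentence.
    A score is None when its denominator is zero.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold_labels, pred_labels in zip(gold, predicted):
        for neg_type in NEGATION_TYPES:
            in_gold = neg_type in gold_labels
            in_pred = neg_type in pred_labels
            if in_gold and in_pred:
                tp[neg_type] += 1  # correctly detected
            elif in_pred:
                fp[neg_type] += 1  # detected but not in gold
            elif in_gold:
                fn[neg_type] += 1  # in gold but missed
    scores = {}
    for neg_type in NEGATION_TYPES:
        detected = tp[neg_type] + fp[neg_type]
        annotated = tp[neg_type] + fn[neg_type]
        precision = tp[neg_type] / detected if detected else None
        recall = tp[neg_type] / annotated if annotated else None
        scores[neg_type] = (precision, recall)
    return scores


# Invented toy annotations for four sentences (not data from the paper).
toy_gold = [{"morphological"}, {"sentential"}, set(), {"double", "sentential"}]
toy_pred = [{"morphological"}, set(), {"sentential"}, {"double"}]

for neg_type, (p, r) in precision_recall(toy_gold, toy_pred).items():
    print(neg_type, p, r)
```

Scoring each negation type separately mirrors how the abstract reports three precision and three recall values, one pair per type.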