AMIA Annual Symposium Proceedings

Automatic Extraction of Drug Indications from FDA Drug Labels

Abstract

Extracting computable indications, i.e., drug-disease treatment relationships, from narrative drug resources is key to building a gold-standard drug indication repository. The extraction problem has two steps: disease named-entity recognition (NER), which identifies disease mentions in a free-text description, and disease classification, which distinguishes indications from other disease mentions in the description. While many tools exist for disease NER, disease classification is mostly achieved through human annotation. For example, we recently resorted to human annotation to prepare a corpus, LabeledIn, capturing structured indications from the drug labels submitted to the FDA by pharmaceutical companies. In this study, we present an automatic end-to-end framework to extract structured and normalized indications from FDA drug labels. In addition to automatic disease NER, a key component of our framework is a machine learning method, trained on the LabeledIn corpus, that classifies the NER-computed disease mentions as “indication vs. non-indication.” In experiments with 500 drug labels, our end-to-end system achieved an F1-measure of 86.3% in drug indication extraction, a 17% improvement over the baseline. Further analysis shows that the indication classifier performs comparably to human experts and that the remaining errors are mostly due to disease NER (more than 50%). Given this performance, we conclude that our end-to-end approach has the potential to significantly reduce human annotation costs.
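
The two-step formulation in the abstract (disease NER, then indication classification, scored with an F1-measure) can be illustrated with a minimal Python sketch. This is not the authors' system: the lexicon lookup is only a stand-in for a disease NER tool, the cue-phrase rule is only a stand-in for the classifier trained on LabeledIn, and every name here (`DISEASE_LEXICON`, `INDICATION_CUES`, `find_disease_mentions`, `is_indication`, `extract_indications`, `f1_score`, the drug "DrugX") is hypothetical.

```python
import re

# Assumption: a toy lexicon standing in for a real disease NER tool.
DISEASE_LEXICON = {"hypertension", "heart failure", "angioedema"}

# Assumption: cue phrases standing in for a trained classifier's features.
INDICATION_CUES = ("indicated for", "treatment of")


def find_disease_mentions(text):
    """Step 1 (NER): return sorted (start, end, term) tuples for lexicon hits."""
    lowered = text.lower()
    mentions = []
    for term in DISEASE_LEXICON:
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            mentions.append((match.start(), match.end(), term))
    return sorted(mentions)


def is_indication(text, mention, window=30):
    """Step 2 (classification): look for a cue phrase just before the mention."""
    start = mention[0]
    context = text[max(0, start - window):start].lower()
    return any(cue in context for cue in INDICATION_CUES)


def extract_indications(text):
    """Run both steps and keep only mentions classified as indications."""
    return {m[2] for m in find_disease_mentions(text) if is_indication(text, m)}


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two sets of indications."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example label text, not taken from any real FDA label.
label = ("DrugX is indicated for the treatment of hypertension. "
         "Discontinue if angioedema occurs.")
predicted = extract_indications(label)
print(predicted, f1_score(predicted, {"hypertension"}))
```

In this toy example both disease mentions are found by the lookup, but only the one preceded by an indication cue is kept, which mirrors the distinction between indications and other disease mentions that the abstract describes.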