Journal: Journal of Healthcare Engineering

Distant Supervision with Transductive Learning for Adverse Drug Reaction Identification from Electronic Medical Records

Abstract

Extracting information and discovering knowledge about adverse drug reactions (ADRs) from large-scale clinical texts is useful and much needed. The task faces two major difficulties: a lack of domain experts to label examples, and the intractability of processing unstructured clinical texts. Most previous work has addressed the former with semisupervised learning and the latter with a word-based approach, but these methods still require a complex acquisition of initial labeled data and ignore the sequential structure of natural language. In this study, we propose automatic data labeling by distant supervision, in which knowledge bases are used to assign an entity-level relation label to each drug-event pair in the texts; we then use patterns to characterize the ADR relation. Model parameters are estimated by multiple-instance learning with the expectation-maximization method. The method applies transductive learning to iteratively reassign the probability of each unknown drug-event pair at training time. In experiments with 50,998 discharge summaries, we evaluate our method while varying a large number of parameters, namely pattern types, pattern-weighting models, and the initial and iterative weightings of relations for unlabeled data. In these evaluations, the proposed method outperforms word-based features for NB-EM (iEM), MILR, and TSVM, improving the F1 score by 11.3%, 9.3%, and 6.5%, respectively.
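
The pipeline described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it swaps in a plain naive Bayes model trained with EM over bags of patterns, and it assumes the knowledge base lists both known ADR pairs and known non-ADR pairs. The function names, the pattern strings, and the toy drug-event pairs are all hypothetical.

```python
import math
from collections import defaultdict


def distant_labels(pairs, kb_positive, kb_negative):
    """Distant supervision: 1.0 if the pair is a known ADR in the knowledge
    base, 0.0 if it is a known non-ADR (assumed available), None if unknown."""
    return {p: 1.0 if p in kb_positive else 0.0 if p in kb_negative else None
            for p in pairs}


def em_naive_bayes(bags, labels, init_weight=0.5, iterations=10, alpha=1.0):
    """bags: {pair: [pattern, ...]}, one bag of patterns per drug-event pair.
    Returns P(ADR | pair); only unknown pairs are re-estimated (transductive)."""
    vocab = {pat for pats in bags.values() for pat in pats}
    # Initial weighting: KB labels are fixed, unknown pairs start at init_weight.
    post = {p: labels[p] if labels[p] is not None else init_weight for p in bags}

    for _ in range(iterations):
        # M-step: soft pattern counts per class from the current posteriors.
        counts = {1: defaultdict(float), 0: defaultdict(float)}
        totals = {1: 0.0, 0: 0.0}
        for p, pats in bags.items():
            for pat in pats:
                counts[1][pat] += post[p]
                counts[0][pat] += 1.0 - post[p]
                totals[1] += post[p]
                totals[0] += 1.0 - post[p]
        prior = sum(post.values()) / len(post)
        prior = min(max(prior, 1e-6), 1.0 - 1e-6)  # keep the logs finite

        def log_lik(pat, c):
            # Laplace-smoothed log P(pattern | class c).
            return math.log((counts[c][pat] + alpha)
                            / (totals[c] + alpha * len(vocab)))

        # E-step: reassign the probability of each unknown pair.
        for p, pats in bags.items():
            if labels[p] is not None:
                continue
            lp1 = math.log(prior) + sum(log_lik(pat, 1) for pat in pats)
            lp0 = math.log(1.0 - prior) + sum(log_lik(pat, 0) for pat in pats)
            m = max(lp1, lp0)
            e1, e0 = math.exp(lp1 - m), math.exp(lp0 - m)
            post[p] = e1 / (e1 + e0)
    return post


# Toy data (hypothetical): patterns observed for each drug-event pair.
bags = {
    ("warfarin", "bleeding"): ["DRUG caused EVENT", "EVENT secondary to DRUG"],
    ("lisinopril", "cough"): ["DRUG caused EVENT"],
    ("insulin", "diabetes"): ["DRUG given for EVENT"],
    ("aspirin", "rash"): ["EVENT secondary to DRUG", "DRUG caused EVENT"],
    ("metformin", "diabetes"): ["DRUG given for EVENT"],
}
kb_positive = {("warfarin", "bleeding"), ("lisinopril", "cough")}
kb_negative = {("insulin", "diabetes")}

labels = distant_labels(bags, kb_positive, kb_negative)
posterior = em_naive_bayes(bags, labels)
for pair, prob in posterior.items():
    print(pair, round(prob, 3))
```

In this toy run the unknown pair that shares patterns with known ADR pairs ends up with a higher probability than the unknown pair that shares its pattern with the known non-ADR pair. The `init_weight` argument plays the role of the initial weighting for unlabeled data mentioned in the abstract, and each E-step is one iterative reweighting.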