Journal: BMC Medical Informatics and Decision Making

Disease causality extraction based on lexical semantics and document-clause frequency from biomedical literature


Abstract

Background: Research on human disease networks has advanced in recent years and now helps clarify the relationships among diseases. In most disease networks, however, the relationship between two diseases is represented simply as an association, which makes it difficult to identify which disease comes first and how it influences later diseases. In this paper, we propose a causal disease network that captures disease causality through text mining of the biomedical literature.

Methods: To identify causality between diseases, the proposed method combines two schemes. The first is the lexicon-based causality term strength, which assigns a causal strength to each of a variety of causality terms based on lexicon analysis. The second is the frequency-based causality strength, which determines the direction and strength of causality from document and clause frequencies in the literature.

Results: We applied the proposed method to 6,617,833 PubMed documents and chose 195 diseases to construct a causal disease network. From all possible pairs of disease nodes in the network, 1011 causal pairs involving 149 diseases were extracted. The resulting network was compared with that of a previous study. The proposed method outperformed the existing method in both coverage and quality: it identified 2.7 times more causalities and showed higher correlation with associated diseases.

Conclusions: The novelty of this research is twofold. The proposed method circumvents the time and cost limitations of testing all possible causalities in biological experiments, and it advances text mining by defining the concept of causality term strength.
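The abstract does not give the scoring formulas, so the following is only a minimal sketch of how the two schemes might fit together, not the authors' implementation. The lexicon weights in `TERM_STRENGTH`, the function names `score_clauses` and `causal_direction`, the clause tuple layout, and the rule of multiplying summed term strength by the number of distinct supporting documents are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical causality-term lexicon: term -> strength in [0, 1].
# These weights are illustrative assumptions, not values from the paper.
TERM_STRENGTH = {"causes": 1.0, "leads to": 0.8, "is associated with": 0.3}


def score_clauses(clauses):
    """Score directed disease pairs from (doc_id, cause, term, effect) tuples.

    Assumed combination rule: summed term strength over matching clauses,
    multiplied by the number of distinct documents containing such clauses.
    """
    clause_score = defaultdict(float)  # (cause, effect) -> summed term strength
    docs = defaultdict(set)            # (cause, effect) -> supporting doc ids
    for doc_id, cause, term, effect in clauses:
        strength = TERM_STRENGTH.get(term, 0.0)
        if strength == 0.0:
            continue  # term is not in the lexicon, so the clause is ignored
        clause_score[(cause, effect)] += strength
        docs[(cause, effect)].add(doc_id)
    return {pair: clause_score[pair] * len(docs[pair]) for pair in clause_score}


def causal_direction(scores, a, b):
    """Return (cause, effect, margin) for the stronger direction, or None on a tie."""
    forward = scores.get((a, b), 0.0)
    backward = scores.get((b, a), 0.0)
    if forward == backward:
        return None
    if forward > backward:
        return (a, b, forward - backward)
    return (b, a, backward - forward)


# Made-up toy clauses, purely for demonstration.
toy_clauses = [
    (1, "disease A", "causes", "disease B"),
    (2, "disease A", "leads to", "disease B"),
    (2, "disease B", "is associated with", "disease A"),
]
toy_scores = score_clauses(toy_clauses)
toy_result = causal_direction(toy_scores, "disease A", "disease B")
```

In this toy input, the A-to-B direction is supported by two documents with strong causal terms, while the B-to-A direction has one weak clause, so the sketch picks A as the prior disease. A real pipeline would also need disease-name recognition and clause segmentation, which are outside this sketch.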
