Journal of Biomedical Informatics

A knowledge-driven conditional approach to extract pharmacogenomics specific drug-gene relationships from free text


Abstract

An important task in pharmacogenomics (PGx) studies is to identify genetic variants that may impact drug response. The success of many systematic and integrative computational approaches for PGx studies depends on the availability of accurate, comprehensive and machine-understandable drug-gene relationship knowledge bases. Scientific literature is one of the most comprehensive knowledge sources for PGx-specific drug-gene relationships. However, the major barrier in accessing this information is that the knowledge is buried in a large amount of free text with limited machine understandability. Therefore, there is a need to develop automatic approaches to extract structured PGx-specific drug-gene relationships from unstructured free-text literature. In this study, we have developed a conditional relationship extraction approach to extract PGx-specific drug-gene pairs from 20 million MEDLINE abstracts using known drug-gene pairs as prior knowledge. We have demonstrated that the conditional drug-gene relationship extraction approach significantly improves the precision and F1 measure compared to the unconditioned approach (precision: 0.345 vs. 0.11; recall: 0.481 vs. 1.00; F1: 0.402 vs. 0.201). In this study, a method based on co-occurrence is used as the underlying relationship extraction method for its simplicity. It can be replaced by or combined with more advanced methods such as machine learning or natural language processing approaches to further improve the performance of drug-gene relationship extraction from free text. Our method is not limited to extracting drug-gene relationships; it can be generalized to extract other types of relationships when related background knowledge bases exist.
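
The abstract does not spell out how the conditioning on known pairs works, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that an abstract is kept only if it mentions at least one known drug-gene pair, and that all co-occurring drug-gene pairs are then collected from the retained abstracts. The lexicons, the toy abstracts, the whole-word matching, and the names `mentions`, `cooccurring_pairs`, `extract_pairs` and `f1_score` are all hypothetical and chosen for illustration. `f1_score` is the standard harmonic mean of precision and recall.

```python
import re
from itertools import product


def mentions(text, terms):
    """Return the lexicon terms that occur as whole words in text (case-insensitive)."""
    lowered = text.lower()
    return {
        t for t in terms
        if re.search(r"\b" + re.escape(t.lower()) + r"\b", lowered)
    }


def cooccurring_pairs(text, drugs, genes):
    """All (drug, gene) pairs whose names both appear in the same text."""
    return set(product(mentions(text, drugs), mentions(text, genes)))


def extract_pairs(abstracts, drugs, genes, known_pairs=None):
    """Co-occurrence extraction over a collection of abstracts.

    With known_pairs=None every abstract contributes (unconditioned).
    Otherwise an abstract contributes only if it contains at least one
    known pair (an assumed form of conditioning on prior knowledge).
    """
    extracted = set()
    for text in abstracts:
        pairs = cooccurring_pairs(text, drugs, genes)
        if known_pairs is not None and not (pairs & known_pairs):
            continue  # no known pair here, so skip this abstract
        extracted |= pairs
    return extracted


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
drugs = {"warfarin", "clopidogrel", "aspirin"}
genes = {"CYP2C9", "VKORC1", "CYP2C19", "TP53"}
known = {("warfarin", "CYP2C9")}
abstracts = [
    "Warfarin dose requirements vary with CYP2C9 and VKORC1 genotype.",
    "Aspirin was administered to TP53 knockout mice.",
    "Clopidogrel response depends on CYP2C19.",
]

unconditioned = extract_pairs(abstracts, drugs, genes)
conditioned = extract_pairs(abstracts, drugs, genes, known_pairs=known)
```

In this toy run the unconditioned pass returns every co-occurring pair, while the conditioned pass keeps only pairs from the abstract that contains the known warfarin-CYP2C9 pair. That mirrors the trade-off reported above, where precision rises and recall falls. Applying `f1_score` to the reported conditional precision and recall (0.345 and 0.481) gives approximately 0.402, which agrees with the F1 value in the abstract.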
