Literature-based discovery: Finding implicit associations between genes and diseases.

Abstract

Swanson has argued that, owing to the explosive growth in publications and overspecialization in the sciences, certain types of knowledge that follow from known logical premises sharing common concepts remain undiscovered. Establishing linkages between such premises by applying automated or semi-automated techniques to identify evidence in the published literature is called literature-based association discovery. Although the potential of literature-based association discovery has been widely recognized, several key problems remain. First, existing approaches are largely exploratory and rely on heuristics. Second, they make use of keyword annotations or mere word (co-)occurrences in text, considering word variations and synonyms at most. Third, they under-utilize or ignore available resources such as online ontologies and thesauri. Lastly, they lack strong validity because they have not been extensively evaluated. This study attempts to address these key problems. Using gene-disease associations as a testbed, we propose a modeling approach based on a Bayesian inference network in which genes and diseases are represented as nodes and are connected via two types of intermediate nodes, namely gene functions and phenotypes. To estimate the probabilities involved in the model, two learning frameworks are compared. The first framework uses an established set of gene-disease associations. The second framework employs a larger but weaker set of gene-phenotype associations. In addition, two learning techniques are presented: a baseline scheme that uses co-annotations of keywords, and a scheme that takes advantage of online free-text information. The proposed approach is evaluated on a benchmark data set created from real-world data.
The evaluation demonstrated that (a) the proposed approach is effective for predicting specific gene-disease associations, (b) the use of free-text information consistently produces better performance than the use of keyword annotations, (c) information acquired from full-text documents can significantly improve discovery compared with information extracted only from abstracts, and (d) domain ontologies can be leveraged to enhance the discovery of associations between gene functions and phenotypes.
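
The layered structure described in the abstract (gene, gene function, phenotype, disease) can be illustrated with a small sketch. This is only one possible reading, assuming a simple weighted-sum propagation of belief from layer to layer; the abstract does not specify how the thesis combines evidence at each node. All node names, edge weights, and the function names `propagate` and `score_diseases` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def propagate(beliefs, edges):
    """Push beliefs one layer forward.

    Each child receives the sum of belief(parent) * weight(parent -> child).
    This weighted-sum rule is an assumption, not the thesis's stated method.
    """
    out = defaultdict(float)
    for parent, belief in beliefs.items():
        for child, weight in edges.get(parent, {}).items():
            out[child] += belief * weight
    return dict(out)


def score_diseases(gene, gene_to_function, function_to_phenotype,
                   phenotype_to_disease):
    """Score diseases for one gene by passing belief through both
    intermediate layers: gene functions, then phenotypes."""
    functions = propagate({gene: 1.0}, gene_to_function)
    phenotypes = propagate(functions, function_to_phenotype)
    return propagate(phenotypes, phenotype_to_disease)


# Toy, made-up weights (hypothetical; not taken from the thesis).
gene_to_function = {"GENE_A": {"func_1": 0.5, "func_2": 0.5}}
function_to_phenotype = {
    "func_1": {"pheno_x": 0.5},
    "func_2": {"pheno_x": 0.25, "pheno_y": 0.5},
}
phenotype_to_disease = {
    "pheno_x": {"disease_1": 1.0},
    "pheno_y": {"disease_2": 0.5},
}

scores = score_diseases("GENE_A", gene_to_function,
                        function_to_phenotype, phenotype_to_disease)

# Rank candidate diseases by score, highest first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In a setup like this, the edge weights would be the quantities estimated by the learning frameworks the abstract mentions, for example from keyword co-annotations or from free-text information.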

Bibliographic record

  • Author

    Seki, Kazuhiro

  • Author affiliation

    Indiana University

  • Degree-granting institution: Indiana University
  • Subject: Health Sciences, Pathology; Information Science; Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 140 p.
  • Total pages: 140
  • Original format: PDF
  • Language of text: English (eng)
  • Date added to database: 2022-08-17 11:39:57
