Literature mining sustains and enhances knowledge discovery from omic studies

Abstract

Genomic, proteomic and other experimentally generated data from studies of biological systems aiming to discover disease biomarkers are currently analyzed without sufficient supporting evidence from the literature due to complexities associated with automated processing. Extracting prior knowledge about markers associated with biological sample types and disease states from the literature is tedious, and little research has been performed to understand how to use this knowledge to inform the generation of classification models from 'omic' data. Using pathway analysis methods to better understand the underlying biology of complex diseases such as breast and lung cancers is state-of-the-art. However, how to combine literature-mining evidence with pathway analysis evidence remains an open problem in biomedical informatics research.

This dissertation presents a novel semi-automated framework, named Knowledge Enhanced Data Analysis (KEDA), which incorporates the following components: 1) literature mining of text; 2) classification modeling; and 3) pathway analysis. This framework aids researchers in assigning literature-mining-based prior knowledge values to genes and proteins associated with disease biology. It incorporates prior knowledge into the modeling of experimental datasets, enriching the development process with current findings from the scientific community.

New knowledge is presented in the form of lists of known disease-specific biomarkers and their accompanying scores obtained through literature mining of millions of lung and breast cancer abstracts. These scores can subsequently be used as prior knowledge values in Bayesian modeling and pathway analysis. Ranked, newly discovered biomarker-disease-biofluid relationships which identify biomarker specificity across biofluids are presented. A novel method of identifying biomarker relationships is discussed that examines the attributes from the best-performing models. Pathway analysis results from the addition of prior information ultimately lead to more robust evidence for pathway involvement in diseases of interest, based on statistically significant standard measures of impact factor and p-values.

The outcome of implementing the KEDA framework is enhanced modeling and pathway analysis findings. Enhanced knowledge discovery analysis leads to new disease-specific entities and relationships that otherwise would not have been identified. Increased disease understanding, as well as identification of biomarkers for disease diagnosis, treatment, or therapy targets, should ultimately lead to validation and clinical implementation.

Bibliographic record

  • Author: Jordan, Rick Matthew
  • Author affiliation: University of Pittsburgh
  • Degree-granting institution: University of Pittsburgh
  • Subjects: Bioinformatics; Molecular biology; Oncology
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 250 p.
  • Total pages: 250
  • Original format: PDF
  • Language: English

  • Date added: 2022-08-17 11:51:28
