Machine Learning Approaches for Network-based Prediction of Disease Outcomes and Protein Functions.

Abstract

In this dissertation, we address two major problems in bioinformatics: disease outcome prediction and protein function prediction. To address them, we develop network-based machine learning frameworks.

For both frameworks, we present detailed empirical studies of our algorithms on gene and protein networks and show significant performance gains over other state-of-the-art techniques.

For predicting the disease status of a sample, the statistical and computational challenge in constructing a predictor is that thousands of genes can be used to predict the disease categories, but only a small number of samples are available. We propose a gene network module-based linear discriminant analysis approach that integrates the 'essential' correlation structure among genes into the predictor, so that the gene modules or clusters related to the diagnostic classes of interest can have a potential biological interpretation.

For predicting protein functions, we devise a relaxation labeling procedure to find the maximally likely labeling. We also address the problem of multi-label classification of protein functions by taking the relationships among Gene Ontology terms into account. Our algorithms have significantly advanced the state-of-the-art computational methods for functional characterization of proteins using the integrated function association network.
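The abstract names relaxation labeling without giving details, so the following is only a minimal sketch of the general technique on a weighted protein network, not the dissertation's actual procedure. It assumes a single function label per protein (the dissertation also treats the multi-label case), a symmetric adjacency dictionary, and a simple "neighbors with the same label support each other" compatibility rule. The function name `relaxation_labeling`, the node names, and the labels `kinase` and `transporter` are all hypothetical.

```python
def relaxation_labeling(edges, known, labels, n_iter=50):
    """Illustrative relaxation labeling on a weighted, undirected network.

    edges:  dict node -> dict neighbor -> weight (assumed symmetric)
    known:  dict node -> label for already annotated proteins
    labels: list of candidate labels
    Returns dict node -> dict label -> belief (beliefs sum to 1 per node).
    """
    nodes = sorted(edges)
    uniform = 1.0 / len(labels)

    # Initial beliefs: one-hot for annotated nodes, uniform for the rest.
    beliefs = {}
    for v in nodes:
        if v in known:
            beliefs[v] = {l: 1.0 if known[v] == l else 0.0 for l in labels}
        else:
            beliefs[v] = {l: uniform for l in labels}

    for _ in range(n_iter):
        updated = {}
        for v in nodes:
            if v in known:
                updated[v] = beliefs[v]  # annotated nodes stay fixed
                continue
            total_w = sum(edges[v].values())
            scores = {}
            for l in labels:
                # Support for label l: weighted average of neighbor beliefs in l.
                support = 0.0
                if total_w > 0:
                    support = sum(w * beliefs[u][l] for u, w in edges[v].items()) / total_w
                # Classic multiplicative update: reinforce supported labels.
                scores[l] = beliefs[v][l] * (1.0 + support)
            z = sum(scores.values())
            updated[v] = {l: s / z for l, s in scores.items()}
        beliefs = updated  # synchronous update of all nodes
    return beliefs


# Toy chain A - B - C - D with two annotated end points.
toy_edges = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
toy_beliefs = relaxation_labeling(
    toy_edges, {"A": "kinase", "D": "transporter"}, ["kinase", "transporter"]
)
for node in ("B", "C"):
    print(node, max(toy_beliefs[node], key=toy_beliefs[node].get))
```

In this toy chain, each unannotated protein drifts toward the label of its nearer annotated neighbor. A real application would replace the same-label rule with compatibilities estimated from data and would need to handle several labels per protein.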
Bibliographic record

  • Author

    Hu, Pingzhao

  • Author affiliation

    York University (Canada)

  • Degree-granting institution: York University (Canada)
  • Subjects: Computer Science; Biology Biostatistics; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2012
  • Pagination: 140 p.
  • Total pages: 140
  • Original format: PDF
  • Language of text: English
