Journal: BMC Genomics

Molecular pathway identification using biological network-regularized logistic models



Abstract

Background: Selecting genes and pathways indicative of disease is a central problem in computational biology. This problem is especially challenging when parsing multi-dimensional genomic data. A number of tools, such as L1-norm-based regularization and its extensions elastic net and fused lasso, have been introduced to deal with this challenge. However, these approaches tend to ignore the vast amount of a priori biological network information curated in the literature.

Results: We propose the use of graph Laplacian regularized logistic regression to integrate biological networks into disease classification and pathway association problems. Simulation studies demonstrate that the performance of the proposed algorithm is superior to elastic net and lasso analyses. Utility of this algorithm is also validated by its ability to reliably differentiate breast cancer subtypes using a large breast cancer dataset recently generated by the Cancer Genome Atlas (TCGA) consortium. Many of the protein-protein interaction modules identified by our approach are further supported by evidence published in the literature. Source code of the proposed algorithm is freely available at http://www.github.com/zhandong/Logit-Lapnet.

Conclusion: Logistic regression with graph Laplacian regularization is an effective algorithm for identifying key pathways and modules associated with disease subtypes. With the rapid expansion of our knowledge of biological regulatory networks, this approach will become more accurate and increasingly useful for mining transcriptomic, epigenomic, and other types of genome-wide association studies.
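To make the idea of graph Laplacian regularized logistic regression concrete, here is a minimal sketch in Python. It is not the authors' Logit-Lapnet code. The abstract does not give the exact objective, so the following are assumptions made for illustration: an unnormalized Laplacian L = D - A, an added L1 sparsity term, a plain proximal gradient solver, no intercept, and the hypothetical names `graph_laplacian`, `soft_threshold`, `fit_laplacian_logistic`, `lam_l1`, and `lam_graph`. The six-gene network and the simulated data are toy inputs, not results from the paper.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A (assumed form) for a symmetric adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_laplacian_logistic(X, y, L, lam_l1=0.01, lam_graph=0.1, n_iter=2000):
    """Proximal gradient descent on the assumed objective

        mean logistic loss + lam_graph * b^T L b + lam_l1 * ||b||_1

    with labels y in {0, 1}. The intercept is omitted for brevity.
    """
    n, p = X.shape
    # Step size from an upper bound on the Lipschitz constant of the smooth part.
    lipschitz = (np.linalg.norm(X, 2) ** 2) / (4.0 * n) \
        + 2.0 * lam_graph * np.linalg.norm(L, 2)
    step = 1.0 / lipschitz
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        # Gradient of the logistic loss plus gradient of the Laplacian penalty.
        grad = X.T @ (prob - y) / n + 2.0 * lam_graph * (L @ beta)
        # Gradient step on the smooth part, then shrinkage for the L1 term.
        beta = soft_threshold(beta - step * grad, step * lam_l1)
    return beta


# Toy network (hypothetical): genes 0-2 form a triangle, genes 3-5 form a path.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
L = graph_laplacian(A)

# Simulated data in which only the first module drives the class label.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
true_beta = np.array([1.5, 1.5, 1.5, 0.0, 0.0, 0.0])
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat = fit_laplacian_logistic(X, y, L)
print(np.round(beta_hat, 3))
```

The Laplacian penalty equals the sum of squared coefficient differences over network edges, so connected genes are pushed toward similar coefficients. Under these assumptions, that is how network structure can favor whole modules rather than isolated genes.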
