
Elastic restricted Boltzmann machines for cancer data analysis


Abstract

Background: Restricted Boltzmann machines (RBMs) are endowed with the universal power of modeling (binary) joint distributions. Meanwhile, as a result of their restricted network structure, training RBMs poses fewer difficulties in approximation and inference. However, little work has been done to fully exploit the capacity of these models for analyzing cancer data, e.g., cancer genomic, transcriptomic, proteomic and epigenomic data. Moreover, in cancer data analysis the number of features/predictors is usually much larger than the sample size, which is known as the "p ≫ N" problem and is also ubiquitous in other bioinformatics and computational biology fields. The "p ≫ N" problem makes the bias-variance trade-off more critical when designing statistical learning methods. To date, however, few RBM models have been designed specifically to address this issue.

Methods: We propose a novel RBM model, called the elastic restricted Boltzmann machine (eRBM), which incorporates an elastic regularization term into the likelihood function to balance model complexity and sensitivity. Building on the classic contrastive divergence (CD) algorithm, we develop the elastic contrastive divergence (eCD) algorithm, which can train eRBMs efficiently.

Results: We obtain several theoretical results on the rationality and properties of our model. We further evaluate the power of our model on a challenging task: predicting dichotomized survival time from the molecular profiling of tumors. The test results show that the prediction performance of eRBMs is far superior to that of state-of-the-art methods.

Conclusions: The proposed eRBMs are capable of dealing with "p ≫ N" problems and have superior modeling performance over traditional methods. Our novel model is a promising method for future cancer data analysis.
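The abstract does not give the eCD update rule, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes that the "elastic regularization term" is an elastic-net style penalty (an L1 plus an L2 term) on the RBM weight matrix, and it combines a plain one-step contrastive divergence (CD-1) update with the subgradient of that penalty. The function name `cd1_elastic_step` and the parameters `l1`, `l2` and `lr` are hypothetical names chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_elastic_step(W, b, c, v0, rng, lr=0.05, l1=1e-3, l2=1e-3):
    """One CD-1 update with an assumed elastic-net penalty on W.

    W  : (n_visible, n_hidden) weight matrix
    b  : (n_visible,) visible bias
    c  : (n_hidden,) hidden bias
    v0 : (n_samples, n_visible) batch of binary data
    """
    n = v0.shape[0]

    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one Gibbs step back to the visible layer and up again.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)

    # CD-1 estimate of the log-likelihood gradient with respect to W.
    grad_W = (v0.T @ ph0 - v1.T @ ph1) / n

    # Assumed penalty l1 * |W|_1 + (l2 / 2) * |W|_2^2; this is its subgradient.
    penalty = l1 * np.sign(W) + l2 * W

    W_new = W + lr * (grad_W - penalty)
    b_new = b + lr * (v0 - v1).mean(axis=0)
    c_new = c + lr * (ph0 - ph1).mean(axis=0)
    return W_new, b_new, c_new


# Toy "p >> N" setting: 20 samples, 100 binary features, 8 hidden units.
rng = np.random.default_rng(0)
X = (rng.random((20, 100)) < 0.3).astype(float)
W = 0.01 * rng.standard_normal((100, 8))
b = np.zeros(100)
c = np.zeros(8)
for _ in range(50):
    W, b, c = cd1_elastic_step(W, b, c, X, rng)
```

In this sketch the L1 part pushes individual weights toward zero while the L2 part shrinks all weights smoothly, which is the usual motivation for an elastic-net penalty when features far outnumber samples. Whether the paper penalizes the biases, or uses a different optimization scheme for the L1 term, cannot be determined from the abstract.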

Bibliographic record

  • Source
    Quantitative Biology | 2017, Issue 2 | pp. 159-172 | 14 pages
  • Author affiliations

    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China;

    Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706-1685, USA;

    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China;

    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China;

    Bioinformatics Division, TNLIST, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;

    Bioinformatics Division, TNLIST, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; Program in Computational Biology and Bioinformatics, University of Southern California, Los Angeles, CA 90089, USA;

    Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China;

  • Original format: PDF
  • Text language: English
  • Keywords

    RBMs; regularization; cancer data analysis; survival time prediction;

  • Date added to database: 2022-08-17 23:18:20
