Journal: Bioinformatics

Module-based prediction approach for robust inter-study predictions in microarray data


Abstract

Motivation: Traditional genomic prediction models based on individual genes suffer from low reproducibility across microarray studies because they lack robustness to expression measurement noise and to genes that go missing when studies are matched across platforms. It is common that some of the genes in a prediction model established in a training study cannot be matched to a test study because a different platform was used. The failure of inter-study predictions has severely hindered the clinical application of microarrays. To overcome the drawbacks of traditional gene-based prediction (GBP) models, we propose a module-based prediction (MBP) strategy via unsupervised gene clustering.

Results: K-means clustering is used to group genes sharing similar expression profiles into gene modules, and small modules are merged into their nearest neighbors. A conventional univariate or multivariate feature selection procedure is then applied, and a representative gene from each selected module is identified to construct the final prediction model. As a result, the prediction model is portable to any test study as long as at least some of the genes in each module are present in the test study. We demonstrate that K-means cluster sizes generally follow a multinomial distribution and that the failure probability of inter-study prediction due to missing genes is diminished by merging small clusters into their nearest neighbors. Through simulation and applications to real datasets in inter-study predictions, we show that the proposed MBP provides slightly improved accuracy while being considerably more robust than traditional GBP.
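The module-building steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `min_size` threshold, the choice of the gene closest to the module centroid as the representative, and the assumption that genes go missing independently with a common probability are all hypothetical and introduced here for illustration only. Feature selection and the final classifier are omitted.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(profiles):
    """Coordinate-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]

def kmeans(profiles, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means over genes; returns modules as lists of gene indices."""
    rng = random.Random(seed)
    centers = [list(profiles[i]) for i in rng.sample(range(len(profiles)), k)]
    for _ in range(n_iter):
        modules = [[] for _ in range(k)]
        for g, p in enumerate(profiles):
            j = min(range(k), key=lambda c: dist2(p, centers[c]))
            modules[j].append(g)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [centroid([profiles[g] for g in m]) if m else centers[j]
                       for j, m in enumerate(modules)]
        if new_centers == centers:
            break
        centers = new_centers
    return [m for m in modules if m]

def merge_small_modules(profiles, modules, min_size):
    """Merge each module smaller than min_size into the module with the nearest centroid."""
    modules = [list(m) for m in modules]
    while len(modules) > 1:
        small = min(modules, key=len)
        if len(small) >= min_size:
            break
        modules.remove(small)
        c = centroid([profiles[g] for g in small])
        nearest = min(modules,
                      key=lambda m: dist2(c, centroid([profiles[g] for g in m])))
        nearest.extend(small)
    return modules

def representative(profiles, module, available=None):
    """Gene closest to the module centroid (assumed rule), restricted to the
    genes available in the test study; None if the whole module is missing."""
    c = centroid([profiles[g] for g in module])
    candidates = [g for g in module if available is None or g in available]
    if not candidates:
        return None
    return min(candidates, key=lambda g: dist2(profiles[g], c))

def module_loss_probability(size, p_missing):
    """Chance that every gene of a module is missing in the test study,
    ASSUMING genes go missing independently with probability p_missing."""
    return p_missing ** size

# Toy data: 30 genes (rows) measured on 4 samples, drawn from 3 noisy patterns.
rng = random.Random(1)
patterns = [[0, 0, 5, 5], [5, 5, 0, 0], [0, 5, 0, 5]]
profiles = [[v + rng.gauss(0, 0.3) for v in patterns[i % 3]] for i in range(30)]
modules = merge_small_modules(profiles, kmeans(profiles, k=6), min_size=4)
for m in modules:
    print(len(m), representative(profiles, m), module_loss_probability(len(m), 0.2))
```

Under the independence assumption, a single-gene predictor is lost with probability `p_missing`, whereas a module of size `m` is lost only with probability `p_missing ** m`, which is why merging small modules into larger neighbors lowers the chance that a model cannot be applied to a test study.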
