Journal: BMC Bioinformatics

Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

Abstract

Background: Time-course gene expression data, such as yeast cell cycle data, may be periodically expressed. Fourier series approximations of periodic gene expression, currently used to cluster such data, have been found inadequate to model the complexity of time-course data, partly because they ignore the dependence between expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with first-order autoregressive random effects for clustering time-course gene expression profiles. Simulations and real examples are given to demonstrate the usefulness of the proposed models.

Results: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models, providing more reliable and robust clustering of time-course data. Our model provides superior results when gene profiles are correlated, and gives comparable results when the correlation between gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases.

Conclusions: Our new model, under our extension of the EMMIX-WIRE procedure, is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enable improved modelling and consequent clustering of time-course data.
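To make the phrase "autoregressive random effects of the first order" concrete, the sketch below builds the generic AR(1) correlation structure, in which the correlation between two time points decays as a power of their separation. This is only an illustration of the standard AR(1) form; the abstract does not state the paper's exact parametrization, and the function names and the `rho` and `innovation_var` parameters are hypothetical, not part of EMMIX-WIRE or the authors' R package.

```python
def ar1_correlation(n_timepoints, rho):
    """Correlation matrix of a stationary AR(1) process: corr(t, s) = rho ** |t - s|."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(t - s) for s in range(n_timepoints)]
            for t in range(n_timepoints)]


def ar1_covariance(n_timepoints, rho, innovation_var):
    """Covariance matrix of a stationary AR(1) process (illustrative parametrization)."""
    # A stationary AR(1) process with innovation variance sigma^2
    # has marginal variance sigma^2 / (1 - rho^2).
    scale = innovation_var / (1.0 - rho ** 2)
    return [[scale * c for c in row] for row in ar1_correlation(n_timepoints, rho)]


# Four time points with lag-one correlation 0.5 (values chosen for illustration only).
corr = ar1_correlation(4, 0.5)
cov = ar1_covariance(3, 0.5, 3.0)
```

Under this structure, measurements taken close together in time are strongly correlated and the correlation fades with the time lag, which is the kind of dependence over time that the abstract says Fourier-based approaches ignore.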