Model-based clustering with genes expression dynamics for time-course gene expression data

Abstract

Microarray technologies are emerging as a promising tool for genomic studies. Microarray experiments have produced, and will continue to produce, a large body of time-course gene expression data. Such data contain important information and have proven useful in medical diagnosis, treatment, and drug design. The challenge now is how to analyze the data to extract that information. Cluster analysis has played an important role in analyzing time-course gene expression data. However, most clustering techniques do not take into account the inherent time dependence (dynamics) of time-course gene expression patterns. Accounting for these dynamics in cluster analysis should lead to higher-quality clustering. This paper presents a model-based clustering method for time-course gene expression data. The method uses Markov chain models (MCMs) to account for the inherent dynamics of time-course gene expression patterns and assumes that expression patterns in the same cluster were generated by the same MCM. For a given number of clusters, the method uses an EM algorithm to compute the cluster models and an assignment of genes to these models that maximizes their posterior probabilities. Further, the study employs the average adjusted Rand index (AARI) to evaluate clustering quality. The improved performance of the method is demonstrated by comparison with the k-means method on a publicly available dataset.
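To make the pipeline in the abstract concrete, the following is a minimal sketch of EM clustering with one Markov chain per cluster, followed by an adjusted Rand index for scoring. It is not the paper's implementation. It rests on assumptions the abstract does not state: expression levels are taken to be already discretized into a small number of states, each chain is first-order, and parameters are smoothed by a small pseudo-count. The names `fit_markov_mixture`, `adjusted_rand_index`, and `alpha` are hypothetical. Averaging the index over several random restarts is one plausible reading of "average adjusted Rand index", not a confirmed definition.

```python
from math import comb

import numpy as np


def fit_markov_mixture(seqs, n_clusters, n_states, n_iter=50, seed=0, alpha=1e-2):
    """EM for a mixture of first-order Markov chains over discrete states.

    seqs: integer array (n_genes, n_timepoints) with values in [0, n_states).
    alpha: pseudo-count used for smoothing (an assumption of this sketch).
    Returns (labels, resp, weights, init, trans).
    """
    rng = np.random.default_rng(seed)
    seqs = np.asarray(seqs)
    n_genes, n_time = seqs.shape
    genes = np.arange(n_genes)

    # Per-gene sufficient statistics: one-hot first state and transition counts.
    first = np.eye(n_states)[seqs[:, 0]]                       # (G, S)
    counts = np.zeros((n_genes, n_states, n_states))           # (G, S, S)
    for t in range(n_time - 1):
        np.add.at(counts, (genes, seqs[:, t], seqs[:, t + 1]), 1.0)

    # Random soft initialisation of the responsibilities.
    resp = rng.dirichlet(np.ones(n_clusters), size=n_genes)    # (G, K)

    for _ in range(n_iter):
        # M-step: responsibility-weighted, smoothed parameter estimates.
        weights = (resp.sum(axis=0) + alpha) / (n_genes + n_clusters * alpha)
        init = resp.T @ first + alpha                          # (K, S)
        init /= init.sum(axis=1, keepdims=True)
        trans = np.einsum("gk,gij->kij", resp, counts) + alpha  # (K, S, S)
        trans /= trans.sum(axis=2, keepdims=True)

        # E-step: posterior probability of each cluster model for each gene.
        log_post = (np.log(weights)[None, :]
                    + first @ np.log(init).T
                    + np.einsum("gij,kij->gk", counts, np.log(trans)))
        log_post -= log_post.max(axis=1, keepdims=True)        # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

    # Assign each gene to the model with the highest posterior probability.
    labels = resp.argmax(axis=1)
    return labels, resp, weights, init, trans


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    _, a = np.unique(np.asarray(labels_a), return_inverse=True)
    _, b = np.unique(np.asarray(labels_b), return_inverse=True)
    a, b = a.ravel(), b.ravel()
    table = np.zeros((a.max() + 1, b.max() + 1), dtype=int)
    np.add.at(table, (a, b), 1)                                # contingency table

    sum_cells = sum(comb(int(n), 2) for n in table.ravel())
    sum_rows = sum(comb(int(n), 2) for n in table.sum(axis=1))
    sum_cols = sum(comb(int(n), 2) for n in table.sum(axis=0))
    expected = sum_rows * sum_cols / comb(len(a), 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:                                  # degenerate labelings
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy data: "sticky" sequences versus sequences that alternate every step.
    sticky = np.repeat(np.array([[0], [1]]), 8, axis=1)
    flip = np.array([[0, 1] * 4, [1, 0] * 4])
    data = np.vstack([sticky] * 10 + [flip] * 10)
    truth = np.array(([0, 0] * 10) + ([1, 1] * 10))
    scores = [adjusted_rand_index(truth, fit_markov_mixture(data, 2, 2, seed=s)[0])
              for s in range(5)]
    print("mean ARI over restarts:", float(np.mean(scores)))
```

Because EM only finds a local optimum, the result depends on the random initialisation, which is why the demo averages the score over several seeds. Real expression data would first need a discretization step or a continuous-state chain model, neither of which is covered here.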
