Conference: International Conference on Intelligent Systems for Molecular Biology

Using hidden Markov models to analyze gene expression time course data

Abstract

Motivation: Cellular processes cause changes over time. Observing and measuring those changes gives insight into the how and why of regulation. Microarray technology provides the experimental platform for the large-scale experiments needed to obtain time courses of expression levels. However, the proper way to analyze the resulting time course data is still under investigation. The inherent time dependencies in the data suggest that clustering techniques which reflect those dependencies yield improved performance.

Results: We propose to use Hidden Markov Models (HMMs) to account for the horizontal dependencies along the time axis in time course data and to cope with the prevalent errors and missing values. The HMMs are used within a model-based clustering framework. We are given a number of clusters, each represented by one Hidden Markov Model from a finite collection encompassing typical qualitative behavior. Then, in an iterative procedure, our method finds cluster models and an assignment of data points to these models that maximize the joint likelihood of clustering and models. Partially supervised learning, in which groups of labeled data are added to the initial collection of clusters, is supported. A graphical user interface allows querying an expression profile dataset for time courses similar to a prototype defined graphically as a sequence of levels and durations. We also propose a heuristic approach to automate determination of the number of clusters. We evaluate the method on published yeast cell cycle and fibroblast serum response datasets and compare it, with favorable results, to the autoregressive curves method.
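To make the assignment step of such a model-based clustering concrete, the following is a minimal sketch, not the authors' implementation. It assumes left-to-right HMMs with Gaussian emissions, scores each time course with the forward algorithm, treats a missing value (`None`) as an observation that every state emits with probability 1, and assigns each profile to the highest-likelihood model. The function names, the prototype levels, the `stay` probability, and the toy profiles are all hypothetical; re-estimating the models from their assigned profiles, which the iterative procedure also requires, is omitted.

```python
import math

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def log_gauss(x, mean, sd):
    """Log density of a normal distribution N(mean, sd^2) at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def left_to_right_hmm(levels, stay=0.7, sd=0.5):
    """Hypothetical prototype: one state per expression level, visited in order."""
    n = len(levels)
    trans = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            trans[i][i] = stay            # remain at the current level
            trans[i][i + 1] = 1.0 - stay  # advance to the next level
        else:
            trans[i][i] = 1.0             # final state is absorbing
    start = [1.0] + [0.0] * (n - 1)
    return {"start": start, "trans": trans, "emis": [(m, sd) for m in levels]}

def log_likelihood(series, hmm):
    """Forward algorithm in log space; None marks a missing measurement."""
    start, trans, emis = hmm["start"], hmm["trans"], hmm["emis"]
    n = len(start)

    def emit(j, x):
        # Assumption: a missing value contributes log(1) = 0 in every state.
        return 0.0 if x is None else log_gauss(x, *emis[j])

    alpha = [safe_log(start[j]) + emit(j, series[0]) for j in range(n)]
    for x in series[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)]) + emit(j, x)
            for j in range(n)
        ]
    return logsumexp(alpha)

def assign(profiles, models):
    """Assign each profile to the name of the model with the highest likelihood."""
    return [
        max(models, key=lambda name: log_likelihood(p, models[name]))
        for p in profiles
    ]

# Toy data: two prototypes and two short log-ratio profiles with gaps.
models = {
    "up_then_down": left_to_right_hmm([0.0, 1.5, 0.0]),
    "down_then_up": left_to_right_hmm([0.0, -1.5, 0.0]),
}
profiles = [
    [0.1, 1.4, None, 1.6, 0.2],
    [-0.1, -1.3, -1.6, None, 0.0],
]
labels = assign(profiles, models)
print(labels)
```

A full clustering loop would alternate this assignment with re-estimation of each model's parameters from its assigned profiles until the assignment stops changing.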
