A hierarchical mixture of Markov models for finding biologically active metabolic paths using gene expression and protein classes

Abstract

With the recent development of high-throughput experimental techniques, the variety and volume of accumulated biological data have increased dramatically in recent years. Mining different types of data may yield new biological insights. We present a new methodology for systematically combining three different datasets to find biologically active metabolic paths and patterns. The method consists of two steps. First, it efficiently synthesizes metabolic paths from a given set of known chemical reactions whose enzymes are co-expressed. Second, it represents the resulting paths in a more comprehensible form by estimating the parameters of a probabilistic model from these synthesized paths. The model is built on the assumption that the entire set of chemical reactions corresponds to a Markov state transition diagram. It is also a hierarchical latent variable model, with a set of protein classes as the latent variable, so that input paths are clustered according to existing knowledge of protein classes. We tested the method on a main pathway of glycolysis and found that it achieved higher predictive performance in classifying gene expressions than other unsupervised methods. We further analyzed the estimated parameters of our probabilistic models and found that the biologically active paths were clustered into only two or three patterns for each expression experiment type, and that each pattern suggested some new long-range relations in the glycolysis pathway.
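The second step described above, clustering paths by fitting a mixture of Markov models, can be illustrated with a minimal sketch. This is not the authors' model: it is a flat (non-hierarchical) mixture of first-order Markov chains fitted by EM, with no protein-class latent variable and no gene expression input. The function name `fit_markov_mixture`, the pseudo-count `alpha`, and the encoding of each path as a list of integer state indices are all assumptions made for illustration.

```python
import math
import random


def fit_markov_mixture(paths, n_states, n_clusters, n_iter=50, alpha=0.01, seed=0):
    """Fit a flat mixture of first-order Markov chains to paths by EM.

    Hypothetical helper: each path is a non-empty list of state indices
    in range(n_states). Returns (weights, init, trans, resp).
    """
    rng = random.Random(seed)
    # Start from random soft assignments of paths to clusters.
    resp = []
    for _ in paths:
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: responsibility-weighted counts, smoothed by pseudo-count alpha.
        weights = [sum(r[k] for r in resp) / len(paths) for k in range(n_clusters)]
        init = [[alpha] * n_states for _ in range(n_clusters)]
        trans = [[[alpha] * n_states for _ in range(n_states)]
                 for _ in range(n_clusters)]
        for r, path in zip(resp, paths):
            for k in range(n_clusters):
                init[k][path[0]] += r[k]
                for a, b in zip(path, path[1:]):
                    trans[k][a][b] += r[k]
        for k in range(n_clusters):
            s = sum(init[k])
            init[k] = [v / s for v in init[k]]
            for a in range(n_states):
                s = sum(trans[k][a])
                trans[k][a] = [v / s for v in trans[k][a]]

        # E-step: posterior cluster probabilities per path, in log space.
        resp = []
        for path in paths:
            logp = []
            for k in range(n_clusters):
                lp = math.log(weights[k] + 1e-300) + math.log(init[k][path[0]])
                for a, b in zip(path, path[1:]):
                    lp += math.log(trans[k][a][b])
                logp.append(lp)
            m = max(logp)
            unnorm = [math.exp(lp - m) for lp in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
    return weights, init, trans, resp
```

Calling `fit_markov_mixture(paths, n_states=4, n_clusters=2)` on a toy list such as `[[0, 1, 2, 3], [0, 2, 1, 3]]` returns the mixing weights, the per-cluster initial and transition probabilities, and the soft assignment of each path to each cluster. The pseudo-count keeps every transition probability positive, so the logarithms in the E-step are always defined.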