Journal: Journal of Bioinformatics and Computational Biology

Extending hidden Markov models to allow conditioning on previous observations


Abstract

Hidden Markov Models (HMMs) are probabilistic models widely used in computational molecular biology. However, the independence assumption of the standard HMM, under which the observed symbol depends only on the current state, may not be sufficient for some biological problems. To overcome the limitations of the first-order HMM, a number of extensions have been proposed in the literature that incorporate past information by conditioning on the hidden states, on the observations, or on both. Here, we implement a simple extension of the standard HMM in which the current observed symbol (amino acid residue) depends both on the current state and on a series of previously observed symbols. The major advantage of the method is its simplicity of implementation, which is achieved by transforming the observation sequence using an extended alphabet. Thus, it can utilize all the available algorithms for the training and decoding of HMMs. We investigated several encoding schemes and performed tests on a number of important biological problems previously studied by our team (prediction of transmembrane proteins and prediction of signal peptides). The evaluation shows that, when enough data are available, performance increased by 1.8%-8.2%, and existing prediction methods may be improved using this approach. The methods for which the improvement was significant (PRED-TMBB2, PRED-TAT and HMM-TM) are available as web servers freely accessible to academic users at www.compgen.org/tools/.
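The extended-alphabet idea can be illustrated with a minimal sketch. The abstract does not specify the encoding schemes that were evaluated, so the scheme below is only an assumed, simplest-case illustration: each residue is replaced by a composite symbol made of the k preceding residues plus the current one. The function names and the padding symbol `-` are hypothetical. Once the sequence is rewritten this way, an unmodified HMM over the composite symbols emits each residue conditioned on both the current state and the previous k observations.

```python
# Illustrative sketch only: the paper's actual encoding schemes are not
# described in the abstract. Names and the padding symbol are assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical symbol standing in for "no previous residue"


def encode_with_context(seq, k=1, pad=PAD):
    """Rewrite seq over an extended alphabet.

    Position i is mapped to the string formed by the k residues that
    precede it (padded at the start of the sequence) followed by the
    residue at position i itself.
    """
    padded = pad * k + seq
    return [padded[i:i + k + 1] for i in range(len(seq))]


def extended_alphabet_size(k, base=len(AMINO_ACIDS)):
    """Number of composite symbols, ignoring padded start symbols."""
    return base ** (k + 1)


if __name__ == "__main__":
    print(encode_with_context("MKTAY", k=1))  # ['-M', 'MK', 'KT', 'TA', 'AY']
    print(extended_alphabet_size(1))          # 400
    print(extended_alphabet_size(2))          # 8000
```

The alphabet grows as 20^(k+1), so the number of emission parameters per state rises quickly with k. This is consistent with the abstract's caveat that the gains appear when enough data are available.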
