BMC Bioinformatics

HMM-ModE – Improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences

Abstract

Background: Profile hidden Markov models (HMMs) are statistical representations of protein families derived from patterns of sequence conservation in multiple alignments, and they have been used with considerable success to identify remote homologues. These conservation patterns arise from fold-specific signals, which are shared across multiple families, and function-specific signals, which are unique to each family. The availability of sequences pre-classified according to their function permits the use of negative training sequences to improve the specificity of the HMM, both by optimising the threshold cutoff and by modifying emission probabilities to minimise the influence of fold-specific signals. A protocol to generate family-specific HMMs is described. It first constructs a profile HMM from an alignment of the family's sequences and then uses this model to identify sequences belonging to other classes that score above the default threshold (false positives). Ten-fold cross-validation is used to optimise the discrimination threshold score for the model. The advent of fast multiple alignment methods enables the use of the profile alignments to align the true and false positive sequences, and the resulting alignments are used to modify the emission probabilities in the original model.
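The threshold-optimisation step can be illustrated with a minimal Python sketch. It assumes that held-out scores for family members (positives) and for sequences from other classes (negatives) are already available, for example from the cross-validation folds, and it picks the cutoff that maximises the Matthews correlation coefficient (MCC). The abstract does not say which criterion is optimised, so the use of MCC, the function names `mcc` and `best_threshold`, and the example scores are all illustrative assumptions, not the authors' implementation.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal is empty (MCC undefined).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_threshold(pos_scores, neg_scores):
    """Return (threshold, mcc) for the cutoff that maximises MCC.

    A sequence is called a family member when its score >= threshold.
    Candidate cutoffs are the observed scores themselves (an assumption;
    a finer grid could be used instead). Ties keep the lowest cutoff.
    """
    candidates = sorted(set(pos_scores) | set(neg_scores))
    if not candidates:
        raise ValueError("at least one score is required")
    best_t, best_m = candidates[0], -1.0
    for t in candidates:
        tp = sum(s >= t for s in pos_scores)
        fn = len(pos_scores) - tp
        fp = sum(s >= t for s in neg_scores)
        tn = len(neg_scores) - fp
        m = mcc(tp, fp, tn, fn)
        if m > best_m:
            best_t, best_m = t, m
    return best_t, best_m


if __name__ == "__main__":
    # Hypothetical held-out scores, for illustration only.
    positives = [120.5, 98.0, 87.3, 150.2]
    negatives = [40.1, 65.0, 80.0, 30.2]
    print(best_threshold(positives, negatives))
```

In a cross-validation setting, one plausible use is to call `best_threshold` on the held-out scores of each fold and combine the per-fold cutoffs, though how the folds are combined is likewise an assumption here.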
