Source: Nucleic Acids Research (US National Institutes of Health literature collection)

Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences

Abstract

Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k − 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26–101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs.
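The idea that order k − 1 probabilities act as priors for order k can be illustrated with a small Python sketch. It is not the authors' implementation: it assumes a position-independent (homogeneous) model, a pseudocount-style interpolation, and a single hypothetical prior strength `alpha`, none of which are specified in the abstract. The function names `count_kmers` and `cond_prob` are likewise made up for illustration.

```python
from collections import Counter

ALPHABET = "ACGT"

def count_kmers(seqs, max_order):
    """Count every substring of length 1..max_order+1 in the sequences."""
    counts = Counter()
    for s in seqs:
        for length in range(1, max_order + 2):
            for i in range(len(s) - length + 1):
                counts[s[i:i + length]] += 1
    return counts

def cond_prob(counts, context, base, alpha=2.0):
    """Estimate P(base | context), using the lower-order estimate as prior.

    alpha is an assumed prior strength (hypothetical, not from the source).
    """
    if context == "":
        # Order 0: observed frequencies smoothed towards a uniform prior.
        total = sum(counts[b] for b in ALPHABET)
        return (counts[base] + alpha * 0.25) / (total + alpha)
    # Prior: the same probability with the most distant context base dropped.
    prior = cond_prob(counts, context[1:], base, alpha)
    n_context = sum(counts[context + b] for b in ALPHABET)
    # Few observations of this context -> the estimate stays close to the prior.
    return (counts[context + base] + alpha * prior) / (n_context + alpha)

if __name__ == "__main__":
    toy_seqs = ["ACGTACGT", "ACGAACGA"]  # toy data, for illustration only
    toy_counts = count_kmers(toy_seqs, max_order=2)
    for b in ALPHABET:
        print(b, cond_prob(toy_counts, "AC", b))
```

With this form, a context that never occurs in the data contributes no counts, so its estimate falls back to the lower-order probability. That is one way to read the abstract's statement that model complexity adapts to the amount of available data.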
