Published in: Nucleic Acids Research

NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence

Abstract

NestedMICA is a new, scalable pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs that were good predictors for all five abundant classes of annotated binding sites.
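To make the phrase "probabilistic mixture model" concrete, the following is a minimal sketch of how a sequence might be scored under a generic motif-plus-background mixture. It is not NestedMICA's actual model or code: the function name, the zero-or-one-occurrence assumption, the uniform prior over motif positions, and the toy TATA-like weight matrix are all assumptions made for illustration.

```python
import math


def motif_mixture_log_likelihood(seq, pwm, background, motif_prior=0.5):
    """Log-likelihood of `seq` under a zero-or-one-occurrence mixture (illustrative).

    With probability (1 - motif_prior) the whole sequence is background;
    with probability motif_prior one motif occurrence sits at a uniformly
    chosen start position and the rest of the sequence is background.
    """
    # Log-probability of the sequence if every base came from the background.
    bg_ll = sum(math.log(background[base]) for base in seq)

    width = len(pwm)
    n_positions = len(seq) - width + 1
    if n_positions <= 0:
        # The motif cannot fit, so only the background component applies.
        return bg_ll

    # Sum of motif-vs-background likelihood ratios over all start positions.
    odds_sum = 0.0
    for start in range(n_positions):
        log_odds = sum(
            math.log(pwm[j][seq[start + j]] / background[seq[start + j]])
            for j in range(width)
        )
        odds_sum += math.exp(log_odds)

    mixture = (1.0 - motif_prior) + motif_prior * odds_sum / n_positions
    return bg_ll + math.log(mixture)


# Hypothetical inputs: uniform background and a sharp TATA-like motif.
background = {base: 0.25 for base in "ACGT"}
pwm = [
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
]

with_site = "GCGCTATAGCGC"
without_site = "GCGCGCGCGCGC"
ll_with = motif_mixture_log_likelihood(with_site, pwm, background)
ll_without = motif_mixture_log_likelihood(without_site, pwm, background)
```

In this sketch a sequence containing the motif scores higher than an equally long sequence without it. A motif finder would search over the weight-matrix entries to maximize such a score across the whole sequence set; the abstract states that NestedMICA uses Nested Sampling for that inference step, which is not shown here.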