Journal: Nucleic Acids Research

Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models


Abstract

We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating the practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments, and can utilize more complex data configurations in addition to binary contrasts.
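
To make the objective named in the abstract concrete, the following is a minimal sketch of the mutual information between condition (positive or negative set) and motif occurrence. It is not Discrover's implementation: the abstract describes hidden Markov models, whereas this sketch assumes a fixed motif matched as an exact substring. The function names `occurrence_table` and `mutual_information` and the example sequences are hypothetical, chosen only for illustration.

```python
import math


def occurrence_table(motif, positives, negatives):
    """Build a 2x2 count table.

    Rows are the conditions (positive set, negative set); columns are
    (sequences containing the motif, sequences lacking it).
    Assumption: a motif "occurs" if it is an exact substring.
    """
    table = []
    for seqs in (positives, negatives):
        present = sum(1 for s in seqs if motif in s)
        table.append([present, len(seqs) - present])
    return table


def mutual_information(table):
    """Mutual information, in bits, between the row and column variables."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing to the sum
                mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


# Made-up toy data: the motif occurs in every positive and in no negative.
positives = ["ACGTGCATGA", "TTGCATGCC"]
negatives = ["ACGTACGTAC", "TTTTCCCCGG"]
table = occurrence_table("GCATG", positives, negatives)
score = mutual_information(table)
```

Under these assumptions, a motif whose occurrence frequency is the same in both sets scores zero, and the score grows as occurrence becomes more predictive of the condition, which is the sense in which the abstract speaks of motifs "whose occurrence frequency varies between the sets".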
