Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

A Comparison Study for DNA Motif Modeling on Protein Binding Microarray

Abstract

Transcription factor binding sites (TFBSs) are relatively short (5-15 bp) and degenerate, which makes identifying them a computationally challenging task. Protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build TFBS (also known as DNA motif) models that fully capture the quantitative binding affinity data. To learn DNA motif models over the non-convex objective function landscape, several optimization methods are applied to the PBM motif model building problem and compared. Specifically, representative methods from different optimization paradigms were chosen, and their modeling performance was compared on hundreds of PBM datasets. The results suggest that multimodal optimization methods are very effective at capturing the binding preference information in PBM data. We also observe a general performance improvement when di-nucleotide modeling is chosen over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications, PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating their biological applicability.
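The contrast between mono-nucleotide and di-nucleotide modeling can be illustrated with a minimal Python sketch. It assumes a simple additive scoring scheme, which the abstract does not specify; the function names (`all_kmers`, `mono_score`, `di_score`) and the toy weights are hypothetical and are not taken from the paper. The sketch only shows how the two parameterizations differ in size and in what they can express, not how the compared optimization methods fit them.

```python
from itertools import product

BASES = "ACGT"

def all_kmers(k):
    """Enumerate every DNA k-mer (4**k strings) in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=k)]

def mono_score(site, weights):
    """Additive mono-nucleotide score: one weight per (position, base)."""
    return sum(weights[i][base] for i, base in enumerate(site))

def di_score(site, weights):
    """Additive di-nucleotide score: one weight per (position, adjacent base pair)."""
    return sum(weights[i][site[i:i + 2]] for i in range(len(site) - 1))

# Toy motif of width 3 (hypothetical weights, for illustration only).
width = 3

# Mono-nucleotide model: 4 weights per position, 4 * width in total.
mono_w = [{b: 0.0 for b in BASES} for _ in range(width)]
mono_w[0]["A"] = 1.0
mono_w[1]["C"] = 1.0
mono_w[2]["G"] = 1.0

# Di-nucleotide model: 16 weights per adjacent pair, 16 * (width - 1) in total.
di_w = [{a + b: 0.0 for a in BASES for b in BASES} for _ in range(width - 1)]
di_w[0]["AC"] = 1.5
di_w[1]["CG"] = 1.5

kmers = all_kmers(width)
best_mono = max(kmers, key=lambda s: mono_score(s, mono_w))
best_di = max(kmers, key=lambda s: di_score(s, di_w))
```

In this toy setup both models rank `ACG` highest, but the di-nucleotide model carries 32 weights against 12 for the mono-nucleotide model, so it can represent dependencies between neighboring positions at the cost of a larger search space.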