Modeling Dependencies in Protein-DNA Binding Sites

Abstract

The availability of whole genome sequences and high-throughput genomic assays opens the door to in silico analysis of transcription regulation. This includes methods for discovering and characterizing the binding sites of DNA-binding proteins, such as transcription factors. A common representation of transcription factor binding sites is a position-specific score matrix (PSSM). This representation makes the strong assumption that binding site positions are independent of each other. In this work, we explore Bayesian network representations of binding sites that provide different tradeoffs between complexity (number of parameters) and the richness of dependencies between positions. We develop the formal machinery for learning such models from data and for estimating the statistical significance of putative binding sites. We then evaluate the ramifications of these richer representations in characterizing binding site motifs and predicting their genomic locations. We show that these richer representations improve over the PSSM model in both tasks.
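To make the independence assumption concrete, the following minimal sketch contrasts a PSSM with one very simple dependency model: a first-order chain in which each position depends only on its predecessor. This is an illustration under stated assumptions, not the authors' method: the abstract does not specify which Bayesian network structures are used, and the function names (`fit_pssm`, `fit_chain`, and so on), the pseudocount, the uniform background, and the toy sites are all hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def fit_pssm(sites, pseudocount=1.0):
    """Estimate independent per-position nucleotide probabilities."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pssm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pssm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pssm


def pssm_log_odds(pssm, seq, background=0.25):
    """Log-odds score: a sum of independent per-position terms."""
    return sum(math.log2(pssm[i][b] / background) for i, b in enumerate(seq))


def fit_chain(sites, pseudocount=1.0):
    """Estimate P(x_1) and P(x_i | x_{i-1}) for i > 1 (assumed structure)."""
    first = fit_pssm(sites, pseudocount)[0]
    cond = []
    for i in range(1, len(sites[0])):
        pair_counts = Counter((s[i - 1], s[i]) for s in sites)
        prev_counts = Counter(s[i - 1] for s in sites)
        table = {}
        for a in ALPHABET:
            denom = prev_counts[a] + pseudocount * len(ALPHABET)
            for b in ALPHABET:
                table[(a, b)] = (pair_counts[(a, b)] + pseudocount) / denom
        cond.append(table)
    return first, cond


def chain_log_odds(model, seq, background=0.25):
    """Log-odds score where each term is conditioned on the previous base."""
    first, cond = model
    score = math.log2(first[seq[0]] / background)
    for i in range(1, len(seq)):
        score += math.log2(cond[i - 1][(seq[i - 1], seq[i])] / background)
    return score


# Toy aligned sites (made up): positions 1 and 2 always agree.
sites = ["AAT", "AAT", "CCT", "CCT"]
pssm = fit_pssm(sites)
chain = fit_chain(sites)
```

On these toy sites the PSSM cannot tell `AAT` from the never-observed `ACT`, because each position is scored on its own, whereas the chain model scores `AAT` higher. The cost is more parameters: a conditional table per position instead of a single column, which is the complexity tradeoff the abstract refers to.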
