Journal: Bioinformatics

Non-additivity in protein-DNA binding

Abstract

Motivation: Localizing protein binding sites within genomic DNA is of considerable importance, but remains difficult for protein families, such as transcription factors, which have loosely defined target sequences. It is generally assumed that protein affinity for DNA involves additive contributions from successive nucleotide pairs within the target sequence. This is not necessarily true, and non-additive effects have already been experimentally demonstrated in a small number of cases. The principal origin of non-additivity involves the so-called indirect component of protein-DNA recognition, which is related to the sequence dependence of DNA deformation induced during complex formation. Non-additive effects are difficult to study because they require the identification of many more binding sequences than are normally necessary for describing additive specificity (typically via the construction of weight matrices).

Results: In the present work we use theoretically estimated binding energies as a basis for overcoming this problem. Our approach enables us to study the full combinatorial set of sequences for a variety of DNA-binding proteins, make a detailed analysis of non-additive effects and exploit this information to improve binding site predictions using either weight matrices or support vector machines. The results underline the fact that, even in the presence of significant deformation, non-additive effects may involve only a limited number of dinucleotide steps. This information helps to reduce the number of binding sites which need to be identified for successful predictions and to avoid problems of over-fitting.
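To make the distinction between additive and non-additive specificity concrete, the following is a minimal sketch and not the authors' method. It assumes energies are available for the full combinatorial set of sequences of a fixed length, fits the best additive (weight-matrix-like) model by per-position averaging, and reports how much pairwise interaction remains at each dinucleotide step. The function names and the toy energy model (including the hypothetical coupling on a "CG" step) are illustrative assumptions only and do not come from the paper.

```python
from itertools import product

BASES = "ACGT"


def additive_fit(energies, length):
    """Least-squares additive fit over a full combinatorial set.

    energies maps every sequence of the given length to an energy.
    Returns (mean, weights); weights[i][b] is the contribution of base b
    at position i, relative to the overall mean.
    """
    mean = sum(energies.values()) / len(energies)
    weights = []
    for i in range(length):
        column = {}
        for b in BASES:
            vals = [e for s, e in energies.items() if s[i] == b]
            column[b] = sum(vals) / len(vals) - mean
        weights.append(column)
    return mean, weights


def step_non_additivity(energies, length):
    """RMS pairwise interaction term for each adjacent step (i, i+1).

    A value near zero means the step is well described additively.
    """
    mean, weights = additive_fit(energies, length)
    result = []
    for i in range(length - 1):
        sq = 0.0
        for a, b in product(BASES, repeat=2):
            vals = [e for s, e in energies.items()
                    if s[i] == a and s[i + 1] == b]
            pair_mean = sum(vals) / len(vals)
            interaction = pair_mean - mean - weights[i][a] - weights[i + 1][b]
            sq += interaction ** 2
        result.append((sq / 16) ** 0.5)
    return result


def toy_energies(length=3, coupled_step=1, coupling=1.0):
    """Hypothetical energies: additive terms plus one coupled 'CG' step."""
    energies = {}
    for tup in product(BASES, repeat=length):
        s = "".join(tup)
        # Arbitrary additive single-base contributions (illustrative only).
        e = sum(0.1 * (i + 1) * BASES.index(b) for i, b in enumerate(s))
        # Non-additive penalty applied only when C and G occur together.
        if s[coupled_step:coupled_step + 2] == "CG":
            e += coupling
        energies[s] = e
    return energies


if __name__ == "__main__":
    for step, rms in enumerate(step_non_additivity(toy_energies(), 3)):
        print(f"step {step}-{step + 1}: RMS interaction = {rms:.4f}")
```

In this toy setting only the step carrying the coupling shows a non-zero interaction, which mirrors the abstract's point that non-additive effects may be confined to a limited number of dinucleotide steps. With real data the full combinatorial set is rarely available experimentally, which is why the abstract turns to theoretically estimated binding energies.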
