PLoS Computational Biology

Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast



Abstract

Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and by chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have each been used to predict TF binding sites, no predictive model that jointly considers CS and DS has been developed to predict either TF-specific binding or the general binding properties of TFs. Using budding yeast as a model, we found that machine learning classifiers trained with either CS or DS features alone predict TF-specific binding better than SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA "intrinsic properties" (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy), all of which can be calculated from genomic sequences alone, has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF-agnostic and likely describes the general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF-agnostic model for identifying functional regulatory regions in potentially any sequenced genome.
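The feature-group comparison described in the abstract (training classifiers on SM, CS, or DS features alone and in combination, then comparing predictive performance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, each feature group is reduced to a single toy value per region, the classifier is a plain logistic regression, and the area under the ROC curve is used as the score. The abstract does not specify the authors' classifier, features, or metric at this level of detail, so the printed numbers say nothing about yeast.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, epochs=200, lr=0.1):
    """Logistic regression fitted with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def make_synthetic(n, rng):
    """Hypothetical regions: one toy SM, CS and DS value each.

    The shift of 1.0 for bound regions is an arbitrary choice, not a
    measured effect size.
    """
    rows, labels = [], []
    for _ in range(n):
        bound = rng.random() < 0.5
        shift = 1.0 if bound else 0.0
        rows.append({g: rng.gauss(shift, 1.0) for g in ("SM", "CS", "DS")})
        labels.append(1 if bound else 0)
    return rows, labels


def evaluate(groups, train, test):
    """Train on the chosen feature groups and return the held-out AUC."""
    (train_rows, train_y), (test_rows, test_y) = train, test
    X_train = [[r[g] for g in groups] for r in train_rows]
    X_test = [[r[g] for g in groups] for r in test_rows]
    w, b = fit_logistic(X_train, train_y)
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X_test]
    return auc(scores, test_y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_synthetic(300, rng)
test = make_synthetic(300, rng)
results = {
    "+".join(groups): evaluate(groups, train, test)
    for groups in (["SM"], ["CS"], ["DS"], ["CS", "DS"], ["SM", "CS", "DS"])
}
for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

With real data, each feature group would contain many columns (for example motif scores, histone marks, or predicted structural properties), and the same loop over feature subsets would show how much each group contributes for a given TF.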
