Nucleic Acids Research

Discovery and validation of information theory-based transcription factor and cofactor binding site motifs



Abstract

Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes.
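
To make the idea of an information theory-based position weight matrix more concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it does not perform the recursive, thresholded entropy minimization described above, and it assumes the binding sites are already aligned. Each weight is taken as 2 + log2(frequency) bits, which assumes a maximum uncertainty of 2 bits per position and leaves out any small-sample correction. The function names, the pseudocount and the toy sites are all hypothetical, chosen only for illustration.

```python
import math

BASES = "ACGT"


def build_ipwm(sites, pseudocount=1.0):
    """Build a simplified information weight matrix from aligned sites.

    Returns one dict per position mapping each base to a weight in bits.
    Assumption: weight = 2 + log2(frequency), with no small-sample
    correction. The pseudocount must be positive so that bases never
    observed at a position still get a finite (negative) weight.
    """
    if not sites:
        raise ValueError("at least one site is required")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + 4 * pseudocount
    ipwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            weights[base] = 2.0 + math.log2(freq)
        ipwm.append(weights)
    return ipwm


def site_information(ipwm, seq):
    """Sum the per-position weights of one candidate site, in bits."""
    if len(seq) != len(ipwm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(ipwm, seq))


# Hypothetical aligned sites, used only to exercise the functions.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
toy_ipwm = build_ipwm(toy_sites)
strong = site_information(toy_ipwm, "TGACTCA")
weak = site_information(toy_ipwm, "AAAAAAA")
```

In this sketch a sequence that matches the common bases of the toy alignment scores more bits than one that does not, which is the sense in which such a matrix can rank the strengths of individual sites.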
