PLoS Computational Biology

A Graphical Modelling Approach to the Dissection of Highly Correlated Transcription Factor Binding Site Profiles


Abstract

Inferring the combinatorial regulatory code of transcription factors (TFs) from genome-wide TF binding profiles is challenging. A major reason is that TF binding profiles significantly overlap and are therefore highly correlated. Clustered occurrence of multiple TFs at genomic sites may arise from chromatin accessibility and local cooperation between TFs, or binding sites may simply appear clustered if the profiles are generated from diverse cell populations. Overlaps in TF binding profiles may also result from measurements taken at closely related time intervals. It is thus of great interest to distinguish TFs that directly regulate gene expression from those that are indirectly associated with gene expression. Graphical models, in particular Bayesian networks, provide a powerful mathematical framework to infer different types of dependencies. However, existing methods do not perform well when the features (here: TF binding profiles) are highly correlated, when their association with the biological outcome is weak, and when the sample size is small. Here, we develop a novel computational method, the Neighbourhood Consistent PC (NCPC) algorithms, which deal with these scenarios much more effectively than existing methods do. We further present a novel graphical representation, the Direct Dependence Graph (DDGraph), to better display the complex interactions among variables. NCPC and DDGraph can also be applied to other problems involving highly correlated biological features. Both methods are implemented in the R package ddgraph, available as part of Bioconductor (http://bioconductor.org/packages/2.11/bioc/html/ddgraph.html). Applied to real data, our method identified TFs that specify different classes of cis-regulatory modules (CRMs) in Drosophila mesoderm differentiation. 
Our analysis also found depletion of the early transcription factor Twist binding at the CRMs regulating expression in visceral and somatic muscle cells at later stages, which suggests a CRM-specific repression mechanism that so far has not been characterised for this class of mesodermal CRMs.
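
The distinction the abstract draws between direct and indirect association can be illustrated with a small conditional-independence check. The sketch below is not the NCPC algorithm and does not use the ddgraph API. It is a toy example with invented counts for two hypothetical binary TF binding profiles, `A` and `B`, and a binary outcome `Y`. The counts are constructed so that `Y` depends only on `A`, while `B` is correlated with `A`. The helper name `mutual_information` and all numbers are assumptions made for illustration.

```python
from collections import Counter
from math import log

# Column indices in each (a, b, y) row; the names are illustrative only.
A, B, Y = 0, 1, 2

# Invented counts (not from the paper): Y depends on A alone, B co-occurs with A.
counts = Counter({
    (1, 1, 1): 120, (1, 1, 0): 40, (1, 0, 1): 30, (1, 0, 0): 10,
    (0, 1, 1): 10,  (0, 1, 0): 30, (0, 0, 1): 40, (0, 0, 0): 120,
})

def mutual_information(counts, x, y, given=()):
    """Empirical (conditional) mutual information I(X; Y | given), in nats."""
    total = sum(counts.values())
    joint, xz, yz, z = Counter(), Counter(), Counter(), Counter()
    for row, n in counts.items():
        zv = tuple(row[i] for i in given)
        joint[(row[x], row[y], zv)] += n
        xz[(row[x], zv)] += n
        yz[(row[y], zv)] += n
        z[zv] += n
    mi = 0.0
    for (xv, yv, zv), n in joint.items():
        # p(x,y,z) * log[ p(x,y|z) / (p(x|z) p(y|z)) ], written in raw counts
        mi += (n / total) * log(n * z[zv] / (xz[(xv, zv)] * yz[(yv, zv)]))
    return mi

marginal_b = mutual_information(counts, B, Y)           # B looks associated with Y
cond_b = mutual_information(counts, B, Y, given=(A,))   # association vanishes given A
cond_a = mutual_information(counts, A, Y, given=(B,))   # A stays associated given B

print(f"I(B;Y)   = {marginal_b:.4f}")
print(f"I(B;Y|A) = {cond_b:.4f}")
print(f"I(A;Y|B) = {cond_a:.4f}")
```

In this toy table `B` is marginally associated with `Y` but becomes independent of it once `A` is conditioned on, so only `A` would be called a direct dependence. With real binding data the same comparison needs a statistical test rather than an exact zero. Such tests lose power when sample sizes are small and profiles are highly correlated, which is the regime the abstract says NCPC targets.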
