BMC Genomics

De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets


Abstract

In eukaryotes, transcriptional regulation is usually mediated by interactions of multiple transcription factors (TFs) with their respective specific cis-regulatory elements (CREs) in the so-called cis-regulatory modules (CRMs) in DNA. Although knowledge of the CREs and CRMs in a genome is crucial for elucidating gene regulatory networks and understanding many important biological phenomena, little is known about the CREs and CRMs in most eukaryotic genomes because they are difficult to characterize by either computational or traditional experimental methods. However, the exponentially increasing volume of TF binding location data produced by the recent wide adoption of chromatin immunoprecipitation coupled with microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) has provided an unprecedented opportunity to identify CRMs and CREs in genomes. Nonetheless, effectively mining these large volumes of ChIP data to identify CREs and CRMs at nucleotide resolution remains a highly challenging task. We have developed a novel graph-theoretic algorithm, DePCRM, for genome-wide de novo prediction of CREs and CRMs using a large number of ChIP datasets. DePCRM predicts CREs and CRMs by identifying overrepresented combinatorial CRE motif patterns across multiple ChIP datasets. When applied to 168 ChIP datasets of 56 TFs from D. melanogaster, DePCRM identified 184 overrepresented CRE motifs and 746 combinatorial patterns of these motifs, and predicted a total of 115,932 CRMs in the genome. The predictions recover 77.9% of known CRMs in the datasets and 89.3% of known CRMs containing at least one predicted CRE. We found that the putative CRMs, as well as the CREs within a CRM taken as a whole, are more conserved than randomly selected sequences. Our results suggest that the CRMs predicted by DePCRM are highly likely to be functional.
Our algorithm is the first of its kind for de novo genome-wide prediction of CREs and CRMs using a large number of transcription factor ChIP datasets. The algorithm and predictions will hopefully facilitate the elucidation of gene regulatory networks in eukaryotes. All the predicted CREs, CRMs, and their target genes are available at http://bioinfo.uncc.edu/mniu/pcrms/www/.
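
The abstract does not describe how DePCRM works internally, so the following is only a toy illustration of the general idea of finding motif combinations that recur across datasets with a graph. It is not the authors' method. The function names, the `min_support` count threshold (used here in place of a real overrepresentation statistic), and the toy input are all assumptions made for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(motifs_by_dataset, min_support=2):
    """Link two motifs when they occur together in at least
    `min_support` datasets (a crude stand-in for a statistical
    overrepresentation test)."""
    pair_counts = defaultdict(int)
    for motifs in motifs_by_dataset.values():
        for a, b in combinations(sorted(set(motifs)), 2):
            pair_counts[(a, b)] += 1

    graph = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return each connected group of motifs as a sorted list."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical toy input: motif labels found in each ChIP dataset.
toy = {
    "chip_A": {"m1", "m2", "m3"},
    "chip_B": {"m1", "m2"},
    "chip_C": {"m2", "m3", "m4"},
    "chip_D": {"m4", "m5"},
}
patterns = connected_components(cooccurrence_graph(toy))
print(patterns)
```

In this toy input only the pairs m1/m2 and m2/m3 co-occur in two or more datasets, so those three motifs form a single candidate pattern. A realistic pipeline would replace the raw count threshold with a significance test against a background model.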
