Journal: Journal of computational biology: A journal of computational molecular cell biology

Integration of 198 ChIP-seq Datasets Reveals Human cis-Regulatory Regions


Abstract

We analyzed 198 datasets of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and developed a methodology for identification of high-confidence enhancer and promoter regions from transcription factor ChIP-seq data alone. We identify 32,467 genomic regions marked with ChIP-seq binding peaks in 15 or more experiments as high-confidence cis-regulatory regions. Although the selected regions mark only ~0.67% of the genome, 70.5% of our predicted binding regions fall within independently identified, strongly expression-correlated and histone-marked enhancer regions, which cover ~8% of the genome (Ernst et al., Nature 2011, 473, 43-49). Even more remarkably, 85.6% of our selected regions overlap transcription factor (TF) binding regions identified in evolutionarily conserved DNase1 hypersensitivity cluster regions, which cover 0.75% of the genome (Boyle et al., Genome Research 2011, 21, 456-464). P-values for these overlaps are effectively zero (Z-scores of 328 and 715, respectively). Furthermore, 62% of our selected regions overlap the intersection of the evolutionarily conserved DNase1 hypersensitivity-identified TF-binding regions of Boyle et al. (2011) with the histone-marked enhancers found to be strongly associated with transcriptional activity by Ernst et al. (2011). Two hundred thirty of our candidate cis-regulatory regions overlap cancer-associated variants reported in the Catalogue of Somatic Mutations in Cancer (http://www.sanger.ac.uk/genetics/CGP/cosmic/). We also identify 1,252 potential proximal promoters for the 7,561 disjoint lincRNA regions currently in the Human lincRNA Catalog (www.broadinstitute.org/genome_bio/human_lincrnas/). Our investigation used approximately half of all currently available ENCODE ChIP-seq datasets, suggesting further gains are likely from analysis of all datasets currently available.
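The selection rule stated in the abstract (regions marked by binding peaks in 15 or more experiments) can be illustrated with a small sweep-line sketch. This is not the authors' pipeline, whose details are not given here. It assumes peaks are supplied as half-open `(chrom, start, end)` tuples, one list per experiment, and the names `merge_intervals`, `high_confidence_regions`, and `min_experiments` are hypothetical.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def high_confidence_regions(peaks_by_experiment, min_experiments=15):
    """Return {chrom: [(start, end), ...]} for positions covered by peaks
    from at least `min_experiments` distinct experiments.

    Assumption: each element of `peaks_by_experiment` is one experiment's
    list of (chrom, start, end) peaks with start < end.
    """
    events = defaultdict(list)
    for peaks in peaks_by_experiment:
        per_chrom = defaultdict(list)
        for chrom, start, end in peaks:
            per_chrom[chrom].append((start, end))
        for chrom, intervals in per_chrom.items():
            # Merge within an experiment so it counts at most once per position.
            for start, end in merge_intervals(intervals):
                events[chrom].append((start, 1))
                events[chrom].append((end, -1))

    regions = {}
    for chrom, chrom_events in events.items():
        # At equal positions, ends (-1) sort before starts (+1).
        chrom_events.sort()
        depth = 0
        region_start = None
        found = []
        for pos, delta in chrom_events:
            depth += delta
            if depth >= min_experiments and region_start is None:
                region_start = pos
            elif depth < min_experiments and region_start is not None:
                found.append((region_start, pos))
                region_start = None
        if found:
            # Rejoin regions that were split where one peak ends as another begins.
            regions[chrom] = merge_intervals(found)
    return regions


# Toy input: three experiments, threshold lowered to 2 for illustration.
toy = [
    [("chr1", 100, 200)],
    [("chr1", 150, 250)],
    [("chr1", 180, 300), ("chr2", 10, 20)],
]
print(high_confidence_regions(toy, min_experiments=2))
```

On real data the threshold would be the abstract's value of 15, and the fraction of the genome covered could be estimated by summing the lengths of the returned regions and dividing by the genome size.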
