Journal: Proceedings of the National Academy of Sciences of the United States of America

Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites

Abstract

Conserved noncoding elements (CNEs) constitute the majority of sequences under purifying selection in the human genome, yet their function remains largely unknown. Experimental evidence suggests that many of these elements play regulatory roles, but little is known about regulatory motifs contained within them. Here we describe a systematic approach to discover and characterize regulatory motifs within mammalian CNEs by searching for long motifs (12-22 nt) with significant enrichment in CNEs and studying their biochemical and genomic properties. Our analysis identifies 233 long motifs (LMs), matching a total of ≈60,000 conserved instances across the human genome. These motifs include 16 previously known regulatory elements, such as the histone 3'-UTR motif and the neuron-restrictive silencer element, as well as striking examples of novel functional elements. The most highly enriched motif (LM1) corresponds to the X-box motif known from yeast and nematode. We show that it is bound by the RFX1 protein and identify thousands of conserved motif instances, suggesting a broad role for the RFX family in gene regulation. A second group of motifs (LM2*) does not match any previously known motif. We demonstrate by biochemical and computational methods that it defines a binding site for the CTCF protein, which is involved in insulator function to limit the spread of gene activation. We identify nearly 15,000 conserved sites that likely serve as insulators, and we show that nearby genes separated by predicted CTCF sites show markedly reduced correlation in gene expression. These sites may thus partition the human genome into domains of expression.
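
The idea of scoring motifs by their enrichment in CNEs can be illustrated with a toy calculation. The sketch below is not the authors' pipeline, whose details the abstract does not give. It assumes exact k-mer matching, a simple fold-enrichment ratio with a pseudocount, and hypothetical function names (`count_kmers`, `fold_enrichment`). A real analysis would use motif lengths of 12-22 nt, tolerate degenerate positions, and assess statistical significance.

```python
from collections import Counter


def count_kmers(seqs, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def fold_enrichment(conserved, background, k, pseudocount=1.0):
    """Ratio of each k-mer's frequency in conserved vs. background sequences.

    Assumption: a plain frequency ratio with a pseudocount, chosen only
    for illustration; it is not the statistic used in the paper.
    """
    cons = count_kmers(conserved, k)
    back = count_kmers(background, k)
    cons_total = sum(cons.values())
    back_total = sum(back.values())
    scores = {}
    for kmer, n in cons.items():
        cons_freq = (n + pseudocount) / (cons_total + pseudocount)
        back_freq = (back[kmer] + pseudocount) / (back_total + pseudocount)
        scores[kmer] = cons_freq / back_freq
    return scores


if __name__ == "__main__":
    # Toy sequences; k is kept small only so the example is readable.
    conserved = ["AAAAAA"]
    background = ["AAACCC"]
    for kmer, score in fold_enrichment(conserved, background, k=3).items():
        print(kmer, round(score, 2))
```

K-mers with a ratio well above 1 are over-represented in the conserved set relative to background, which is the intuition behind ranking candidate motifs by enrichment.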
