In silico biology: An international journal on computational biology

Composition-Sensitive Analysis of the Human Genome for Regulatory Signals

Abstract

Known transcription regulatory signals, which generally act as transcription factor (TF) binding sites, differ significantly in their base composition. Therefore, their occurrence in a genome depends largely on the local base composition. As a first step toward a whole-genome analysis of the occurrence of potential TF binding sites in human, we systematically analyzed the GC content of distinct functional regions (e.g., upstream and downstream gene regions, exons, long and short introns, repetitive elements) and correlated it with the frequencies of potential binding sites of a representative set of TFs in these regions. For these analyses, we used the pattern collection of the TRANSFAC database on transcriptional regulation, the information about functionally relevant combinations of these patterns from the TRANSCompel database, and our new resource, TRANSGenome™, which provides an overall annotation of the human genome with emphasis on its regulatory characteristics. We show that the occurrence of sequence patterns with regulatory potential may be supported by, but cannot be fully explained by, the GC content of a whole chromosome or of its putative promoter regions, or by the information content of the patterns. Several patterns (HNF-3, NFAT, and GC box) show a clear overrepresentation in all promoter groups as well as in all chromosomes. Other patterns, such as E2F and CRE-BP1, are underrepresented in all promoter groups as well as in all chromosomes in comparison with random sequences. At the same time, both of these patterns are overrepresented in promoters in comparison with repetitive elements. We define several structural characteristics of the proximal promoters that differentiate them from other functional genomic regions. Two well-known promoter elements, GC and TATA boxes, are statistically enriched in promoters in comparison with random sequences, repetitive elements, and exons.
Altogether, our findings provide insights into the macroheterogeneity among the individual chromosomes and into the microheterogeneity among different functional regions of individual chromosomes, contribute to a further understanding of the structural organization of gene regulatory regions, and give first hints about the development of regulatory features during evolution.
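
The idea of judging a pattern's frequency against the local base composition can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract describes searches with the TRANSFAC pattern collection, whereas this example assumes exact consensus strings and a simple zero-order (independent bases) background model. All function names are hypothetical.

```python
from collections import Counter
from math import prod

# Illustrative assumption: patterns are exact consensus strings,
# not the TRANSFAC patterns used in the study.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence made of A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def observed_count(seq: str, motif: str) -> int:
    """Count windows matching the motif on either strand (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] in targets)


def expected_count(seq: str, motif: str) -> float:
    """Expected number of matching windows if bases were drawn independently
    with the base frequencies of this particular sequence."""
    seq, motif = seq.upper(), motif.upper()
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    k = len(motif)
    if total == 0 or len(seq) < k:
        return 0.0
    freq = {b: counts[b] / total for b in "ACGT"}
    targets = {motif, reverse_complement(motif)}
    # Distinct strings of equal length cannot match the same window,
    # so the per-window probabilities simply add up.
    p_window = sum(prod(freq[b] for b in t) for t in targets)
    return (len(seq) - k + 1) * p_window


def representation_ratio(seq: str, motif: str) -> float:
    """Observed/expected ratio; above 1 suggests overrepresentation
    relative to the sequence's own base composition."""
    expected = expected_count(seq, motif)
    if expected == 0:
        return float("nan")
    return observed_count(seq, motif) / expected
```

Calling `representation_ratio(region, "GGGCGG")` on a promoter sequence and on a repetitive element of similar GC content would give a rough, composition-adjusted comparison for a GC-box-like consensus. A real analysis would use position weight matrices and a more careful background model.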
