Source: Genome Research (U.S. National Institutes of Health literature collection)

PRISM offers a comprehensive genomic approach to transcription factor function prediction

Abstract

The human genome encodes 1500–2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells.
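
The search for motifs enriched next to genes of a particular function can be illustrated with a generic sketch. This is not the PRISM statistic, which the abstract does not specify; it swaps in a plain hypergeometric over-representation test followed by Benjamini-Hochberg correction to control the false discovery rate. The function names and the input layout (dictionaries mapping motif and function names to gene sets) are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N genes, K annotated, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def motif_function_enrichment(motif_targets, function_genes, background):
    """Test every (motif, function) pair for over-representation.

    motif_targets:  dict motif name -> set of genes with a predicted site nearby
    function_genes: dict function term -> set of genes annotated with the term
    background:     set of all genes considered (assumed universe)
    Returns a list of (motif, term, overlap, p_value, q_value) tuples.
    """
    results = []
    for motif, targets in motif_targets.items():
        targets = targets & background
        for term, genes in function_genes.items():
            genes = genes & background
            overlap = len(targets & genes)
            p = hypergeom_sf(overlap, len(background), len(genes), len(targets))
            results.append((motif, term, overlap, p))
    qvalues = benjamini_hochberg([r[3] for r in results])
    return [(m, t, o, p, q) for (m, t, o, p), q in zip(results, qvalues)]
```

Keeping only pairs whose adjusted value falls below a chosen threshold would correspond to reporting predictions at that false discovery rate. A gene-level test like this ignores the number and position of binding sites per gene, which is one reason it should be read only as an illustration of the general idea.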