Journal: Proceedings of the National Academy of Sciences of the United States of America

PNAS Plus: Exploiting regulatory heterogeneity to systematically identify enhancers with high accuracy


Abstract

Identifying functional enhancer elements in metazoan systems is a major challenge. Large-scale validation of enhancers predicted by ENCODE reveals false-positive rates of at least 70%. We used the pregastrula-patterning network of Drosophila melanogaster to demonstrate that loss in accuracy in held-out data results from heterogeneity of functional signatures in enhancer elements. We show that at least two classes of enhancers are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, greater than 98% prediction accuracy can be achieved in a balanced, completely held-out test set. The class of well-predicted elements is composed predominantly of enhancers driving multistage segmentation patterns, which we designate segmentation driving enhancers (SDE). Prediction is driven by the DNA occupancy of early developmental transcription factors, with almost no additional power derived from histone modifications. We further show that improved accuracy is not a property of a particular prediction method: after conditioning on the SDE set, naïve Bayes and logistic regression perform as well as more sophisticated tools. Applying this method to a genome-wide scan, we predict 1,640 SDEs that cover 1.6% of the genome. An analysis of 32 SDEs, chosen throughout our prediction rank-list and tested by whole-mount embryonic imaging of stably integrated reporter constructs, showed that >90% drove expression patterns. We achieved 86.7% precision on a genome-wide scan, with an estimated recall of at least 98%, indicating high accuracy and completeness in annotating this class of functional elements.
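
The precision and recall figures quoted in the abstract follow the standard definitions. A minimal sketch of those definitions is given here for readers unfamiliar with them. The counts used are hypothetical and are not taken from the paper, and the function names are illustrative only.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted enhancers that are true enhancers: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true enhancers that were predicted: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no true positives in the reference set")
    return tp / (tp + fn)


# Hypothetical counts, for illustration only (NOT the paper's data).
tp, fp, fn = 90, 10, 5
print(f"precision = {precision(tp, fp):.3f}")
print(f"recall    = {recall(tp, fn):.3f}")
```

High precision means few predicted elements fail validation, while high recall means few true elements of the class are missed; the abstract reports both for the SDE class.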
