Source: Genome Research (via U.S. National Institutes of Health literature)

Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome

Abstract

Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically validated by experiment. Predicted exon–exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions were confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient at screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon–exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive, large-scale human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ∼11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that cataloging all of the genic elements encoded in the human genome will require a coordinated effort between unbiased and targeted approaches, such as RNA-seq and RT-PCR-seq.
