PLoS Genetics

Transcript Annotation in FANTOM3: Mouse Gene Catalog Based on Physical cDNAs

Abstract

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that many cDNAs remained to be collected and identified. To pursue a complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To achieve accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs, and we developed a Web-based annotation interface that simplifies the annotation procedure and reduces manual annotation errors. Automated coding sequence and function prediction was followed by manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Of these 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing, to our knowledge, the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome and could be applied to the transcriptomes of other species.