Loose ends: almost one in five human genes still have unresolved coding status

Abascal Federico; Juan David; Jungreis Irwin; Martinez Laura; Rigau Maria; Manuel Rodriguez Jose; Vazquez Jesus; Tress Michael L.

Abstract

Seventeen years after the sequencing of the human genome, the human proteome is still under revision. One in eight of the 22 210 coding genes listed by the Ensembl/GENCODE, RefSeq and UniProtKB reference databases are annotated differently across the three sets. We have carried out an in-depth investigation on the 2764 genes classified as coding by one or more sets of manual curators and not coding by others. Data from large-scale genetic variation analyses suggests that most are not under protein-like purifying selection and so are unlikely to code for functional proteins. A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes. These potential non-coding genes also appear to be undergoing neutral evolution and have considerably less supporting transcript and protein evidence than other coding genes. We believe that the three reference databases currently overestimate the number of human coding genes by at least 2000, complicating and adding noise to large-scale biomedical experiments. Determining which potential non-coding genes do not code for proteins is a difficult but vitally important task since the human reference proteome is a fundamental pillar of most basic research and supports almost all large-scale biomedical projects.
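The proportions quoted in the abstract and the title can be checked with a few lines of arithmetic. The sketch below uses only the three counts given in the abstract; the variable names are illustrative, and it assumes that the "almost one in five" of the title refers to the 2764 disputed genes plus the further 1470 doubtful genes, taken together as a share of the 22 210 listed coding genes. The abstract does not state that combination explicitly.

```python
# Counts quoted in the abstract; the names are illustrative, not from the paper.
TOTAL_CODING_GENES = 22210     # coding genes listed across the three reference sets
DISPUTED_GENES = 2764          # coding in at least one set, not coding in another
DOUBTFUL_AGREED_GENES = 1470   # coding in all three sets, but with non-coding-like features


def fraction(part: int, whole: int) -> float:
    """Return part / whole as a plain fraction."""
    return part / whole


# Share of genes annotated differently across the three sets ("one in eight").
disputed_share = fraction(DISPUTED_GENES, TOTAL_CODING_GENES)

# Assumed reading of the title: disputed plus doubtful genes ("almost one in five").
unresolved_share = fraction(DISPUTED_GENES + DOUBTFUL_AGREED_GENES, TOTAL_CODING_GENES)

print(f"Disputed between databases: {disputed_share:.1%}")   # 12.4%
print(f"Unresolved overall: {unresolved_share:.1%}")         # 19.1%
```

Under that reading, 12.4% is close to one in eight (12.5%) and 19.1% is just under one in five (20%), which matches the wording of the abstract and the title.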


Bibliographic details

Source
Nucleic Acids Research | 2018, Issue 14 | 15 pages
Authors
Abascal Federico; Juan David; Jungreis Irwin; Martinez Laura; Rigau Maria; Manuel Rodriguez Jose; Vazquez Jesus; Tress Michael L.
Author affiliations

Wellcome Trust Sanger Inst Hinxton CB10 1SA Cambs England;

Univ Pompeu Fabra Comparat Genom Lab Inst Biol Evolut Barcelona Spain;

MIT Comp Sci & Artificial Intelligence Lab 77 Massachusetts Ave Cambridge MA 02139 USA;

Spanish Natl Canc Res Ctr Bioinformat Unit Madrid Spain;

Barcelona Supercomp Ctr Computat Biol Life Sci Grp Barcelona Spain;

Ctr Nacl Invest Cardiovasc Cardiovasc Prote Lab Madrid Spain;

Ctr Nacl Invest Cardiovasc Cardiovasc Prote Lab Madrid Spain;

Spanish Natl Canc Res Ctr Bioinformat Unit Madrid Spain;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Biochemistry

