Source: Genome Research

Millions of Years of Evolution Preserved: A Comprehensive Catalog of the Processed Pseudogenes in the Human Genome



Abstract

Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we developed a pipeline using features such as intron-absence, frame-disruption, polyadenylation, and truncation. This has enabled us to identify in recent genome drafts ∼8000 processed pseudogenes (distributed from ). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to length, suggesting sustained “bombardment” over evolution. However, it does vary with GC-content: Processed pseudogenes occur mostly in intermediate GC-content regions. This is similar to Alus but contrasts with functional genes and L1-repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for ∼20% of the total. Other notables include cyclophilin-A, keratin, GAPDH, and cytochrome c.
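
The abstract names four features used to recognize processed pseudogenes: intron absence, frame disruption, polyadenylation, and truncation. The following is a minimal sketch of how such features might be scored for a single candidate. It is not the authors' pipeline. The `Candidate` record, the function names, the 10-base poly-A threshold, and the rule for combining the features are all assumptions made for illustration.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Candidate:
    """Hypothetical record for a genomic region aligned to a parent gene."""
    sequence: str            # genomic DNA aligned to the parent coding sequence
    downstream: str          # genomic DNA immediately 3' of the aligned region
    parent_cds_length: int   # length of the parent coding sequence (nt)
    aligned_cds_length: int  # number of parent coding bases covered (nt)
    has_introns: bool        # True if the alignment is split by intron-like gaps


def has_frame_disruption(seq: str) -> bool:
    """True on a frameshift (length not a multiple of 3) or a premature stop."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return True
    # Check every codon except the last, where a stop codon is expected.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 3, 3)]
    return any(codon in STOP_CODONS for codon in codons)


def longest_a_run(seq: str) -> int:
    """Length of the longest uninterrupted run of adenines."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base == "A" else 0
        best = max(best, run)
    return best


def score_features(c: Candidate, min_polya: int = 10) -> dict:
    """Evaluate the four features; min_polya is an assumed threshold."""
    return {
        "intron_absence": not c.has_introns,
        "frame_disruption": has_frame_disruption(c.sequence),
        "polyadenylation": longest_a_run(c.downstream) >= min_polya,
        "truncation": c.aligned_cds_length < c.parent_cds_length,
    }


def is_processed_candidate(c: Candidate) -> bool:
    """Assumed rule: intronless plus at least one other supporting feature."""
    f = score_features(c)
    supporting = (f["frame_disruption"], f["polyadenylation"], f["truncation"])
    return f["intron_absence"] and any(supporting)


# Toy example with made-up sequences, for illustration only.
example = Candidate(
    sequence="ATGTAGGCC",
    downstream="GG" + "A" * 14 + "TT",
    parent_cds_length=12,
    aligned_cds_length=9,
    has_introns=False,
)
print(score_features(example))
print(is_processed_candidate(example))
```

In a real analysis the inputs would come from alignments of genomic DNA against known protein or mRNA sequences, and the thresholds would need to be calibrated against curated examples.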
