Proceedings of the National Academy of Sciences of the United States of America
Identification of Gene 3' Ends by Automated EST Cluster Analysis

Abstract

The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3' ends of transcripts essential for understanding their regulation. Here we show that 22-52% of sequences in commonly used human and murine "full-length" transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine and 86,410 human candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3' ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3-4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3' ends in existing collections, enumeration of potential alternative 3' polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
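
The abstract does not spell out how the positional clustering of EST ends was carried out, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes each EST has already been aligned to the genome and reduced to a hypothetical `(chromosome, strand, position)` record for its 3' end. The function name `cluster_est_ends`, the 30 bp `window`, and the `min_support` threshold of 3 are illustrative assumptions.

```python
from collections import defaultdict
from statistics import median_low
from typing import Dict, Iterable, List, Tuple

# Hypothetical input record: (chromosome, strand, genomic coordinate of the EST 3' end).
EstEnd = Tuple[str, str, int]


def cluster_est_ends(
    ends: Iterable[EstEnd],
    window: int = 30,      # assumed maximum gap (bp) between neighbouring ends in one cluster
    min_support: int = 3,  # assumed minimum number of ESTs needed to report a site
) -> List[dict]:
    """Group EST 3' ends that lie close together on the same chromosome and strand."""
    by_locus: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for chrom, strand, pos in ends:
        by_locus[(chrom, strand)].append(pos)

    sites: List[dict] = []
    for (chrom, strand), positions in sorted(by_locus.items()):
        positions.sort()
        current = [positions[0]]
        # Sentinel None flushes the final cluster at the end of the loop.
        for pos in positions[1:] + [None]:
            if pos is not None and pos - current[-1] <= window:
                current.append(pos)
                continue
            if len(current) >= min_support:
                sites.append({
                    "chrom": chrom,
                    "strand": strand,
                    "start": current[0],
                    "end": current[-1],
                    "support": len(current),
                    # Representative coordinate: the lower median of the member ends.
                    "site": median_low(current),
                })
            if pos is not None:
                current = [pos]
    return sites
```

Calling `cluster_est_ends` on a list of such records returns one dictionary per candidate site, with the number of supporting ESTs in `support`. A single-linkage gap rule like this is only one of several reasonable choices; a fixed-width bin or a density-based method would serve the same purpose.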