DNA Research: an international journal for rapid publication of reports on genes and genomes

Prediction of the Coding Sequences of Unidentified Human Genes. I. The Coding Sequences of 40 New Genes (KIAA0001-KIAA0040) Deduced by Analysis of Randomly Sampled cDNA Clones from Human Immature Myeloid Cell Line KG-1

Abstract

We established a protocol for the prediction of the coding sequences of unidentified human genes based on the double selection and sequence analysis of cDNA clones with inserts carrying unreported 5′-terminal sequences and with insert sizes corresponding to nearly full-length transcripts. By applying the protocol, cDNA clones with inserts longer than 2 kb were isolated from a cDNA library of human immature myeloid cell line KG-1, and the coding sequences of 40 new genes were predicted. A computer search of the sequences indicated that 20 genes contained sequences similar to known genes in the GenBank/EMBL databases. The sequences of the remaining 20 genes were entirely new, and characteristic protein motifs or domains were identified in 32 genes. Other sequence features noted were that the coding sequences of 23 genes were followed by relatively long stretches of 3′-untranslated sequences and that 5 genes contained repetitive sequences in their 3′-untranslated regions. The chromosomal location of these genes has been determined. By increasing the scale of the above analysis, the coding sequences of many unidentified genes can be predicted.
