Journal: Journal of structural and functional genomics

Coverage of whole proteome by structural genomics observed through protein homology modeling database

Abstract

We have been developing FAMSBASE, a protein homology-modeling database covering all ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from the genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, so the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers the whole of an ORF product are rare. When the portion of each ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of soluble proteins with modeled 3D structures has increased by 5% for archaebacteria and by 7% for eubacteria in the last 3 years. Assuming that this rate is maintained and that determining 3D structures for predicted disordered regions is unattainable, model structures for all soluble proteins of prokaryotes, excluding the putative disordered regions, will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures available at those times will not be 3D structures of the entire proteins encoded in single ORFs, but 3D structures of separate structural domains. Measuring or predicting the spatial arrangement of structural domains within an ORF will then become the next issue for structural genomics.
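The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and is not part of FAMSBASE: the variable names are hypothetical, and the only inputs are the genome and ORF counts quoted above. It confirms that the four genome groups sum to 276 and that the modeled ORFs amount to roughly half of the predicted ORFs.

```python
# Figures quoted in the abstract; all variable names are illustrative.
genomes_by_group = {
    "archaebacterial": 17,
    "eubacterial": 130,
    "eukaryotic": 18,
    "phage": 111,
}
modeled_orfs = 368_724    # ORFs with a modeled 3D structure
predicted_orfs = 734_193  # ORFs predicted across all genomes

# The per-group genome counts should add up to the stated total.
total_genomes = sum(genomes_by_group.values())

# Fraction of predicted ORFs that have a modeled 3D structure.
coverage = modeled_orfs / predicted_orfs

print(f"genomes: {total_genomes}")                          # genomes: 276
print(f"ORFs with a modeled 3D structure: {coverage:.1%}")  # 50.2%
```

The 15-year and 25-year projections are not reproduced here, because the abstract does not state the current coverage of soluble, non-disordered regions that such an extrapolation would need.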
