
Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases.



Abstract

MOTIVATION: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. At sufficiently low sequence identity, however, additional structural information must be incorporated to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. Dynamic programming was then applied iteratively, scoring alignments with linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. Subsequently, contributions from secondary structure are phased in, and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate.

RESULTS: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to pairwise alignments using only amino acid sequence data, at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity differ from those produced by PSI-BLAST, and because the error rates can be calibrated, the results of both searches can be combined. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data are pooled in this way. Similarly, for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods.

AVAILABILITY: Secondary structure alignment homepage at http://lutece.rutgers.edu/ssas

CONTACT: anders
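
The scoring idea described under MOTIVATION, a linear combination of amino acid and secondary structure similarity fed into dynamic programming, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the toy match/mismatch scores stand in for a real amino acid substitution matrix and for the authors' secondary structure similarity matrix, the gap penalty is linear, and the names `combined_score`, `local_alignment_score`, and the weight `w` are hypothetical.

```python
def combined_score(aa1, ss1, aa2, ss2, w):
    """Linear combination of amino acid and secondary structure similarity.

    Toy scores (assumptions): +2/-1 for residue match/mismatch,
    +1/-1 for secondary structure state match/mismatch (H, E, C).
    w = 0 uses sequence only; w = 1 uses secondary structure only.
    """
    aa = 2.0 if aa1 == aa2 else -1.0
    ss = 1.0 if ss1 == ss2 else -1.0
    return (1.0 - w) * aa + w * ss


def local_alignment_score(seq_a, ss_a, seq_b, ss_b, w, gap=2.0):
    """Best Smith-Waterman local alignment score under the combined scoring.

    Uses a linear gap penalty (an assumption) and keeps only two rows
    of the dynamic programming table, since only the score is returned.
    """
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("each sequence needs one secondary structure state per residue")
    prev = [0.0] * (len(seq_b) + 1)
    best = 0.0
    for i in range(1, len(seq_a) + 1):
        curr = [0.0]
        for j in range(1, len(seq_b) + 1):
            diag = prev[j - 1] + combined_score(
                seq_a[i - 1], ss_a[i - 1], seq_b[j - 1], ss_b[j - 1], w
            )
            cell = max(0.0, diag, prev[j] - gap, curr[j - 1] - gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    # Phase in the secondary structure contribution step by step
    # (the weight schedule is hypothetical).
    for w in (0.0, 0.25, 0.5):
        score = local_alignment_score("ACDEFG", "HHHHCC", "ACDQFG", "HHHHCC", w)
        print(f"w={w:.2f} score={score:.2f}")
```

With `w = 0` the search reduces to an ordinary sequence-only alignment, which matches the abstract's description of the first iteration; raising `w` lets two proteins with dissimilar residues but matching secondary structure still accumulate a positive score. Calibrating scores against an error rate, as the abstract describes, is not covered by this sketch.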
