Journal: Nucleic Acids Research
A comparison of several similarity indices used in the classification of protein sequences: a multivariate analysis

Abstract

The present work describes an attempt to identify reliable criteria that could be used as distance indices between protein sequences. Seven different criteria have been tested: i and ii) the scores of the alignments as given by the BESTFIT and the FASTA programs; iii) the ratio parameter, i.e. the BESTFIT score divided by the length of the aligned peptides; iv and v) the statistical significance (Z-scores) of the scores calculated by BESTFIT and FASTA, as obtained by comparison with shuffled sequences; vi) the Z-scores provided by the program RELATE, which performs a segment-by-segment comparison of 2 sequences; and vii) an original distance index calculated by the program DOCMA from all the pairwise dotplots between the sequences. These 7 criteria have been tested against the amino acid sequences of 39 globins and those of the 20 aminoacyl-tRNA synthetases from E. coli. The distances between the sequences were analyzed by multivariate analysis techniques. The results show that the distances calculated from the scores of the pairwise alignments are not adequately sensitive. The Z-score from RELATE is not selective enough and is too demanding in computer time. Three criteria gave a classification consistent with the known similarities between the sequences in the sets, namely the Z-scores from BESTFIT and FASTA and the multiple dotplot comparison distance index from DOCMA.
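The idea behind criteria iv and v (a Z-score obtained by comparison with shuffled sequences) can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy Smith-Waterman scoring (match, mismatch, and linear gap values), the function names, the number of shuffles, and the example peptide are all hypothetical choices, not the scoring used by BESTFIT or FASTA, which rely on substitution matrices and their own gap handling.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are illustrative assumptions, not BESTFIT/FASTA defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def shuffled_z_score(a, b, n_shuffles=100, seed=0):
    """Z-score of the observed score against scores for shuffled copies of b.

    Shuffling keeps the composition of b but destroys its residue order,
    so the shuffled scores approximate a chance-level distribution.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    residues = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        null_scores.append(local_alignment_score(a, "".join(residues)))
    mean = statistics.mean(null_scores)
    sd = statistics.stdev(null_scores)
    if sd == 0:
        return float("nan")  # Z-score undefined for a degenerate null distribution
    return (observed - mean) / sd


if __name__ == "__main__":
    # Hypothetical example peptide; a sequence compared with itself
    # should score far above its shuffled counterparts.
    peptide = "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF"
    print(shuffled_z_score(peptide, peptide))
```

A Z-score of this kind measures how far the observed score lies above the chance-level scores, which is why it can separate related from unrelated pairs better than the raw alignment score.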
