Published in: BMC Bioinformatics

Analysis of superfamily specific profile-profile recognition accuracy

Abstract

Background: Annotating sequences that share little similarity with sequences of known function remains a major obstacle in genome annotation. Some of the best methods for detecting remote relationships between protein sequences are based on matching sequence profiles. We analyse the superfamily-specific performance of sequence profile-profile matching. Our benchmark consists of a set of 16 protein superfamilies that are highly diverse at the sequence level. We relate the performance to the number of sequences in the profiles, the profile diversity, and the extent of structural conservation in the superfamily.

Results: Performance varies greatly between superfamilies: the truncated receiver operating characteristic score, ROC10, ranges from 0.95 down to 0.01. These large differences persist even when the profiles are trimmed to approximately the same level of diversity.

Conclusions: Although the number of sequences in the profile (profile width) and the degree of sequence variation within positions in the profile (profile diversity) contribute to accurate detection, there are other superfamily-specific factors.
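
The abstract reports accuracy as a truncated ROC score (ROC10) but does not spell out how it is computed. As a rough illustration only, the sketch below uses the common definition of a truncated ROC score: the area under the ROC curve up to the n-th false positive, normalised so that a perfect ranking scores 1. The function name `roc_n`, its parameters, and the way short hit lists are padded are assumptions made for this example and are not taken from the paper.

```python
from typing import Optional, Sequence


def roc_n(ranked_labels: Sequence[bool], n: int = 10,
          total_positives: Optional[int] = None) -> float:
    """Truncated ROC score for a ranked hit list (hypothetical helper).

    ranked_labels: hits sorted from best to worst score; True marks a
        true relationship (e.g. same superfamily), False a false hit.
    n: number of false positives at which the curve is truncated.
    total_positives: number of true relationships that could have been
        found; defaults to the number of True entries in ranked_labels.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if total_positives is None:
        total_positives = sum(ranked_labels)
    if total_positives == 0:
        raise ValueError("at least one true positive is required")

    true_seen = 0   # true positives ranked so far
    false_seen = 0  # false positives ranked so far
    area = 0        # sum of true positives ranked above each false positive
    for is_true in ranked_labels:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
            area += true_seen
            if false_seen == n:
                break

    # Assumption: if the list holds fewer than n false positives, the
    # missing ones are treated as ranked below every hit in the list.
    area += (n - false_seen) * true_seen
    return area / (n * total_positives)
```

For example, `roc_n([True, False, True, False], n=2)` counts one true hit above the first false positive and two above the second, giving 3 / (2 × 2) = 0.75. Under this definition a score near 0.95 means almost all true superfamily members rank above the first ten false hits, while a score near 0.01 means almost none do.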
