Conference: 19th International Symposium on High Performance Distributed Computing, 2010

Modeling sequence and function similarity between proteins for protein functional annotation


Abstract

A common task in biological research is to predict the function of proteins by comparing sequences between proteins of known and unknown function. This is often done using pair-wise sequence alignment algorithms (e.g., BLAST). A problem with this approach is the assumption of a simple equivalence between a minimum sequence similarity threshold and the function similarity between proteins. This assumption rests on the binary concept of homology: proteins either are or are not homologous. The relationship between sequence and function, however, is more complex, and it is pertinent to predicting protein function, e.g., when evaluating BLAST alignments or developing training sets for profile models based on functional rather than homologous groupings. Our motivation for this study was to model sequence and function similarity between proteins to gain insight into the sequence-function similarity relationship for predicting function. Using our model, we found that function similarity generally increases with sequence similarity, but with a high degree of variability. This result has implications for pair-wise approaches, in that sequence similarity apparently must be very high to ensure high function similarity. Profile models, which enable higher sensitivity, are a potential solution. However, multiple sequence alignments (a necessary prerequisite) are a problem, in that current algorithms either have difficulty aligning sequences with very low sequence similarity, which is common in our data set, or are intractable for large numbers of sequences. Given the importance of predicting protein function and the need for multiple sequence alignments, algorithms for accomplishing this task should be further refined and developed.
