Journal: BMC Bioinformatics

Critical assessment of sequence-based protein-protein interaction prediction methods that do not require homologous protein sequences


Abstract

Background: Protein-protein interactions underlie many important biological processes. Computational prediction methods can complement experimental approaches for identifying protein-protein interactions. Recently, a unique category of sequence-based prediction methods has been put forward, unique in the sense that it does not require homologous protein sequences. This makes it applicable to all protein sequences, unlike many previous sequence-based prediction methods. If effective as claimed, these new sequence-based, universally applicable prediction methods would have far-reaching utility in many areas of biology research.

Results: Upon close survey, I realized that many of these new methods were ill-tested. In addition, newer methods were often published without performance comparison with previous ones. Thus, it is not clear how good they are and whether there are significant performance differences among them. In this study, I have implemented and thoroughly tested 4 different methods on large-scale, non-redundant data sets. The tests reveal several important points. First, significant performance differences are noted among the different methods. Second, data sets typically used for training prediction methods appear significantly biased, limiting the general applicability of prediction methods trained with them. Third, there is still ample room for further development. In addition, my analysis illustrates the importance of complementary performance measures coupled with right-sized data sets for meaningful benchmark tests.

Conclusions: The current study reveals the potential and limits of the new category of sequence-based protein-protein interaction prediction methods, which in turn provides a firm ground for future endeavours in this important area of contemporary bioinformatics.