Journal: Scientific Reports

Using Two-dimensional Principal Component Analysis and Rotation Forest for Prediction of Protein-Protein Interactions


Abstract

Interactions among proteins are essential to all life activities and underlie all metabolic activities of the cell. Studying protein-protein interactions (PPIs) helps researchers interpret protein function and decode the phenomena of life, and it has great practical value in the design of new drugs. Although many high-throughput techniques have been devised for large-scale detection of PPIs, these methods remain expensive and time-consuming. There is therefore a pressing need for computational methods that predict PPIs at the scale of the entire proteome. In this article, we propose a new approach to predicting PPIs that combines a Rotation Forest (RF) classifier with a matrix-based representation of protein sequences. We use the Position-Specific Scoring Matrix (PSSM), which contains biological evolutionary information, to represent protein sequences, and we extract features with the two-dimensional Principal Component Analysis (2DPCA) algorithm. The resulting descriptors are then fed to the rotation forest classifier for classification. Applied to the yeast PPI data, the proposed method achieved 97.43% prediction accuracy with 94.92% sensitivity at a precision of 99.93%. To evaluate its performance, we compared it with other methods on the same dataset and validated it on independent datasets. The results show that the proposed method is an appropriate and promising method for predicting PPIs.
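The 2DPCA feature-extraction step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes every input matrix has the same shape, and the abstract does not say how PSSMs of different sequence lengths are handled or how the two proteins of a pair are combined. The function names `fit_2dpca` and `transform_2dpca` are hypothetical, and random numbers stand in for real PSSMs, which have one row per residue and 20 columns, one per amino acid.

```python
import numpy as np

def fit_2dpca(matrices, n_components):
    """Return the top 2DPCA projection axes, shape (n, n_components)."""
    stack = np.asarray(matrices, dtype=float)        # shape (M, m, n)
    centered = stack - stack.mean(axis=0)            # subtract the mean matrix
    # Image covariance G = (1/M) * sum_k (A_k - mean)^T (A_k - mean), shape (n, n)
    cov = np.einsum("kij,kil->jl", centered, centered) / stack.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components] # keep the largest ones
    return eigvecs[:, order]

def transform_2dpca(matrix, axes):
    """Project one (m, n) matrix onto the axes, giving an (m, d) feature matrix."""
    return np.asarray(matrix, dtype=float) @ axes

# Placeholder data: 8 random matrices shaped like 30-residue PSSMs (assumption).
rng = np.random.default_rng(0)
toy_pssms = rng.normal(size=(8, 30, 20))

axes = fit_2dpca(toy_pssms, n_components=4)
features = transform_2dpca(toy_pssms[0], axes)       # shape (30, 4)
descriptor = features.ravel()                        # flat vector for a classifier
```

Unlike ordinary PCA, this variant works on the matrices directly, so the covariance matrix is only 20 x 20 here regardless of sequence length. The flattened `descriptor` is the kind of vector that could then be passed to a classifier such as a rotation forest.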
