Published in: Scientific Reports (indexed under U.S. National Institutes of Health literature)
Using Two-dimensional Principal Component Analysis and Rotation Forest for Prediction of Protein-Protein Interactions


Abstract

Interactions among proteins are essential to all life activities and underlie all metabolic activity in cells. By studying protein-protein interactions (PPIs), researchers can better interpret protein function and decode the phenomena of life, which is of great practical value, particularly for the design of new drugs. Although many high-throughput techniques have been devised for large-scale detection of PPIs, these methods remain expensive and time-consuming. For this reason, there is a pressing need to develop computational methods for predicting PPIs at the scale of the entire proteome. In this article, we propose a new approach to predicting PPIs that combines a Rotation Forest (RF) classifier with a matrix-based representation of protein sequences. We use the Position-Specific Scoring Matrix (PSSM), which contains biological evolutionary information, to represent protein sequences, and we extract features with the two-dimensional Principal Component Analysis (2DPCA) algorithm. The resulting descriptors are then fed to the rotation forest classifier for classification. When the proposed method was applied to yeast PPI data, it achieved 97.43% prediction accuracy, 94.92% sensitivity, and 99.93% precision. To evaluate the performance of the proposed method, we compared it with other methods on the same dataset and validated it on independent data. The results show that the proposed method is an appropriate and promising approach for predicting PPIs.
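The feature-extraction step named in the abstract can be illustrated with a minimal 2DPCA sketch in NumPy. This is not the authors' implementation: the function names `fit_2dpca` and `transform_2dpca` are hypothetical, the input matrices are assumed to share one fixed shape (real PSSMs vary in length with the protein, and the abstract does not say how that is handled), and the data below is random rather than real PSSM output. The rotation forest classifier is not shown.

```python
import numpy as np


def fit_2dpca(matrices, n_components):
    """Fit 2DPCA on a stack of equally shaped matrices (n_samples, rows, cols).

    Returns the mean matrix and a (cols, n_components) projection matrix.
    Assumption: every sample has the same shape.
    """
    X = np.asarray(matrices, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Image covariance matrix: G = (1/n) * sum_i (A_i - mean)^T (A_i - mean),
    # which has shape (cols, cols).
    G = np.einsum("nij,nik->jk", centered, centered) / X.shape[0]
    # G is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def transform_2dpca(matrices, projection):
    """Project each matrix onto the leading eigenvectors and flatten it."""
    X = np.asarray(matrices, dtype=float)
    projected = X @ projection  # shape (n_samples, rows, n_components)
    return projected.reshape(X.shape[0], -1)


# Synthetic stand-in for PSSM-like input: 50 samples of shape 30 x 20.
rng = np.random.default_rng(0)
fake_pssms = rng.normal(size=(50, 30, 20))

_, projection = fit_2dpca(fake_pssms, n_components=5)
features = transform_2dpca(fake_pssms, projection)
print(features.shape)
```

Each sample is reduced from 30 x 20 values to 30 x 5 before flattening, so the classifier would receive one fixed-length vector per matrix. How the descriptors of the two proteins in a pair are combined is not stated in the abstract and is left out here.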
