BMC Systems Biology

Improved protein-protein interactions prediction via weighted sparse representation model combining continuous wavelet descriptor and PseAA composition

Abstract

Background: Protein-protein interactions (PPIs) are essential to most biological processes. As bioscience has entered the era of the genome and proteome, demand for knowledge of PPI networks is growing. High-throughput biological technologies can identify new PPIs, but they are expensive, time-consuming, and tedious, so computational methods for predicting PPIs play an important role. In recent years, an increasing number of computational methods, such as protein structure-based approaches, have been proposed for predicting PPIs. The main limitation of these methods is that they rely on prior information about the proteins to infer PPIs. It is therefore important to develop computational methods that use only protein amino acid sequence information.

Results: Here, we report a highly efficient approach for predicting PPIs. The main improvements come from a novel protein sequence representation that combines a continuous wavelet descriptor with Chou's pseudo amino acid composition (PseAAC), and from adopting a weighted sparse representation based classifier (WSRC). Cross-validated on PPI datasets of Saccharomyces cerevisiae, Human and H. pylori, the method achieves accuracies as high as 92.50%, 95.54% and 84.28%, respectively, significantly better than previously proposed methods. Extensive experiments are performed to compare the proposed method with the state-of-the-art Support Vector Machine (SVM) classifier.

Conclusions: The outstanding results achieved by our model show that the proposed feature extraction method, which combines two kinds of descriptors, has strong expressive ability and is expected to provide comprehensive and effective information for machine learning-based classification models. In addition, the prediction performance in the comparison experiments shows that the combined feature and WSRC work well together. Thus, the proposed method is a very efficient method for predicting PPIs and may be a useful supplementary tool for future proteomics studies.
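
To make the PseAAC half of the sequence representation more concrete, the following is a minimal sketch of a PseAAC-style descriptor. It is not the authors' implementation: the abstract does not state which physicochemical properties, correlation depth, or weight were used, so the single Kyte-Doolittle hydropathy scale, the function name `pseaac`, and the defaults `lam=3` and `weight=0.05` are all assumptions chosen for illustration. The continuous wavelet descriptor and the WSRC classifier are not shown.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in property (assumption;
# the abstract does not say which properties the paper uses).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(HYDROPATHY)


def _standardize(table):
    """Rescale property values to zero mean and unit standard deviation."""
    values = list(table.values())
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {aa: (v - mean) / std for aa, v in table.items()}


def pseaac(sequence, lam=3, weight=0.05):
    """Return 20 composition terms followed by `lam` sequence-order terms.

    `lam` and `weight` are hypothetical defaults, not values from the paper.
    """
    seq = sequence.upper()
    if any(aa not in HYDROPATHY for aa in seq):
        raise ValueError("sequence contains non-standard residues")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    prop = _standardize(HYDROPATHY)
    n = len(seq)

    # Conventional amino acid composition: frequency of each residue type.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # Sequence-order correlation factors: mean squared property difference
    # between residues that are k positions apart, for k = 1..lam.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((prop[seq[i]] - prop[seq[i + k]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


features = pseaac("MKTAYIAKQR")  # toy sequence, not taken from any dataset
print(len(features))  # 20 composition terms plus lam correlation terms
```

A sequence-only descriptor of this kind needs no structural or other prior information about the protein, which is the motivation stated in the Background.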