Conference: International Conference on Bioinformatics and Biomedical Engineering

A Computational Domain-Based Feature Grouping Approach for Prediction of Stability of SCF Ligases



Abstract

The stability of SCF ubiquitin ligases is worth investigating because these complexes are involved in many cellular processes, including cell cycle regulation, DNA repair mechanisms, and gene expression. Interactions between two (or more) proteins are controlled by their domains, the compact functional units of proteins. In this study, we therefore analyze the role of Pfam domain interactions in predicting the stability of protein-protein interactions (PPIs) that are known or predicted to occur involving subunit components of the SCF ligase complex. Employing the most relevant and discriminating features is essential for achieving successful prediction at low computational cost. Although different feature selection methods have recently been developed for this purpose, feature grouping is a better option, especially when dealing with high-dimensional sparse feature vectors, because it yields a better interpretation of the data. In this paper, a correlation-based feature grouping (CFG) method is proposed to group and combine the features. To demonstrate the strength of CFG, two filter methods, χ² and correlation, are also employed for feature selection, and prediction is performed using different classifiers, including a support vector machine (SVM) and k-Nearest Neighbor (k-NN). The experimental results on a dataset of SCF ligases indicate that employing feature grouping achieves significant increases of 10% for SVM and 13% for k-NN, and is more efficient than feature selection at identifying a set of relevant features.
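The abstract does not spell out how CFG forms or combines its groups, so the following is only a minimal sketch of one generic way to group features by correlation, not the authors' method. The greedy seed rule, the `threshold` value, averaging as the combination step, and the function names `group_features` and `combine_groups` are all assumptions made for illustration.

```python
import numpy as np

def group_features(X, threshold=0.8):
    """Greedily group columns of X by absolute Pearson correlation.

    Assumption (not from the paper): a feature joins the group of the first
    unassigned "seed" feature it correlates with at or above `threshold`.
    """
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = np.abs(np.atleast_2d(np.corrcoef(X, rowvar=False)))
    # Constant columns produce NaN correlations; treat them as uncorrelated.
    corr = np.nan_to_num(corr)

    unassigned = list(range(X.shape[1]))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        members = [seed] + [j for j in unassigned if corr[seed, j] >= threshold]
        unassigned = [j for j in unassigned if j not in members]
        groups.append(members)
    return groups

def combine_groups(X, groups):
    """Combine each group into one feature by averaging its columns
    (averaging is a hypothetical choice of combination rule)."""
    return np.column_stack([X[:, g].mean(axis=1) for g in groups])
```

Under these assumptions, the reduced matrix returned by `combine_groups` could then be passed to any classifier, such as an SVM or k-NN, in place of the original high-dimensional feature vectors.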
