Journal: Computational and Mathematical Methods in Medicine

iT3SE-PX: Identification of Bacterial Type III Secreted Effectors Using PSSM Profiles and XGBoost Feature Selection

Abstract

Identification of bacterial type III secreted effectors (T3SEs) has become a popular research topic in bioinformatics because of its crucial role in understanding host-pathogen interactions and in developing better therapeutic targets against pathogens. However, identifying all effector proteins with traditional experimental approaches is often time-consuming and laborious. Developing computational methods that accurately predict putative novel effectors is therefore important for reducing the number of biological experiments needed for validation. In this study, we propose a method, called iT3SE-PX, to identify T3SEs based solely on protein sequences. First, three kinds of features were extracted from position-specific scoring matrix (PSSM) profiles to train a machine learning (ML) model. Then, the extreme gradient boosting (XGBoost) algorithm was used to rank these features by their classification ability. Finally, the optimal features were selected as inputs to a support vector machine (SVM) classifier that predicts T3SEs. On two benchmark datasets, we conducted a 100-time randomized 5-fold cross-validation (CV) and an independent test, respectively. The experimental results demonstrate that the proposed method achieves superior performance compared with most existing methods and could serve as a useful tool for identifying putative T3SEs given only sequence information.
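The abstract does not say how the 100-time randomized 5-fold cross-validation splits were generated, so the following is only a minimal sketch of that evaluation protocol under stated assumptions. The function name `repeated_kfold`, the seed, and the sample count of 23 are hypothetical and chosen for illustration; the sketch uses plain shuffling without stratification, which may differ from what the authors did.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(
    n_samples: int, n_splits: int = 5, n_repeats: int = 100, seed: int = 0
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (repeat, train_idx, test_idx) for randomized repeated k-fold CV.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle the sample indices at the start of every repeat.
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into n_splits folds of near-equal size.
        folds = [order[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield repeat, train_idx, test_idx


# Assumed toy size: 23 samples, 5 folds, 100 repeats.
splits = list(repeated_kfold(n_samples=23, n_splits=5, n_repeats=100, seed=42))
print(len(splits))  # 500 train/test partitions in total
```

In such a setup, a classifier would be fit on `train_idx` and scored on `test_idx` for each of the 500 partitions, and the scores would then be averaged over all repeats.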