iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction

Abstract

Identification of B-cell epitopes (BCEs) is a fundamental step in epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Given the avalanche of protein sequence data generated in the postgenomic age, automated computational methods are essential for fast and accurate identification of novel BCEs among the vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable, so a reliable model with significantly improved prediction is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed iBCE-EL, a novel ensemble learning framework for improved linear BCE prediction. iBCE-EL fuses two independent predictors: an extremely randomized tree (ERT) classifier, which uses a combination of physicochemical properties (PCP) and amino acid composition as input features, and a gradient boosting (GB) classifier, which uses a combination of dipeptide composition and PCP. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. We further evaluated iBCE-EL on an independent data set, where it achieved an MCC of 0.463 and significantly outperformed the state-of-the-art method. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented as a web-based platform, which is available online. It offers two prediction modes: the first classifies peptide sequences as BCEs or non-BCEs, while the second lets users mine potential BCEs from protein sequences.
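The abstract names amino acid composition, dipeptide composition, and the MCC without defining them. The sketch below uses the standard textbook definitions, which are assumed here rather than taken from the paper; the function names are hypothetical and are not part of iBCE-EL. PCP features and the ERT/GB classifiers are not shown.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (one-letter codes)


def amino_acid_composition(peptide):
    """Return 20 values: the fraction of the peptide made up by each residue.

    Non-standard letters are ignored, so the values may sum to less than 1.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def dipeptide_composition(peptide):
    """Return 400 values: the fraction of overlapping residue pairs of each kind."""
    if len(peptide) < 2:
        raise ValueError("peptide must have at least two residues")
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; ranges from -1 to 1.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical peptide, for illustration only.
features = amino_acid_composition("ACDA") + dipeptide_composition("ACDA")
print(len(features))  # 20 + 400 values
```

An MCC of 0 corresponds to chance-level prediction and 1 to perfect prediction, which gives a scale for the reported values of 0.454 and 0.463.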