Journal: Amino Acids

Using Chou's pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location



Abstract

Knowledge of subnuclear localization in eukaryotic cells is essential for understanding the biological function of the nucleus. Because of the special characteristics of the cell nucleus, developing prediction methods and tools for protein subnuclear localization has become an important research field in protein science. In this study, a novel approach is proposed to predict protein subnuclear localization. Each protein sample is represented by a pseudo amino acid (PseAA) composition based on the concept of approximate entropy (ApEn), which reflects the complexity of a time series. A novel ensemble classifier is designed that incorporates three AdaBoost classifiers. The base classifier algorithms in the three AdaBoost classifiers are decision stumps, a fuzzy K-nearest-neighbors classifier, and support vector machines with a radial basis function kernel, respectively. Different PseAA compositions are used as the input data for the different AdaBoost classifiers in the ensemble. A genetic algorithm is used to optimize the dimension and weight factor of the PseAA composition. Two datasets often used in published works are used to validate the performance of the proposed approach. The results obtained in the jackknife cross-validation test are higher and more balanced than those of other methods on the same datasets. These promising results indicate that the proposed approach is effective and practical, and it might become a useful tool for protein subnuclear localization. The software, written in Matlab, and the supplementary materials are freely available by contacting the corresponding author.
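
As a rough illustration of the approximate entropy (ApEn) concept mentioned in the abstract, the Python sketch below computes ApEn for a plain numeric series using the commonly cited definition (embedding dimension m, tolerance r, Chebyshev distance, self-matches counted). The abstract does not say how a protein sequence is mapped onto a numeric series or which m and r the authors use, so the function names, the default parameters, and the example series are all assumptions made for illustration. This is not the authors' Matlab software.

```python
import math


def _phi(series, m, r):
    """Mean log-fraction of length-m windows lying within r of each window."""
    n = len(series) - m + 1  # number of overlapping windows
    windows = [series[i:i + m] for i in range(n)]
    total = 0.0
    for a in windows:
        # Chebyshev distance; the self-match is counted, so matches >= 1.
        matches = sum(
            1 for b in windows
            if max(abs(x - y) for x, y in zip(a, b)) <= r
        )
        total += math.log(matches / n)
    return total / n


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) = phi(m) - phi(m + 1); r is an absolute tolerance (assumed)."""
    if len(series) < m + 1:
        raise ValueError("series is too short for the chosen m")
    return _phi(series, m, r) - _phi(series, m + 1, r)


# Hypothetical example series: a constant signal and a logistic-map signal.
constant = [1.0] * 30
chaotic = [0.3]
for _ in range(59):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))

print(approximate_entropy(constant))  # 0.0
print(approximate_entropy(chaotic))   # larger: the series is less regular
```

A perfectly regular series gives an ApEn of zero, while an irregular one gives a larger value, which is the sense in which ApEn "reflects the complexity" of a series. In practice r is often chosen relative to the standard deviation of the series; the absolute tolerance used here is a simplification.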
