Journal: Medicinal Chemistry

pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset


Abstract

Background/Objective: Information on protein subcellular localization is crucial for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is strong demand for powerful bioinformatics tools that can identify subcellular localization promptly and effectively from sequence information alone. Recently, a predictor called "pLoc-mEuk" was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of other predictors built for the same purpose, particularly in dealing with multi-label systems in which many proteins, called "multiplex proteins", may occur simultaneously in two or more subcellular locations. Although it is a very powerful predictor, further improvement is needed, because pLoc-mEuk was trained on an extremely skewed dataset in which some subsets were about 200 times the size of others. Consequently, its predictions cannot avoid the bias introduced by such an uneven training dataset.