Journal: International Journal of Molecular Sciences

DeepPred-SubMito: A Novel Submitochondrial Localization Predictor Based on Multi-Channel Convolutional Neural Network and Dataset Balancing Treatment

Abstract

Mitochondrial proteins are physiologically active in different compartments, and their abnormal localization can trigger human mitochondrial pathologies. Correctly identifying submitochondrial locations can provide information for understanding disease pathogenesis and for drug design. A mitochondrion has four submitochondrial compartments: the matrix, the outer membrane, the inner membrane, and the intermembrane space. However, many existing studies ignore the intermembrane space. Most researchers have used traditional machine learning methods to predict mitochondrial protein localization. Those predictors require expert-level biological knowledge to be encoded as features, rather than allowing the underlying predictor to extract features through a data-driven procedure. Moreover, few researchers have considered imbalance in the datasets. In this paper, we propose DeepPred-SubMito, a novel end-to-end predictor that employs deep neural networks for protein submitochondrial location prediction. First, we use random over-sampling to reduce the influence of unbalanced datasets. Next, we train a multi-channel bilayer convolutional neural network on multiple subsequences to learn high-level features. Third, the prediction result is output through a fully connected layer. The performance of the predictor is measured by 10-fold cross-validation on the SM424-18 dataset and by 5-fold cross-validation on the SubMitoPred dataset. Experimental results show that the predictor outperforms state-of-the-art predictors. In addition, prediction results on the M983 dataset confirm its effectiveness in predicting submitochondrial locations.
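
The random over-sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `random_oversample`, the toy sequences, and the class labels are hypothetical, and the choice to grow every class to the size of the largest one is only one common way of balancing.

```python
import random
from collections import Counter, defaultdict


def random_oversample(sequences, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for seq, lab in zip(sequences, labels):
        by_class[lab].append(seq)

    # Every class is grown to the size of the largest class.
    target = max(len(seqs) for seqs in by_class.values())

    out_seqs, out_labels = [], []
    for lab, seqs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(seqs) for _ in range(target - len(seqs))]
        for seq in seqs + extra:
            out_seqs.append(seq)
            out_labels.append(lab)
    return out_seqs, out_labels


# Toy, made-up sequences and labels (not taken from any real dataset).
seqs = ["MKT", "MLA", "MSS", "MAV", "MGG", "MQQ"]
labs = ["inner", "inner", "inner", "matrix", "matrix", "outer"]

bal_seqs, bal_labs = random_oversample(seqs, labs)
print(Counter(bal_labs))
```

When over-sampling is combined with k-fold cross-validation, a common precaution is to over-sample only the training portion of each fold, so that duplicated sequences do not end up in the held-out split. The abstract does not say how the authors ordered these two steps.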
