Journal: BMC Bioinformatics

PlncRNA-HDeep: plant long noncoding RNA prediction using hybrid deep learning based on two encoding styles

Abstract

Long noncoding RNAs (lncRNAs) play an important role in regulating biological activities, and predicting them is important for exploring biological processes. Long short-term memory (LSTM) networks and convolutional neural networks (CNNs) can automatically extract and learn abstract information from encoded RNA sequences, which avoids complex feature engineering. An ensemble model learns information from multiple perspectives and performs better than a single model. It is therefore feasible, and of interest, to treat an RNA sequence as a sentence for training an LSTM and as an image for training a CNN, and then to hybridize the trained models to predict lncRNAs. Various lncRNA predictors exist to date, but few have been proposed for plants, so a reliable and powerful predictor for plant lncRNAs is needed. To improve lncRNA prediction performance, this paper proposes PlncRNA-HDeep, a hybrid deep learning model based on two encoding styles. It requires no prior knowledge and uses only RNA sequences to train the models for predicting plant lncRNAs. It not only learns diversified information from RNA sequences encoded by p-nucleotide and one-hot encodings, but also takes advantage of both a CNN and lncRNA-LSTM, which was proposed in our previous study. The parameters are tuned and three hybrid strategies are tested to maximize performance. Experimental results show that PlncRNA-HDeep is more effective than lncRNA-LSTM or the CNN alone, and that it obtains 97.9% sensitivity, 95.1% precision, 96.5% accuracy and a 96.5% F1 score on the Zea mays dataset. These results are better than those of several shallow machine learning methods (support vector machine, random forest, k-nearest neighbor, decision tree, naive Bayes and logistic regression) and of some existing tools (CNCI, PLEK, CPC2, LncADeep and lncRNAnet). PlncRNA-HDeep is feasible and produces credible predictions. It may also provide a valuable reference for other related research.
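
To make the two encoding styles concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that one-hot encoding maps each base to a 4-element indicator row, and that p-nucleotide encoding splits the sequence into p-mers and maps each one to an integer id. The function names, the default p = 3, the `stride` parameter, the conversion of T to U, and the use of 0 for unknown p-mers are all illustrative assumptions that the abstract does not specify.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed base order; T is converted to U before encoding


def one_hot_encode(seq):
    """Return a len(seq) x 4 matrix; unrecognized bases become all-zero rows."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    matrix = []
    for base in seq.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        matrix.append(row)
    return matrix


def p_nucleotide_encode(seq, p=3, stride=None):
    """Map each p-mer to an integer id in 1..4**p (0 is kept for unknown p-mers).

    stride defaults to p (non-overlapping p-mers); this is an assumption,
    pass stride=1 for overlapping p-mers instead.
    """
    stride = p if stride is None else stride
    vocab = {
        "".join(kmer): i + 1
        for i, kmer in enumerate(product(ALPHABET, repeat=p))
    }
    seq = seq.upper().replace("T", "U")
    return [
        vocab.get(seq[i:i + p], 0)
        for i in range(0, len(seq) - p + 1, stride)
    ]


if __name__ == "__main__":
    example = "AAAACG"  # toy sequence, not taken from any dataset
    print(one_hot_encode(example))       # 6 x 4 matrix for a CNN-style input
    print(p_nucleotide_encode(example))  # integer "words" for an LSTM-style input
```

In this sketch the one-hot matrix plays the role of the "image" fed to a CNN, while the integer sequence plays the role of the "sentence" fed to an LSTM through an embedding layer.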
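
The four reported metrics follow the standard confusion-matrix definitions, which the short Python sketch below spells out. The function name and the counts in the usage example are hypothetical and are not taken from the paper. As a consistency check, the reported F1 score agrees with the reported sensitivity and precision: 2 × 0.979 × 0.951 / (0.979 + 0.951) ≈ 0.965.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, precision, accuracy and F1 from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives, with lncRNA taken as the positive class (an assumption).
    """
    sensitivity = tp / (tp + fn)   # also called recall
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only
    for name, value in classification_metrics(tp=90, fp=10, tn=90, fn=10).items():
        print(f"{name}: {value:.3f}")
```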
