Journal: BMC Bioinformatics

Sequence-based bacterial small RNAs prediction using ensemble learning strategies

Abstract

Bacterial small non-coding RNAs (sRNAs) have emerged as important elements in diverse physiological processes, including growth, development, cell proliferation, differentiation, metabolic reactions and carbon metabolism, and have attracted considerable attention. Accurate prediction of sRNAs is important but challenging, and it helps in exploring the functions and mechanisms of sRNAs. In this paper, we use a variety of sRNA sequence-derived features to develop ensemble learning methods for sRNA prediction. First, we compile one balanced dataset and four imbalanced datasets. Then, we investigate various sRNA sequence-derived features, such as the spectrum profile, mismatch profile, reverse complement k-mer and pseudo nucleotide composition. Finally, we consider two ensemble learning strategies that integrate all of these features to build models for sRNA prediction. One is the weighted average ensemble method (WAEM), which predicts sRNAs from a linear weighted sum of the outputs of the individual feature-based predictors. The other is the neural network ensemble method (NNEM), which trains a deep neural network on the combined features. In the computational experiments, we evaluate our methods on these five datasets using 5-fold cross-validation. WAEM and NNEM produce better results than existing state-of-the-art sRNA prediction methods. Both methods show great potential for sRNA prediction and are helpful for understanding the biological mechanisms of bacteria.
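To make two of the ideas in the abstract concrete, the following is a minimal Python sketch of a spectrum (k-mer frequency) profile, a reverse complement k-mer profile, and a weighted average of per-feature predictor scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice k = 2, the normalization of the weights, and the example scores are all hypothetical, and the abstract does not say how the WAEM weights are chosen.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_kmers(k):
    """All 4**k k-mers over ACGT, in lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def spectrum_profile(seq, k=2):
    """Normalized k-mer frequency vector of length 4**k (assumed definition)."""
    index = {kmer: i for i, kmer in enumerate(all_kmers(k))}
    counts = np.zeros(len(index))
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA letters alike
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def reverse_complement(kmer):
    """Reverse complement of a DNA k-mer."""
    return kmer.translate(_COMPLEMENT)[::-1]


def revcomp_kmer_profile(seq, k=2):
    """Spectrum profile with each k-mer merged with its reverse complement."""
    kmers = all_kmers(k)
    canonical = sorted({min(km, reverse_complement(km)) for km in kmers})
    index = {km: i for i, km in enumerate(canonical)}
    merged = np.zeros(len(canonical))
    for km, freq in zip(kmers, spectrum_profile(seq, k)):
        merged[index[min(km, reverse_complement(km))]] += freq
    return merged


def weighted_average_ensemble(scores, weights):
    """Linear weighted sum of predictor outputs.

    scores: array of shape (n_predictors, n_samples).
    weights: one non-negative weight per predictor; normalizing them to
    sum to 1 is an assumption of this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return weights @ scores


# Hypothetical usage: two feature-based predictors scoring two sequences.
ensemble_scores = weighted_average_ensemble([[0.2, 0.8], [0.6, 0.4]], [1, 3])
```

In this sketch each feature type would feed its own classifier, and only the classifier outputs are combined, which is how the abstract describes WAEM. NNEM instead combines the features themselves as input to a single neural network.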