Conference: 2017 XLIII Latin American Computer Conference

Extreme learning machine prediction under high class imbalance in bioinformatics


Abstract

Class imbalance in machine learning occurs when one class has significantly fewer training instances than another. In bioinformatics, this problem arises in the computational prediction of novel microRNAs (miRNAs) within a full genome. The well-known miRNA precursors (pre-miRNAs) are usually only a few in comparison to the hundreds of thousands of potential candidates, which makes this task a high class imbalance classification problem. It is well known that high class imbalance usually affects any classical supervised machine learning classifier, so the imbalance must be explicitly considered. The Extreme Learning Machine (ELM) is a supervised artificial neural network model that has gained interest in recent years because of its fast learning speed and good performance. In this work, we propose a novel approach to overcome the high class imbalance in pre-miRNA prediction data, in which ELMs are used to predict good pre-miRNA candidates without needing balanced data sets. Real datasets with several class imbalance levels were used to validate the proposal. The results showed the superiority of the ELM approach over very recent state-of-the-art methods under the same experimental conditions.
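For readers unfamiliar with ELMs, the following is a minimal sketch of the general technique, not the method proposed in the paper. It assumes a single hidden layer with random, untrained weights and a tanh activation, with output weights obtained by ridge-regularized least squares. The inverse class frequency weighting is one common way to account for imbalance and is an assumption here, as are the synthetic data, the class name `ELM`, the helper `inverse_frequency_weights`, and all parameter values.

```python
import numpy as np


class ELM:
    """Illustrative single-hidden-layer extreme learning machine."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; W and b are random and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, sample_weight=None):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        t = np.where(y == 1, 1.0, -1.0)  # targets: +1 positive, -1 negative
        if sample_weight is None:
            w = np.ones(len(y))
        else:
            w = np.asarray(sample_weight, dtype=float)
        Hw = H * w[:, None]  # each row of H scaled by its sample weight
        # Weighted ridge solution: (H^T W H + ridge * I) beta = H^T W t
        A = H.T @ Hw + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, Hw.T @ t)
        return self

    def decision_function(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta

    def predict(self, X):
        return (self.decision_function(X) > 0).astype(int)


def inverse_frequency_weights(y):
    """Weight each sample by 1 / (size of its class); an assumed scheme."""
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]


# Synthetic imbalanced data (hypothetical): 2000 negatives, 20 positives.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(2000, 5)),
               rng.normal(2.0, 1.0, size=(20, 5))])
y = np.concatenate([np.zeros(2000, dtype=int), np.ones(20, dtype=int)])

weights = inverse_frequency_weights(y)
model = ELM(n_hidden=40).fit(X, y, sample_weight=weights)
pred = model.predict(X)

# Under imbalance, report per-class rates rather than plain accuracy.
sensitivity = np.mean(pred[y == 1] == 1)
specificity = np.mean(pred[y == 0] == 0)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

Only the output weights `beta` are fitted, by solving one linear system, which is why ELM training is fast. The sketch scores the training data only to keep it short; a real evaluation would use held-out data.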

Bibliographic details

  • Conference location: Cordoba (AR)
  • Author affiliations

    Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Argentina;

    Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Argentina;

    Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Argentina;

    Research Institute for Signals, Systems and Computational Intelligence (sinc(i)), FICH-UNL, CONICET, Argentina;

  • Original format: PDF
  • Language of text: English
  • Keywords

    Training; Bioinformatics; Support vector machines; Neurons; Genomics;

