Journal: Computational Management Science

Machine-learning classifiers for imbalanced tornado data



Abstract

Learning from imbalanced data, where one class has significantly more observations than the other, has gained considerable attention in the machine learning community. Assuming each class is similarly difficult to predict, most standard classifiers tend to predict the majority class well. This study uses tornado data, which are highly imbalanced because tornadoes are rare events. In the severe weather data used here, thunderstorm circulations (mesocyclones) produce tornadoes in approximately 6.7% of the observations. However, since tornadoes are high-impact weather events, it is important to predict the minority class with high accuracy. In this study, we apply support vector machines (SVMs) and logistic regression, each with and without a midpoint threshold adjustment on the probabilistic outputs, as well as random forest and rotation forest, to tornado prediction. Feature selection with SVM-recursive feature elimination was also performed to identify the most important features or variables for predicting tornadoes. The results show that SVMs with threshold adjustment performed better than the other classifiers.
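To illustrate why moving the decision threshold matters for a rare class, here is a minimal sketch in Python. It is not the authors' method: the abstract does not define the "midpoint" rule, so the sketch assumes it means the midpoint between the mean predicted probability of the tornado class and that of the non-tornado class. The function names (`midpoint_threshold`, `recall`) and the scores are invented for illustration; the toy data only mimic a roughly 6.7% positive rate.

```python
from statistics import mean


def midpoint_threshold(probs, labels):
    """Assumed rule: midpoint between the mean scores of the two classes."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    return (mean(pos) + mean(neg)) / 2.0


def recall(probs, labels, threshold):
    """Fraction of true minority-class cases scored at or above the threshold."""
    hits = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    return hits / sum(labels)


# Made-up classifier outputs: 2 positives among 30 cases (about 6.7%).
# The positives receive low probabilities, as often happens with rare classes.
labels = [1, 1] + [0] * 28
probs = [0.40, 0.30] + [0.05] * 14 + [0.15] * 14

threshold = midpoint_threshold(probs, labels)
recall_default = recall(probs, labels, 0.5)         # conventional 0.5 cutoff
recall_adjusted = recall(probs, labels, threshold)  # assumed midpoint cutoff

print(f"adjusted threshold: {threshold:.3f}")
print(f"recall at 0.5: {recall_default:.2f}, recall at adjusted: {recall_adjusted:.2f}")
```

In this toy case the default cutoff of 0.5 misses both positive cases, while the lower cutoff recovers them without flagging any negatives. In practice the threshold would be estimated on training or validation data rather than on the cases being evaluated.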
