Journal: Decision Support Systems

Combining multiple feature selection methods for stock prediction: Union, intersection, and multi-intersection approaches


Abstract

Predicting stock prices effectively is an important research problem for investors. In the literature, data mining techniques have been applied to stock (market) prediction. Feature selection, a pre-processing step in data mining, aims to filter out unrepresentative variables from a given dataset so that prediction is more effective. Because different feature selection methods select different features and thus affect prediction performance, this paper combines multiple feature selection methods to identify more representative variables for better prediction. In particular, three well-known feature selection methods are used: Principal Component Analysis (PCA), Genetic Algorithms (GA), and decision trees (CART). Their results are combined using union, intersection, and multi-intersection strategies to filter out unrepresentative variables. A back-propagation neural network serves as the prediction model. Experimental results show that the intersection of PCA and GA and the multi-intersection of PCA, GA, and CART perform best, with accuracies of 79% and 78.98% respectively. In addition, these two combined feature selection methods filter out about 80% of the 85 original variables as unrepresentative, leaving 14 and 17 important features respectively. These variables are important factors for stock prediction and can inform future investment decisions.
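The three combination strategies can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation. The abstract does not define "multi-intersection"; the sketch assumes it means the features chosen by at least two of the three methods (the union of all pairwise intersections), which would be consistent with the multi-intersection keeping more features (17) than the PCA and GA intersection (14). The feature names and the `selections` dictionary are hypothetical and do not come from the paper's data.

```python
from itertools import combinations


def union(selections):
    """Features chosen by at least one method."""
    return set().union(*selections.values())


def intersection(selections, methods):
    """Features chosen by every method named in `methods`."""
    return set.intersection(*(set(selections[m]) for m in methods))


def multi_intersection(selections):
    """Assumed definition: features chosen by at least two methods,
    i.e. the union of all pairwise intersections."""
    result = set()
    for a, b in combinations(selections, 2):
        result |= set(selections[a]) & set(selections[b])
    return result


# Hypothetical feature subsets returned by each selector (illustration only).
selections = {
    "PCA": {"f1", "f2", "f3", "f5"},
    "GA": {"f2", "f3", "f4"},
    "CART": {"f3", "f5", "f6"},
}

all_union = union(selections)                      # any method
pca_ga = intersection(selections, ["PCA", "GA"])   # both PCA and GA
multi = multi_intersection(selections)             # at least two methods

print(sorted(all_union))
print(sorted(pca_ga))
print(sorted(multi))
```

Under this assumed definition, the multi-intersection always contains every pairwise intersection, so it sits between the strict intersection of all methods and the full union in size.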
