Published in: IEEE Symposium on Computer Applications and Industrial Electronics

An improved genetic algorithm for feature selection in the classification of Disaster-related Twitter messages


Abstract

In machine-learning text classification, representing documents in a vector space with terms as features can produce a high-dimensional feature space. High dimensionality raises the computational cost of data analysis and can degrade classification accuracy. This study improves classifier performance on natural-crisis-related Twitter messages by using a Genetic Algorithm (GA) for feature selection to reduce the dimensionality of the feature space. A known limitation of GA in text feature selection is premature convergence, caused by a lack of population diversity in later generations. To address it, the GA's crossover operator was enhanced by (a) setting a variable slice-point on the size of genes to be swapped for every offspring creation, and (b) using features' frequency scores to decide whether genes are swapped. The enhanced algorithm was tested on several Twitter datasets and compared with two standard GA implementations, one using single-point crossover and one using multi-point crossover. Experimental results showed that the enhanced GA was superior both in reducing the number of selected features and in improving classification accuracy with Multinomial Naive Bayes.
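The abstract does not give the details of the enhanced crossover operator, so the following is only a minimal sketch of how points (a) and (b) might be combined, not the authors' implementation. The function name `frequency_guided_crossover`, the binary chromosome encoding (1 = feature selected), and the rule "swap a gene only if its feature's frequency score is at or above the mean score" are all assumptions made for illustration.

```python
import random


def frequency_guided_crossover(parent_a, parent_b, freq_scores, rng):
    """Hypothetical crossover sketch; names and swap rule are assumptions.

    parent_a, parent_b: equal-length lists of 0/1 genes (1 = feature kept).
    freq_scores: one frequency score per feature, same length as the parents.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    n = len(parent_a)
    if n < 2 or len(parent_b) != n or len(freq_scores) != n:
        raise ValueError("parents and scores must share a length of at least 2")

    # (a) Variable slice: the number of genes eligible for swapping is
    # drawn anew for every offspring, as is the slice's start position.
    slice_len = rng.randint(1, n - 1)
    start = rng.randint(0, n - slice_len)

    # Assumed cut-off: the mean frequency score over all features.
    threshold = sum(freq_scores) / n

    child = list(parent_a)
    for i in range(start, start + slice_len):
        # (b) Assumed rule: take the gene from parent_b only when the
        # feature's frequency score reaches the threshold.
        if freq_scores[i] >= threshold:
            child[i] = parent_b[i]
    return child


if __name__ == "__main__":
    rng = random.Random(42)
    a = [1, 0, 1, 0, 1, 0]
    b = [0, 1, 0, 1, 0, 1]
    scores = [5.0, 1.0, 4.0, 0.5, 3.0, 0.2]
    print(frequency_guided_crossover(a, b, scores, rng))
```

Because the slice length changes from one offspring to the next, children of the same parents can differ more than under a fixed single-point cut, which is one plausible way such an operator could help preserve population diversity. Genes outside the slice, and genes whose scores fall below the assumed threshold, are always inherited from `parent_a`.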
