Journal: ACM Transactions on Asian and Low-Resource Language Information Processing

A Two-stage Text Feature Selection Algorithm for Improving Text Classification

Abstract

As the number of digital text documents grows daily, text classification is becoming an increasingly challenging task. Each document contains a large number of words (features), which reduces the efficiency of a classification algorithm. This article presents an optimized feature selection algorithm that reduces the number of features in order to improve the accuracy of text classification. The proposed algorithm combines noun-based filtering with word ranking to enhance classifier performance. Experiments on three benchmark datasets show that the proposed algorithm achieves the highest accuracy among the algorithms compared: Term Frequency-Inverse Document Frequency, Balanced Accuracy Measure, Gini Index, Information Gain, and Chi-Square.
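
The abstract names two ingredients, noun-based filtering and word ranking, without giving their details. The sketch below illustrates one possible shape of such a two-stage pipeline and is not the authors' method. It rests on two assumptions: a hand-written noun set (`nouns`) stands in for a real part-of-speech tagger, and the Chi-Square statistic (one of the baselines listed in the abstract) stands in for the paper's unspecified ranking function. All function names and the toy data are hypothetical.

```python
def noun_filter(tokens, nouns):
    """Stage 1 (assumed): keep only tokens found in a noun lexicon.
    A real system would use a part-of-speech tagger instead."""
    return [t for t in tokens if t in nouns]


def chi_square(term, doc_sets, labels, cls):
    """Chi-Square score of `term` for class `cls` from document counts."""
    n = len(doc_sets)
    pairs = list(zip(doc_sets, labels))
    a = sum(1 for d, y in pairs if term in d and y == cls)      # in class, has term
    b = sum(1 for d, y in pairs if term in d and y != cls)      # other class, has term
    c = sum(1 for d, y in pairs if term not in d and y == cls)  # in class, no term
    d_ = n - a - b - c                                          # other class, no term
    denom = (a + c) * (b + d_) * (a + b) * (c + d_)
    return 0.0 if denom == 0 else n * (a * d_ - c * b) ** 2 / denom


def select_features(docs, labels, nouns, k):
    """Stage 2 (assumed): rank surviving nouns by their best per-class
    Chi-Square score and keep the top k (ties broken alphabetically)."""
    doc_sets = [set(noun_filter(doc, nouns)) for doc in docs]
    vocab = set().union(*doc_sets)
    classes = set(labels)
    scores = {t: max(chi_square(t, doc_sets, labels, c) for c in classes)
              for t in vocab}
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Toy data, invented purely for illustration.
docs = [
    ["cheap", "pills", "buy", "offer"],
    ["meeting", "agenda", "send", "report"],
    ["buy", "pills", "now", "discount"],
    ["report", "meeting", "review", "team"],
]
labels = ["spam", "ham", "spam", "ham"]
nouns = {"pills", "offer", "meeting", "agenda", "report", "discount", "team"}

top_features = select_features(docs, labels, nouns, k=3)
print(top_features)
```

In this toy run, non-noun tokens such as "buy" are discarded in the first stage, so only nouns are ever scored. Swapping `chi_square` for another ranking function (for example Information Gain) would leave the rest of the pipeline unchanged.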
