Published in: Mathematical Problems in Engineering: Theory, Methods and Applications

Hybrid Feature Selection for Amharic News Document Classification


Abstract

The volume of Amharic digital documents has grown rapidly, which makes automatic text classification increasingly important. Feature selection strongly affects both classification accuracy and computation time, and it matters most when the initial feature set is very large. In this paper, we present a hybrid feature selection method, called IGCHIDF, which combines the information gain (IG), chi-square (CHI), and document frequency (DF) feature selection methods. We evaluate the proposed method on two datasets: dataset 1, containing 9 news categories, and dataset 2, containing 13 news categories. Our experimental results show that the proposed method outperforms the other methods on both datasets. On dataset 2, the classification accuracy of IGCHIDF is up to 3.96% higher than that of IG, up to 11.16% higher than that of CHI, and 7.3% higher than that of DF.
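The abstract names the three scores that IGCHIDF builds on but does not say how they are merged, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It computes DF, IG, and CHI for each term from binary term presence, then merges them by summed rank. The rank-sum rule, the function names `score_terms` and `select_hybrid`, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (base 2) of a list of counts summing to `total`."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def score_terms(docs, labels):
    """Return {term: (df, ig, chi)} for tokenized docs and their class labels."""
    n = len(docs)
    classes = sorted(set(labels))
    class_count = Counter(labels)
    doc_sets = [set(d) for d in docs]
    vocab = set().union(*doc_sets)
    h_c = entropy(class_count.values(), n)  # class entropy H(C)

    scores = {}
    for t in vocab:
        # Number of documents per class that contain the term.
        present = Counter(lab for s, lab in zip(doc_sets, labels) if t in s)
        absent = {c: class_count[c] - present[c] for c in classes}
        df = sum(present.values())

        # IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t)
        ig = h_c
        if df:
            ig -= (df / n) * entropy(present.values(), df)
        if n - df:
            ig -= ((n - df) / n) * entropy(absent.values(), n - df)

        # Chi-square of the 2x2 term/class table, maximised over classes.
        chi = 0.0
        for c in classes:
            a = present[c]       # term present, class c
            b = df - a           # term present, other classes
            e = absent[c]        # term absent, class c
            d = n - df - e       # term absent, other classes
            denom = (a + e) * (b + d) * (a + b) * (e + d)
            if denom:
                chi = max(chi, n * (a * d - b * e) ** 2 / denom)

        scores[t] = (df, ig, chi)
    return scores


def select_hybrid(docs, labels, k):
    """ASSUMED combination rule: keep the k terms with the lowest summed rank."""
    scores = score_terms(docs, labels)
    terms = sorted(scores)  # alphabetical order makes tie-breaking deterministic
    rank_sum = dict.fromkeys(terms, 0)
    for i in range(3):  # 0 = DF, 1 = IG, 2 = CHI
        ordered = sorted(terms, key=lambda t: -scores[t][i])
        for rank, t in enumerate(ordered):
            rank_sum[t] += rank
    return sorted(terms, key=lambda t: (rank_sum[t], t))[:k]


# Hypothetical toy corpus (English tokens stand in for Amharic news terms).
docs = [
    ["goal", "match", "team"],
    ["goal", "team", "win"],
    ["vote", "party", "team"],
    ["vote", "election", "party"],
]
labels = ["sport", "sport", "politics", "politics"]
selected = select_hybrid(docs, labels, 2)
```

Rank aggregation is used here because the three scores live on different scales, so ranks avoid any need for normalisation. Other plausible combinations, such as taking the union or intersection of each method's top-k list, would fit the abstract's description equally well.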
