Journal: Interlending & document supply

Artificial bee colony algorithm for feature selection and improved support vector machine for text classification

Abstract

Purpose: Owing to the huge volume of documents available on the internet, text classification has become a necessary task for handling them. To achieve optimal text classification results, feature selection, an important stage, is used to reduce the dimensionality of text documents by choosing suitable features. The main purpose of this research is to classify the documents stored on a personal computer based on their content.

Design/methodology/approach: This paper proposes a new feature selection algorithm based on the artificial bee colony (ABCFS) to enhance text classification accuracy. ABCFS is evaluated on real and benchmark data sets and compared with existing feature selection approaches such as information gain and the χ² statistic. To assess the efficiency of the proposed algorithm, a support vector machine (SVM) classifier and an improved SVM classifier are used.

Findings: The experiments were conducted on real and benchmark data sets. The real data set consists of documents stored on a personal computer, and the benchmark data sets were drawn from the Reuters and 20 Newsgroups corpora. The results show that the proposed feature selection algorithm improves text document classification accuracy.

Originality/value: This paper proposes a new ABCFS algorithm for feature selection, evaluates its efficiency and improves the support vector machine. Here, ABCFS is used to select features from text (unstructured) documents. According to the authors, existing work has applied artificial bee colony feature selection to structured data features but not to text. The proposed algorithm classifies documents automatically based on their content.
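For readers unfamiliar with the baselines named in the abstract, the chi-square (χ²) statistic can be sketched in a few lines. The snippet below is only an illustration of the standard 2x2 contingency-table formula for scoring a term against a class. It is not the paper's ABCFS algorithm or its experimental code, and the function names `chi_square` and `rank_terms` as well as the toy corpus are assumptions made up for this example.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of a term for a class from a 2x2 document-count table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: a row or column total is zero
    return n * (a * d - c * b) ** 2 / denom


def rank_terms(docs, labels, target):
    """Rank vocabulary terms by chi-square score against one target class."""
    n_docs = len(docs)
    n_target = sum(1 for y in labels if y == target)
    in_class = Counter()   # term -> number of target-class docs containing it
    out_class = Counter()  # term -> number of other docs containing it
    for tokens, y in zip(docs, labels):
        counter = in_class if y == target else out_class
        counter.update(set(tokens))  # count each term once per document
    scores = {}
    for term in set(in_class) | set(out_class):
        a = in_class[term]
        b = out_class[term]
        c = n_target - a
        d = (n_docs - n_target) - b
        scores[term] = chi_square(a, b, c, d)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus invented for illustration (not data from the paper).
docs = [
    ["cpu", "memory", "driver"],
    ["cpu", "kernel", "the"],
    ["goal", "match", "the"],
    ["team", "goal", "league"],
]
labels = ["tech", "tech", "sport", "sport"]
ranking = rank_terms(docs, labels, "tech")
```

A filter-style selector would keep the top-scoring terms from `ranking` and discard the rest. Note that the score is symmetric: a term that appears only outside the target class scores as high as one that appears only inside it, which is one reason wrapper methods that search over feature subsets, such as the ABC-based approach the abstract describes, are studied as alternatives.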