Journal: Soft computing: A fusion of foundations, methodologies and applications

An empirical comparison of min-max-modular k-NN with different voting methods to large-scale text categorization

Abstract

Text categorization is the task of assigning predefined classes to text documents based on their content. The k-NN algorithm is one of the top-performing classifiers on text data. However, there has been little research on the use of different voting methods over text data. Moreover, when a huge amount of training data is available online, response speed drops, because the distance between a test document and every training document has to be computed. Min-max-modular k-NN (M-3-k-NN), on the other hand, has been applied to large-scale text categorization. M-3-k-NN achieves good performance and responds faster in a parallel computing environment. In this paper, we investigate five different voting methods for k-NN and M-3-k-NN. The experimental results and analysis show that the Gaussian voting method achieves the best performance among all voting methods for both k-NN and M-3-k-NN. In addition, M-3-k-NN needs a smaller value of k than k-NN to achieve better performance, and is therefore faster than k-NN in a parallel computing environment.
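To make the idea of a voting method concrete, the sketch below contrasts plain majority voting with a Gaussian-weighted vote in a small k-NN text classifier. It is only an illustration under stated assumptions: the abstract does not give the paper's formulas, so the weight `exp(-d^2 / (2 * sigma^2))`, the cosine distance, the `sigma` default, and the names `cosine_distance` and `knn_predict` are all hypothetical choices, not the authors' definitions. The min-max-modular decomposition is not shown.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k=3, voting="majority", sigma=0.5):
    """Classify `query` from its k nearest training documents.

    train  : list of (term-weight dict, label) pairs
    voting : "majority" counts each neighbour once;
             "gaussian" weights a neighbour at distance d by
             exp(-d^2 / (2 * sigma^2))  (an assumed form, see lead-in)
    """
    # Brute-force search: one distance per training document.
    scored = [(cosine_distance(vec, query), label) for vec, label in train]
    neighbours = sorted(scored, key=lambda pair: pair[0])[:k]

    scores = defaultdict(float)
    for dist, label in neighbours:
        if voting == "majority":
            scores[label] += 1.0
        elif voting == "gaussian":
            scores[label] += math.exp(-(dist ** 2) / (2.0 * sigma ** 2))
        else:
            raise ValueError(f"unknown voting method: {voting}")
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy data: one near-identical "sports" document, two distant "finance" ones.
    train = [
        ({"goal": 1.0, "match": 1.0}, "sports"),
        ({"goal": 1.0, "bank": 3.0}, "finance"),
        ({"match": 1.0, "stock": 3.0}, "finance"),
    ]
    query = {"goal": 1.0, "match": 1.0}
    print(knn_predict(train, query, k=3, voting="majority"))  # finance
    print(knn_predict(train, query, k=3, voting="gaussian"))  # sports
```

In this toy case the two distant neighbours outvote the single close one under majority voting, while the Gaussian weights let the close neighbour dominate. The brute-force distance loop is also where the response-time cost mentioned in the abstract comes from: it grows linearly with the size of the training set.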
