An empirical comparison of min-max-modular k-NN with different voting methods to large-scale text categorization

Wu K; Lu BL; Utiyama M; Isahara H


Abstract

Text categorization is the task of assigning predefined classes to text documents based on their content. The k-NN algorithm is one of the top-performing classifiers on text data. However, little research has examined the use of different voting methods on text data. Moreover, when a large amount of training data is available online, response slows down, because a test document must compute its distance to every training example. Min-max-modular k-NN (M-3-k-NN), on the other hand, has been applied to large-scale text categorization; it achieves good performance and responds faster in a parallel computing environment. In this paper, we investigate five different voting methods for k-NN and M-3-k-NN. The experimental results and analysis show that the Gaussian voting method achieves the best performance among all voting methods for both k-NN and M-3-k-NN. In addition, M-3-k-NN needs a smaller value of k than k-NN to achieve better performance, and is therefore faster than k-NN in a parallel computing environment.
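To make the idea of a voting method concrete, the following is a minimal sketch of plain k-NN with two interchangeable voting rules: unweighted majority voting and a Gaussian-weighted vote. The abstract does not define the five voting methods studied, so the function name `knn_vote`, the Euclidean distance, the kernel form `exp(-d^2 / (2 * sigma^2))`, and the `sigma` parameter are all assumptions chosen for illustration, not the authors' formulation. The min-max-modular decomposition is not shown.

```python
import math
from collections import defaultdict


def knn_vote(train, query, k, voting="majority", sigma=1.0):
    """Classify `query` from its k nearest neighbours in `train`.

    train  : list of (vector, label) pairs (hypothetical toy format)
    voting : "majority" counts one vote per neighbour;
             "gaussian" weights each vote by exp(-d^2 / (2 * sigma^2)),
             an assumed kernel form, not taken from the paper.
    """
    # Distance from the query to every training example (Euclidean here;
    # text classifiers often use cosine similarity instead).
    scored = [(math.dist(vec, query), label) for vec, label in train]
    neighbours = sorted(scored, key=lambda pair: pair[0])[:k]

    votes = defaultdict(float)
    for dist, label in neighbours:
        if voting == "majority":
            votes[label] += 1.0
        elif voting == "gaussian":
            votes[label] += math.exp(-(dist ** 2) / (2 * sigma ** 2))
        else:
            raise ValueError(f"unknown voting method: {voting}")

    # The class with the largest accumulated vote wins.
    return max(votes, key=votes.get)


# Toy data: one very close "A" example and two distant "B" examples.
toy_train = [((0.0, 0.0), "A"), ((2.0, 0.0), "B"), ((2.0, 0.1), "B")]
toy_query = (0.1, 0.0)

print(knn_vote(toy_train, toy_query, k=3, voting="majority"))
print(knn_vote(toy_train, toy_query, k=3, voting="gaussian"))
```

On this toy data the two rules disagree: the majority vote follows the two distant "B" neighbours, while the Gaussian weighting lets the single nearby "A" neighbour dominate. This is the kind of difference a comparison of voting methods measures.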

Bibliographic record

Source: Soft computing: A fusion of foundations, methodologies and applications, 2008, No. 7, 9 pages
Original format: PDF
Language: English
CLC classification: Computer software
Keywords: text categorization; k-NN algorithm; min-max-modular k-NN; parallel computing; pattern classification; neural network; task decomposition
