A NEW DENSITY-BASED METHOD FOR REDUCING THE AMOUNT OF TRAINING DATA IN K-NN TEXT CLASSIFICATION



Abstract

With the rapid development of the WWW, text classification has become a key technology for organizing and processing large amounts of text data. As a simple, effective, and nonparametric classification method, the k-NN method is widely used in text classification. However, the k-NN classifier not only has large computational demands, but its classification precision may also decrease because of the uneven density of the training data. In this paper, a new density-based method for reducing the amount of training data is presented, which not only reduces the computational demands of the k-NN classifier but also improves its classification precision. The experiments show that the new method performs better than the traditional k-NN method.


Bibliographic record

Source
Proceedings of the 2007 International Conference on Machine Learning and Cybernetics | 2007 | pp. 3372-3376 | 5 pages
Authors
FANG YUAN; LIU YANG; GE YU
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
Text classification; K-Nearest Neighbor; Density; Training data

