Study of text classification methods for data sets with huge features

Abstract

Interest in text classification has grown rapidly over the past few years. This paper surveys the main approaches to text classification and discusses its key techniques: text representation models, feature selection methods, and classification algorithms. The work focuses on implementing a text classification system that uses Mutual Information for feature selection together with the K-Nearest Neighbor and Support Vector Machine classifiers. Experimental results on the Reuters collection are also presented; they show that Mutual Information is an efficient dimensionality reduction method for text data sets with very large feature sets.
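
The abstract gives no formulas or implementation details, so the following is only a minimal sketch of the kind of pipeline it describes: score terms by mutual information, keep the top-scoring terms, and classify with K-Nearest Neighbor. Everything in it is an assumption made for illustration: the pointwise formulation MI(t, c) = log(P(t, c) / (P(t) P(c))) with the maximum taken over classes, the cosine similarity measure, the function names, and the toy documents (which are not Reuters data). The Support Vector Machine stage is omitted.

```python
import math
from collections import Counter

def mi_scores(docs, labels):
    """Score each term by the max over classes of log(P(t, c) / (P(t) * P(c))),
    with probabilities estimated from document counts."""
    n = len(docs)
    class_df = Counter(labels)   # number of documents per class
    term_df = Counter()          # number of documents containing each term
    joint_df = Counter()         # number of class-c documents containing term t
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_df[term] += 1
            joint_df[(term, label)] += 1
    scores = {}
    for term, n_t in term_df.items():
        scores[term] = max(
            math.log(joint_df[(term, c)] * n / (n_t * n_c))
            for c, n_c in class_df.items()
            if joint_df[(term, c)] > 0
        )
    return scores

def select_features(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def vectorize(tokens, vocab):
    """Term-frequency vector restricted to the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[t] for t in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn_predict(train_vecs, train_labels, vec, k=3):
    """Majority vote among the k training vectors most similar to vec."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(pair[0], vec), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus, already tokenized.
docs = [
    ["oil", "prices", "rise", "market"],
    ["crude", "oil", "exports", "market"],
    ["oil", "barrel", "opec"],
    ["wheat", "harvest", "grain", "market"],
    ["grain", "exports", "wheat"],
    ["corn", "grain", "harvest"],
]
labels = ["crude", "crude", "crude", "grain", "grain", "grain"]

scores = mi_scores(docs, labels)
vocab = select_features(scores, 10)            # 12 distinct terms reduced to 10
train_vecs = [vectorize(d, vocab) for d in docs]
query = vectorize(["oil", "market", "opec"], vocab)
print(knn_predict(train_vecs, train_labels := labels, query))
```

In this toy corpus the two terms spread across both classes ("market" and "exports") receive the lowest scores and are dropped. Note that this pointwise formulation gives a term seen in a single document the same score as a frequent class-specific term, so practical systems often add a minimum document-frequency cutoff.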

Bibliographic record

Source: International Conference on Industrial and Information Systems, 2010, 4 pages
Authors: Guiying Wei; Xuedong Gao; Sen Wu
Original format: PDF
Chinese Library Classification: T-53
Keywords: Text classification; Feature selection; K-Nearest Neighbor; Mutual Information
