GU METRIC - A New Feature Selection Algorithm for Text Categorization

Abstract

To improve the scalability of text categorization and reduce over-fitting, it is desirable to reduce the number of words used for categorization, and to do so automatically without sacrificing categorization accuracy. Techniques that achieve this are known as automatic feature selection methods. Typically, each word is assigned a weight using a word scoring metric, and the top-scoring words are then used to describe the document collection. Several word scoring metrics have been employed in the literature. In this paper we present a novel feature selection method called the GU metric. Details of a comparative evaluation against the other methods are given. The results show that the GU metric outperforms some of the other well-known feature selection methods.
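The score-then-keep-the-top-words procedure described in the abstract can be sketched as follows. This record does not give the GU metric's formula, so the sketch substitutes the well-known chi-square statistic as the word scoring metric. The function names, the whitespace tokenizer, and the toy documents are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive):
    """Score each word with the chi-square statistic of its 2x2 word/class table.

    Chi-square is a stand-in here; it is NOT the GU metric from the paper.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequency of each word within each class.
    df_pos, df_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        words = set(text.lower().split())  # naive whitespace tokenizer (assumption)
        (df_pos if y == positive else df_neg).update(words)

    scores = {}
    for w in set(df_pos) | set(df_neg):
        a = df_pos[w]   # positive documents containing w
        b = df_neg[w]   # negative documents containing w
        c = n_pos - a   # positive documents without w
        d = n_neg - b   # negative documents without w
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[w] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:k]]


# Toy corpus, invented purely for this example.
docs = [
    "cheap pills offer",
    "cheap offer now",
    "meeting agenda now",
    "project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

scores = chi_square_scores(docs, labels, positive="spam")
selected = select_top_k(scores, 3)
print(selected)
```

Any other scoring metric with the same interface (a mapping from word to score) could be passed to `select_top_k` in place of the chi-square scores.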

Bibliographic details

Source
International Conference on Enterprise Information Systems | 2007 | 4 pages
Authors
Gulden Uchyigit; Keith Clark
Original format: PDF
Chinese Library Classification: F713.36-53
Keywords
Feature Selection; Text Categorization; Machine Learning

Record added: 2022-08-21 11:02:57

Similar literature

1. Combination of modified BPNN algorithms and an efficient feature selection method for text categorization [J]. Cheng Hua Li, Soon Cheol Park. Information Processing & Management. 2009, Issue 3
2. Comparison of term frequency and document frequency based feature selection metrics in text categorization [J]. Nouman Azam, JingTao Yao. Expert Systems with Applications. 2012, Issue 5
3. Five New Feature Selection Metrics in Text Categorization [J]. Fengxi Song, David Zhang, Yong Xu. International Journal of Pattern Recognition and Artificial Intelligence. 2007, Issue 6
4. GU METRIC - A New Feature Selection Algorithm for Text Categorization [C]. Gulden Uchyigit, Keith Clark. International Conference on Enterprise Information Systems. 2007
5. Study of feature selection algorithms for text-categorization [D]. Dave, Kandarp. 2011
6. Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization [O]. Jieming Yang, Zhaoyang Qu, Zhiying Liu
7. Study of feature selection algorithms for text-categorization [O]. Dave Kandarp. 2011
