An Examination of Feature Selection Frameworks in Text Categorization

Abstract

Feature selection, an important task in text categorization, is used for dimensionality reduction. It can be performed locally or globally. In local selection, a distinct feature set is derived for each class, so the number of feature sets depends on the number of classes. In contrast, global feature selection uses a single universal feature set, which is assumed to preserve the characteristics of all classes. Furthermore, feature selection can be based on the relevant feature set only (local dictionary) or on both the relevant and irrelevant feature sets (universal dictionary). In this paper, we explore different feature selection frameworks for text categorization on the Reuters(10) and Reuters(115) datasets (variants of the Reuters-21578 corpus). We then investigate the efficiency of 7 different local or global feature selection methods in conjunction with the local and universal dictionaries. Our experiments show that local feature selection with a local dictionary yields the best categorization results.
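The distinction between local and global selection, and between local and universal dictionaries, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy corpus, the function names, the scoring rule (document-frequency rate inside the class minus the rate outside it), and the use of the maximum per-class score for global selection. The abstract does not name the 7 selection methods the authors evaluated, so this is not their implementation.

```python
from collections import Counter

# Toy corpus (hypothetical data, not from the paper): (tokens, class label).
DOCS = [
    (["wheat", "grain", "export"], "grain"),
    (["grain", "harvest", "tonnes"], "grain"),
    (["oil", "crude", "export"], "crude"),
    (["crude", "barrel", "opec"], "crude"),
]


def doc_freq(docs):
    """Count in how many documents each term occurs."""
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))
    return df


def class_scores(docs, label, dictionary="local"):
    """Score candidate terms for one class.

    The score is an assumed stand-in for a real selection metric:
    document-frequency rate inside the class minus the rate outside it.
    """
    pos = [d for d in docs if d[1] == label]
    neg = [d for d in docs if d[1] != label]
    df_pos, df_neg = doc_freq(pos), doc_freq(neg)
    if dictionary == "local":
        candidates = set(df_pos)                # terms from relevant documents only
    else:
        candidates = set(df_pos) | set(df_neg)  # relevant and irrelevant documents
    return {t: df_pos[t] / len(pos) - df_neg[t] / len(neg) for t in candidates}


def top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def local_selection(docs, k, dictionary="local"):
    """Local selection: one feature set per class."""
    labels = sorted({label for _, label in docs})
    return {c: top_k(class_scores(docs, c, dictionary), k) for c in labels}


def global_selection(docs, k, dictionary="local"):
    """Global selection: a single feature set shared by all classes.

    Per-class scores are combined with max (an assumed choice).
    """
    labels = sorted({label for _, label in docs})
    best = {}
    for c in labels:
        for term, score in class_scores(docs, c, dictionary).items():
            best[term] = max(score, best.get(term, float("-inf")))
    return top_k(best, k)


local_sets = local_selection(DOCS, k=2)
global_set = global_selection(DOCS, k=3)
print(local_sets)  # {'crude': ['crude', 'barrel'], 'grain': ['grain', 'harvest']}
print(global_set)  # ['crude', 'grain', 'barrel']
```

With the local dictionary, the class "crude" has 5 candidate terms. With the universal dictionary it has all 9 terms in the corpus, including terms such as "grain" that occur only in irrelevant documents and therefore receive negative scores under this scoring rule.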

Bibliographic record

Source
Asia Information Retrieval Symposium (AIRS 2005); 13-15 October 2005; Jeju Island (KR) | 2005 | pp. 558-564 | 7 pages
Conference location: Jeju Island (KR)
Authors
Bong Chih How; Wong Ting Kiong
Author affiliation

Faculty of Computer Science and Information Technology, 94300 Kota Samarahan, Sarawak, Malaysia;

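Conference organizer: (not listed)
Original format: PDF
Language of text: English
CLC classification: Data backup and recovery
Keywords: (not listed)
Date added to database: 2022-08-26 13:56:29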

