Comparing Feature Selection Methods by Using Rank Aggregation

Abstract

Feature selection (FS) is becoming critical in this data era. Selecting effective features from datasets is a particularly important part of text classification, data mining, pattern recognition, and artificial intelligence. FS excludes irrelevant features from the classification task, reduces the dimensionality of a dataset, allows us to better understand the data, improves the performance of machine learning techniques, and minimizes computational requirements. A large number of FS methods have been proposed so far, but which one is most effective in practice remains unclear. Although different categories of FS methods conceivably apply different evaluation criteria to variables, few studies focus on evaluating FS methods across categories. This article gathers ten leading FS methods from four different categories and evaluates and compares their general versatility (the consistent ability to select useful features) on authorship attribution problems. It also tries to identify which method is most effective. A support vector machine (SVM) serves as the classifier. Different categories of features, different numbers of top-ranked variables in the feature rankings, and different performance measures are used together to measure the effectiveness and general versatility of these methods. Finally, the Schulze rank aggregation method (SSD) is used to rank the ten FS methods. The results suggest that Mahalanobis distance is the best method overall.
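The abstract names the Schulze method as the rank aggregation step but does not describe how it is applied. As a rough illustration only, the sketch below aggregates several complete rankings into one consensus ranking using the common strongest-path formulation of Schulze. The function name `schulze_ranking`, the placeholder method labels `fs_a`, `fs_b`, `fs_c`, the toy input rankings, and the alphabetical tie-break are all assumptions made for this example and do not come from the paper.

```python
from itertools import permutations


def schulze_ranking(rankings):
    """Aggregate complete rankings (best item first) into one consensus order.

    Assumes every ranking contains the same items, with no ties inside a ranking.
    """
    candidates = sorted(rankings[0])

    # d[a][b]: number of input rankings that place a ahead of b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ranking in rankings:
        pos = {c: i for i, c in enumerate(ranking)}
        for a, b in permutations(candidates, 2):
            if pos[a] < pos[b]:
                d[a][b] += 1

    # p[a][b]: strength of the strongest path from a to b.
    # A direct link counts only if a beats b in the pairwise comparison.
    p = {
        a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in candidates}
        for a in candidates
    }
    # Floyd-Warshall style widest-path update, with k as the intermediate item.
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # a is ranked above b when its strongest path to b is stronger than the reverse.
    wins = {
        a: sum(p[a][b] > p[b][a] for b in candidates if b != a)
        for a in candidates
    }
    # Ties in the win count are broken alphabetically (an arbitrary choice).
    return sorted(candidates, key=lambda c: (-wins[c], c))


if __name__ == "__main__":
    # Toy input: each list is one hypothetical ranking of three FS methods,
    # e.g. one per performance measure. These are not results from the paper.
    toy_rankings = [
        ["fs_a", "fs_b", "fs_c"],
        ["fs_a", "fs_c", "fs_b"],
        ["fs_b", "fs_a", "fs_c"],
    ]
    print(schulze_ranking(toy_rankings))  # prints ['fs_a', 'fs_b', 'fs_c']
```

In a setting like the one the abstract describes, each input ranking might correspond to one combination of feature category, number of top variables, and performance measure, though how the authors actually construct the input rankings is not stated here.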

Bibliographic record

Source
International Conference on ICT and Knowledge Engineering | 2018 | pp. 1-6 | 6 pages
Authors
Wanwan Zheng; Mingzhe Jin
Original format: PDF
Keywords
Feature extraction; Support vector machines; Text categorization; Principal component analysis; Machine learning; Cats; Classification algorithms

Similar literature

1. A fuzzy gaussian rank aggregation ensemble feature selection method for microarray data [J]. Venkatesh B., Anuradha J. International Journal of Knowledge-Based and Intelligent Engineering Systems. 2020, No. 4
2. A feature selection model based on genetic rank aggregation for text sentiment classification [J]. Aytug Onan, Serdar Korukoglu. Journal of Information Science. 2017, No. 1
3. Robust Feature Selection Technique Using Rank Aggregation [J]. Chandrima Sarkar, Sarah Cooley, Jaideep Srivastava. Applied Artificial Intelligence. 2014, No. 1a3
4. Comparing Feature Selection Methods by Using Rank Aggregation [C]. Wanwan Zheng, Mingzhe Jin. International Conference on ICT and Knowledge Engineering. 2018
5. Rank Aggregation Methods For Consensus Ranking in Multilayer Networks [D]. Braun, Niklas. 2019
6. Robust Feature Selection Technique using Rank Aggregation [O]. Chandrima Sarkar, Sarah Cooley, Jaideep Srivastava
7. Empirical Analysis of Rank Aggregation-Based Multi-Filter Feature Selection Methods in Software Defect Prediction [O]. Abdullateef O. Balogun, Shuib Basri, Saipunidzam Mahamad, 2021
