Using a clustering similarity measure for feature selection in high dimensional data sets

Abstract

Feature selection is an important preprocessing step in data classification: it reduces the dimensionality of the problem by removing redundant or irrelevant data. High dimensional data sets are increasingly common, especially in bioinformatics, biology, signal processing, and text classification, which increases the need for efficient feature selection methods. In this paper we study the applicability of a clustering validation measure, the Adjusted Rand Index (ARI), to this task, comparing it with other methods based on statistical tests and on the ROC curve. Our experiments show the validity of the proposed method.
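
As a rough illustration of how an ARI-based feature ranking could work, the sketch below scores each feature by the ARI between the class labels and a two-group partition of that feature's values. The abstract does not describe how the paper partitions a feature, so the median split (`median_split`), the helper `rank_features_by_ari`, and the toy data are all assumptions made for this example, not the authors' procedure.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same samples."""
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:  # fewer than two samples: nothing to compare
        return 1.0
    # Pairs of samples grouped together in both partitions.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each partition separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def median_split(values):
    """Assumed 1-D partition for illustration: split a feature at its median."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return [int(v >= median) for v in values]


def rank_features_by_ari(rows, labels):
    """Return (feature_index, ARI) pairs sorted from highest to lowest ARI."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((j, adjusted_rand_index(median_split(column), labels)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.9, 1.5], [1.0, 4.5], [1.1, 0.5]]
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features_by_ari(rows, labels)
```

On this toy data, feature 0 receives an ARI of 1 and feature 1 a slightly negative value, so a selector would keep feature 0. In practice the partition step would likely be replaced by a proper clustering of each feature, and a threshold or top-k cut would be chosen by validation.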

Bibliographic details

Source: Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, 2010, pp. 900-905 (6 pages)
Authors: Santos Jorge M.; Ramos Sandra
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords: adjusted Rand index; feature selection; high dimensional data sets

Similar literature

1. A novel feature selection scheme for high-dimensional data sets: four-Staged Feature Selection [J]. Pehlivanli Ayca Cakmak. Journal of Applied Statistics, 2016, No. 5a8.
2. A Cosine-Similarity Mutual-Information Approach for Feature Selection on High Dimensional Datasets [J]. Vimal Kumar Dubey, Amit Kumar Saxena. Journal of Information Technology Research, 2017, No. 1.
3. Multi View Cluster Approach to Explore Multi Objective Attributes based on Similarity Measure for High Dimensional Data [J]. Deena Babu Mandru, Y. K. Sundara Krishna. International Journal of Applied Engineering Research, 2018, No. 15aPta5.
4. Using a clustering similarity measure for feature selection in high dimensional data sets [C]. Santos Jorge M., Ramos Sandra. International Conference on Intelligent Systems Design and Applications, 2010.
5. Feature cluster selection for high-dimensional data analysis [D]. Li, Hao. 2007.
6. ClusTrack: Feature Extraction and Similarity Measures for Clustering of Genome-Wide Data Sets [O]. Halfdan Rydbeck, Geir Kjetil Sandve, Egil Ferkingstad, -1
7. ClusTrack: feature extraction and similarity measures for clustering of genome-wide data sets [O]. Halfdan Rydbeck, Geir Kjetil Sandve, Egil Ferkingstad. 2015.