Feature cluster selection for high-dimensional data analysis.


Abstract

This thesis addresses the gaps between two traditional data mining tasks, feature selection and clustering, and the knowledge that domain experts want in real-world applications. It illustrates two particular gaps using microarray data analysis: the gap between a near-optimal feature subset and a candidate set of interesting features, and the gap between good clusters and relevant clusters. The thesis proposes to bridge these gaps with a new data mining task, feature cluster selection, which aims to select all relevant features in a data set and group them into a small number of coherent clusters. It provides both a formal definition and an empirical formulation of the new problem, and describes an efficient solution based on Max-relevance, Max-cohesion, and Min-separation criteria. Experiments on microarray data verify that the solution can discover statistically significant clusters of relevant features and can select representative feature subsets that yield high accuracy.
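The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of feature cluster selection, not the thesis's method. It assumes that relevance is measured as the absolute Pearson correlation between a feature and the class label, that cohesion is the absolute correlation between a feature and the seed of a cluster, and that two fixed thresholds (`min_relevance`, `min_cohesion`) stand in for the Max-relevance, Max-cohesion, and Min-separation criteria. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def standardize(a):
    """Center each column and scale it to unit norm (constant columns become zero)."""
    centered = a - a.mean(axis=0)
    norm = np.linalg.norm(centered, axis=0)
    return centered / np.where(norm == 0, 1.0, norm)

def feature_clusters(X, y, min_relevance=0.5, min_cohesion=0.8):
    """Greedy sketch: keep relevant features, then group correlated ones.

    Assumption: thresholds replace the thesis's optimization criteria.
    Returns a list of clusters; each cluster is a list of column indices
    whose first entry (the seed) serves as the cluster's representative.
    """
    Z = standardize(X)
    relevance = np.abs(Z.T @ standardize(y))   # |corr(feature, label)|
    corr = np.abs(Z.T @ Z)                     # |corr(feature, feature)|
    # Visit relevant features from most to least relevant.
    order = [int(j) for j in np.argsort(-relevance) if relevance[j] >= min_relevance]
    clusters = []
    for j in order:
        for cluster in clusters:
            if corr[j, cluster[0]] >= min_cohesion:  # cohesive with the seed
                cluster.append(j)
                break
        else:
            clusters.append([j])  # poorly correlated with every seed: new cluster
    return clusters

# Toy data (hypothetical): two relevant signals, one copy of each, one irrelevant feature.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
s1 = np.array([0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
s2 = np.array([1, 0, 0, 0, 1, 1, 1, 1], dtype=float)
noise = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
X = np.column_stack([s1, 2 * s1 + 3, s2, -s2, noise])

clusters = feature_clusters(X, y)
representatives = [cluster[0] for cluster in clusters]
print(clusters, representatives)
```

In this toy setup the irrelevant column is filtered out by the relevance threshold, and the remaining columns are grouped by which underlying signal they follow. Taking one representative per cluster gives a compact feature subset, while the full clusters keep every relevant feature visible to a domain expert.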


Bibliographic record

Author: Li, Hao
Affiliation and degree-granting institution: State University of New York at Binghamton, Computer Science
Discipline: Computer Science
Degree: M.S.
Year: 2007
Pages: 49
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
