International Journal of Innovative Research in Science, Engineering and Technology

An Efficient Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data



Abstract

Feature selection is the process of identifying a subset of the most useful features that produces results comparable to those of the original, full set of features. A feature selection algorithm may be evaluated from both the efficiency and the effectiveness points of view: efficiency concerns the time required to find a subset of features, while effectiveness relates to the quality of that subset. Based on these criteria, a Fast clustering-based feature Selection algorithm (FAST) is proposed and experimentally evaluated. The FAST algorithm works in two steps. In the first step, features are divided into clusters using graph-theoretic clustering methods. In the second step, the most representative feature, i.e., the one most strongly related to the target classes, is selected from each cluster to form the final subset of features. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. A Minimum Spanning Tree (MST) built with Prim's algorithm grows only one tree at a time; to ensure the efficiency of FAST, the MST underlying the clustering step is instead constructed with Kruskal's algorithm.
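The two-step procedure can be illustrated with a small sketch. The Python code below is a hypothetical illustration, not the paper's reference implementation: absolute Pearson correlation stands in for the paper's feature-relevance measure, the MST over the feature graph is built with Kruskal's algorithm as the abstract indicates, clusters are formed by cutting MST edges above an assumed threshold (`cut_threshold`), and from each cluster the feature most correlated with the target is kept.

```python
# Hypothetical sketch of the two-step FAST idea (not the authors' implementation).
# Assumptions: |Pearson correlation| as the relevance measure, Kruskal's algorithm
# with a union-find for the MST, and threshold-based edge cutting to form clusters.
import numpy as np


def kruskal_mst(weights):
    """Return MST edges (i, j, w) of a complete graph given a symmetric weight matrix."""
    n = weights.shape[0]
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((weights[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # adding this edge does not create a cycle
            parent[ri] = rj
            mst.append((i, j, w))
    return mst


def fast_like_selection(X, y, cut_threshold=0.7):
    """Step 1: cluster features via an MST; step 2: keep one representative per cluster."""
    n_features = X.shape[1]
    # Distance between features: 1 - |correlation|, so similar features are "close".
    dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))

    # Build the MST with Kruskal's algorithm, then cut "long" edges to form clusters.
    mst = kruskal_mst(dist)
    parent = list(range(n_features))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, w in mst:
        if w <= cut_threshold:    # keep only edges joining sufficiently similar features
            parent[find(i)] = find(j)

    clusters = {}
    for f in range(n_features):
        clusters.setdefault(find(f), []).append(f)

    # From each cluster, pick the feature most correlated with the target classes.
    relevance = np.abs([np.corrcoef(X[:, f], y)[0, 1] for f in range(n_features)])
    return [max(members, key=lambda f: relevance[f]) for members in clusters.values()]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    X[:, 5] = X[:, 0] + 0.01 * rng.normal(size=200)   # a redundant copy of feature 0
    y = X[:, 0] + X[:, 3]
    print(fast_like_selection(X, y))                   # one representative per cluster
```

In this sketch the redundant copy (feature 5) lands in the same cluster as feature 0, so only one of the two survives, which mirrors the claim that features drawn from different clusters are relatively independent.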
