Conference: International Conference on Advances in Information and Communication Technology

k-Nearest Neighbour Using Ensemble Clustering Based on Feature Selection Approach to Learning Relational Data



Abstract

Due to the growing amount of data generated and stored in relational databases, relational learning has attracted the interest of researchers in recent years, and many approaches have been developed to learn from relational data. One of these is Dynamic Aggregation of Relational Attributes (DARA), an algorithm designed to summarize relational data with one-to-many relations. However, DARA suffers a major drawback when the cardinalities of attributes are very high, because the size of the vector space representation depends on the number of unique values that exist for all attributes in the dataset. A feature selection process can be introduced to overcome this problem, and the selected features can be further optimized to achieve a good classification result. Several clustering runs can be performed for different values of k to yield an ensemble of clustering results. This paper proposes a two-layered genetic algorithm-based feature selection method to improve the classification performance of learning from relational databases using a k-NN ensemble classifier. The proposed method omits less relevant features while retaining the diversity of the classifiers, so as to improve the performance of the k-NN ensemble. The results show that the proposed k-NN ensemble is able to improve on the performance of traditional k-NN classifiers.
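To make the idea of a k-NN ensemble built on feature subsets concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the algorithm's details, so the feature subsets below are hand-picked stand-ins for what a genetic algorithm-based selection step might produce, and the toy data, the function names `knn_predict` and `ensemble_predict`, and the majority-vote combination rule are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k, features):
    """Classify `query` with plain k-NN, using only the columns in `features`."""
    def project(row):
        return [row[i] for i in features]

    q = project(query)
    # Rank training points by Euclidean distance in the reduced feature space.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: dist(project(pair[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def ensemble_predict(train_X, train_y, query, k, feature_subsets):
    """Majority vote over k-NN members, one member per feature subset."""
    member_votes = [knn_predict(train_X, train_y, query, k, subset)
                    for subset in feature_subsets]
    return Counter(member_votes).most_common(1)[0][0]

# Hypothetical toy data: columns 0 and 1 separate the classes,
# column 2 is noise that a selection step would ideally drop.
train_X = [
    [0.0, 0.1, 9.0],
    [0.2, 0.0, 1.0],
    [0.1, 0.3, 5.0],
    [1.0, 0.9, 8.5],
    [0.9, 1.1, 0.5],
    [1.1, 1.0, 5.5],
]
train_y = ["a", "a", "a", "b", "b", "b"]

# Assumed output of a feature selection step: each subset omits the
# noisy column, and the subsets differ so the members stay diverse.
subsets = [(0,), (1,), (0, 1)]

prediction = ensemble_predict(train_X, train_y, [0.95, 1.0, 9.0], 3, subsets)
print(prediction)
```

Giving each member a different feature subset is one simple way to keep the ensemble diverse while leaving out less relevant features, which is the trade-off the abstract describes. How the paper itself builds and combines the members may differ.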
