Using the two-population genetic algorithm with distance-based k-nearest neighbour voting classifier for high-dimensional data

Lee Chien-Pang; Lin Wen-Shin

Abstract

Owing to developments in computer technology, high-dimensional data has become a popular research topic. However, traditional statistical methods do not perform well when the number of variables (p) is greater than the sample size (n). Accordingly, this paper proposes a novel hybrid model that combines statistical methodology with data mining techniques for the classification of high-dimensional data. In the proposed model, Fisher's least significant difference test is first used for initial dimension reduction. Subsequently, a two-population genetic algorithm and a non-parametric statistical classification method (a distance-based k-nearest neighbour voting classifier) are used to evaluate and rank the importance of the variables. Furthermore, the evaluation of the relevant variables for classification takes an outlier detection method into account. Eight public gene expression datasets are used to compare the performance of the proposed model with existing methods. The experimental results indicate that the proposed model achieves higher classification accuracy than the existing methods.
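The abstract does not spell out how the distance-based k-nearest neighbour voting classifier weights its votes. As a rough illustration only, the sketch below assumes one common reading: each of the k nearest training samples votes for its class with a weight inversely proportional to its Euclidean distance from the query. The function name `weighted_knn_predict`, the `eps` smoothing term, and the choice of Euclidean distance are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-12):
    """Classify `query` by inverse-distance-weighted voting among its k nearest neighbours.

    Assumption: inverse-distance weights and Euclidean distance; the paper's
    actual voting rule may differ.
    """
    # Euclidean distance from the query to every training sample.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep only the k closest samples (sort on distance, not on label).
    dists.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Closer neighbours contribute larger votes; eps avoids division by zero.
        votes[label] += 1.0 / (d + eps)
    # Return the class with the largest total vote.
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy data: two samples of class "a" near the origin, three of class "b" far away.
    X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
    y = ["a", "a", "b", "b", "b"]
    print(weighted_knn_predict(X, y, (0.2, 0.2), k=3))
```

With weighting of this kind, a single very close neighbour can outvote several distant ones, which is the main practical difference from plain majority voting.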


Bibliographic record

Source: International journal of data mining and bioinformatics, 2016, Issue 4, 17 pages
Authors: Lee Chien-Pang; Lin Wen-Shin
Original format: PDF
Language of text: English
Chinese Library Classification: Computing technology, computer technology
Keywords: genetic algorithm; k-nearest neighbour; Fisher's least significant difference; outlier detection; high-dimensional data; gene expression data


Similar literature

1. Using the two-population genetic algorithm with distance-based k-nearest neighbour voting classifier for high-dimensional data [J]. Lee Chien-Pang, Lin Wen-Shin. International journal of data mining and bioinformatics. 2016, Issue 4
2. Using genetic algorithms and k-nearest neighbour for automatic frequency band selection for signal classification [J]. Rivero D., Guo L., Seoane J.A. Signal Processing, IET. 2012, Issue 3
3. Estimation of misclassification probability for a distance-based classifier in high-dimensional data [J]. Hiroki Watanabe, Masashi Hyodo, Yuki Yamada. Hiroshima mathematical journal. 2019, Issue 2
4. Genetic programming and K-nearest neighbour classifier based intrusion detection model [C]. Shweta Malhotra, Vikram Bali, K. K. Paliwal. Proceedings of the 7th International Conference Confluence 2017 on Cloud Computing, Data Science and Engineering. 2017
5. Comparative classification of prostate cancer data using the Support Vector Machine, Random Forest, DualKS and k-Nearest Neighbours [D]. Sakouvogui, Kekoura. 2015
6. Love Thy Neighbour: Automatic Animal Behavioural Classification of Acceleration Data Using the K-Nearest Neighbour Algorithm [O]. Owen R. Bidder, Hamish A. Campbell, Agustina Gómez-Laich. 2010
7. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters [O]. Ali Dashti, Ivan Komarov, Roshan M D'Souza. 2013
