A Comparative Analysis of Feature Selection and Feature Extraction Models for Classifying Microarray Dataset

Abstract

Purpose: This research applies dimensionality reduction methods to extract the smallest set of genes that contributes to the efficient performance of classification algorithms on microarray data.

Design/Methodology/Approach: Using a colon cancer microarray dataset, one-way analysis of variance (ANOVA) is used as the feature selection technique because of its robustness and efficiency in selecting relevant information from high-dimensional data. Principal Component Analysis (PCA) and Partial Least Squares (PLS) are used as feature extraction techniques, projecting the reduced high-dimensional data into a low-dimensional space. Classification of the colon cancer dataset is carried out using a Support Vector Machine (SVM). The analysis is performed in MATLAB 2015.

Findings: The study obtained high accuracies, and the performances of the dimension reduction techniques are compared. The PLS-based approach attained 95% accuracy, giving it an edge over the other dimension reduction methods (one-way ANOVA and PCA).

Practical Implications: The main practical constraint of this research was obtaining a local dataset in the authors' environment, which led to the use of an open-source dataset.

Originality: This study gives insight into the implications of high-dimensional data in microarray gene analysis. Applying dimensionality reduction helps remove irrelevant information that hampers the performance of microarray data technology.
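
The feature selection step described in the abstract can be illustrated with a minimal sketch. It ranks genes by their one-way ANOVA F-statistic and keeps the highest-scoring ones. This is not the authors' code: the study used MATLAB 2015, whereas this sketch uses plain Python, and the function names (`anova_f`, `select_top_genes`), the `top_k` parameter, and the toy expression values are all assumptions made up for illustration. The PCA/PLS projection and the SVM classifier that follow in the study's pipeline are not shown.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single feature across class groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_genes(X, labels, top_k):
    """Return the indices of the top_k columns of X ranked by F-statistic."""
    n_genes = len(X[0])
    scores = [anova_f([row[j] for row in X], labels) for j in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k], scores


# Toy data, invented for illustration only: 6 samples x 3 genes.
# Gene 0 differs between the classes; genes 1 and 2 do not.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.1],
    [0.9, 6.0, 1.9],
    [3.0, 5.5, 2.0],
    [3.1, 4.5, 2.2],
    [2.9, 5.0, 1.8],
]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

selected, scores = select_top_genes(X, labels, top_k=1)
print("selected gene indices:", selected)
```

In a pipeline like the one the abstract describes, the columns in `selected` would then be passed to a feature extraction step (PCA or PLS) before classification.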

Bibliographic details

  • Source
    Computing and Information Systems | 2018, Issue 2 | pp. 29-38 | 10 pages
  • Author affiliations

    Department of Computer Science, College of Information and Communication Technology, Kwara State University, Malete, Nigeria;

    Department of Computer Science, College of Information and Communication Technology, Kwara State University, Malete, Nigeria;

    Department of Computer Science, College of Information and Communication Technology, Kwara State University, Malete, Nigeria;

    Department of Computer Science, College of Information and Communication Technology, Kwara State University, Malete, Nigeria;

  • Original format: PDF
  • Language of text: English
  • Keywords

    Dimension Reduction; One-Way-ANOVA; PCA; PLS; Classification;
