A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma

Heba Abusamra

Abstract

Microarray gene expression data have gained great importance in recent years because of their role in disease diagnosis and prognosis, which helps clinicians choose an appropriate treatment plan for patients. This technology has opened a new era in molecular classification. Interpreting gene expression data remains a difficult problem and an active research area because of their inherent "high dimension, low sample size" nature. Such problems pose great challenges to existing classification methods. Effective feature selection techniques are therefore often needed to help classify different tumor types correctly, which in turn leads to a better understanding of genetic signatures and to improved treatment strategies. This paper presents a comparative study of state-of-the-art feature selection methods, classification methods, and their combinations on gene expression data. We compared the efficiency of three classification methods (support vector machines, k-nearest neighbor, and random forest) and eight feature selection methods (information gain, twoing rule, sum minority, max minority, Gini index, sum of variances, t-statistics, and one-dimensional support vector machine). Five-fold cross-validation was used to evaluate classification performance. Two publicly available gene expression data sets of glioma were used in the experiments. The results reveal the important role of feature selection in classifying gene expression data: with feature selection, classification accuracy can be boosted significantly using only a small number of genes. The relationship between the features selected by different feature selection methods is investigated, and the features selected most frequently in each fold across all methods are evaluated for both data sets.
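
The evaluation protocol described in the abstract (rank genes, keep a small subset, classify, score with five-fold cross-validation) can be illustrated with a minimal sketch. It is not the paper's implementation: it uses only one of the eight selection criteria (a two-sample t-statistic, here in Welch form) and one of the three classifiers (k-nearest neighbor), and it runs on synthetic data instead of the glioma data sets. All function names, parameter values, and the toy data generator are assumptions made for illustration. Genes are re-selected inside each training split so that the held-out fold does not influence the selection.

```python
import math
import random
import statistics


def t_scores(X, y):
    """Absolute two-sample (Welch) t-statistic of every gene for classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0)
    return scores


def top_genes(scores, n_genes):
    """Indices of the n_genes highest-scoring genes."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n_genes]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def cross_validated_accuracy(X, y, n_genes, k=3, n_folds=5, seed=0):
    """Five-fold CV; genes are re-selected on each training split only."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    folds = [order[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        genes = top_genes(t_scores(train_X, train_y), n_genes)
        reduced_train = [[row[j] for j in genes] for row in train_X]
        for i in test_idx:
            x = [X[i][j] for j in genes]
            correct += knn_predict(reduced_train, train_y, x, k) == y[i]
    return correct / len(X)


def make_synthetic(n_samples=40, n_genes=200, n_informative=5, seed=1):
    """Hypothetical toy stand-in for a microarray matrix: few samples, many genes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[rng.gauss(3.0 * label if j < n_informative else 0.0, 1.0)
          for j in range(n_genes)] for label in y]
    return X, y


X, y = make_synthetic()
acc_selected = cross_validated_accuracy(X, y, n_genes=5)
acc_all = cross_validated_accuracy(X, y, n_genes=len(X[0]))
print(f"5 genes: {acc_selected:.2f}, all genes: {acc_all:.2f}")
```

Comparing the two printed accuracies gives a rough feel for the effect the abstract reports, although the numbers from this toy setup say nothing about the glioma data sets themselves.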

Bibliographic details

Source: Procedia Computer Science, 2013, Issue 1, 10 pages
Author: Heba Abusamra
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords: gene expression; microarray data; feature selection; classification; glioma
