Differential prioritization in feature selection and classifier aggregation for multiclass microarray datasets

Ooi CH; Chetty M; Teng SW

Abstract

The high dimensionality of microarray datasets endows the task of multiclass tissue classification with various difficulties, the main challenge being the selection of features deemed relevant and non-redundant to form the predictor set for classifier training. The necessity of varying the emphases on relevance and redundancy, through the use of the degree of differential prioritization (DDP) during the search for the predictor set, is also of no small importance. Furthermore, there are several types of decomposition technique for the feature selection (FS) problem: all-classes-at-once, one-vs.-all (OVA) or pairwise (PW). Also, in multiclass problems, there is the need to consider the type of classifier aggregation used, whether non-aggregated (a single machine) or aggregated (OVA or PW). From here, we first propose a systematic approach to combining the distinct problems of FS and classification. Then, using eight well-known multiclass microarray datasets, we empirically demonstrate the effectiveness of the DDP in various combinations of FS decomposition types and classifier aggregation methods. Aided by the variable DDP, feature selection leads to classification performance that is better than that of rank-based or equal-priorities scoring methods, and to accuracies higher than previously reported for benchmark datasets with a large number of classes. Finally, based on several criteria, we make general recommendations on the optimal choice of the combination of FS decomposition type and classifier aggregation method for multiclass microarray datasets.
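
The trade-off between relevance and redundancy described in the abstract can be illustrated with a small sketch. The abstract does not state the scoring formula, so everything below is an assumption made for illustration: the function name `select_features`, the parameter `alpha` (a hypothetical stand-in for the DDP), the between-class to within-class ratio used as relevance, and the geometric weighting `relevance ** alpha * antiredundancy ** (1 - alpha)`. It is not presented as the authors' method.

```python
import numpy as np

def select_features(X, y, n_select, alpha):
    """Greedy forward selection weighting relevance against redundancy.

    alpha is a hypothetical stand-in for the DDP (an assumption):
    alpha = 1.0 ranks by relevance alone, alpha = 0.5 weights both equally.
    """
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    # Relevance: between-class over within-class sum of squares, per feature.
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    relevance = between / (within + 1e-12)
    relevance = relevance / relevance.max()  # scale so the best feature is 1

    # Absolute Pearson correlation between every pair of features.
    corr = np.abs(np.corrcoef(X, rowvar=False))

    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        # Antiredundancy: 1 minus mean |correlation| with the chosen features.
        antired = 1.0 - corr[np.ix_(candidates, selected)].mean(axis=1)
        antired = np.clip(antired, 0.0, 1.0)  # guard against rounding error
        score = relevance[candidates] ** alpha * antired ** (1.0 - alpha)
        selected.append(candidates[int(np.argmax(score))])
    return selected

# Toy data (synthetic, not from the paper): 3 classes, 20 samples each.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 8))
X[:, 0] = y + 0.1 * rng.normal(size=60)         # strong class signal
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=60)  # near-duplicate of feature 0
X[:, 2] = (y == 1) + 0.3 * rng.normal(size=60)  # weaker, independent signal

relevance_only = select_features(X, y, n_select=2, alpha=1.0)
balanced = select_features(X, y, n_select=2, alpha=0.5)
print(relevance_only, balanced)
```

On this toy data, setting `alpha` to 1.0 removes the redundancy term, so the near-duplicate pair tends to be chosen together, while a lower `alpha` steers the second pick toward the weaker but independent feature. An OVA or PW decomposition would run the same selection once per binary sub-problem; that part is left out of the sketch.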

Bibliographic details

Source: Data Mining and Knowledge Discovery, 2007, Issue 3, 38 pages
Authors: Ooi CH; Chetty M; Teng SW
Original format: PDF
Text language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: tissue classification; microarray data analysis; multiclass classification; feature selection; classifier aggregation; GENE-EXPRESSION DATA; DNA MICROARRAY; CANCER; PREDICTION; ADENOCARCINOMA; EXTRACTION; DISCOVERY; PATTERNS; LEUKEMIA; LUNG

Similar literature

1. Differential prioritization in feature selection and classifier aggregation for multiclass microarray datasets [J]. Chia Huey Ooi, Madhu Chetty, Shyh Wei Teng. Data Mining and Knowledge Discovery, 2007, Issue 3.
2. A Comparative Analysis of Feature Selection and Feature Extraction Models for Classifying Microarray Dataset [J]. Arowolo M. Olaolu, Sulaiman O. Abdulsalam, Isiaka R. Mope. Computing and Information Systems, 2018, Issue 2.
3. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data [J]. Chia Huey Ooi, Madhu Chetty, Shyh Wei Teng. BMC Bioinformatics, 2006, Issue 1.
4. A GA-Based Approach to ICA Feature Selection: An Efficient Method to Classify Microarray Datasets [C]. Kun-Hong Liu, Jun Zhang, Bo Li. International Symposium on Neural Networks; ISNN 2009, 2009.
5. Parallel Feature Selection of Multiple Class Datasets Using Apache Spark [D]. Sankineni, Rishi. 2017.
6. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data [O]. Chia Huey Ooi, Madhu Chetty, Shyh Wei Teng. 2006.
7. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data [O]. Chetty Madhu, Ooi Chia, Teng Shyh. 2006.
