Journal: Data Mining and Knowledge Discovery

Differential prioritization in feature selection and classifier aggregation for multiclass microarray datasets



Abstract

The high dimensionality of microarray datasets makes multiclass tissue classification difficult. The main challenge is selecting features that are both relevant and non-redundant to form the predictor set for classifier training. It is also important to vary the emphasis placed on relevance versus redundancy during the search for the predictor set, which is done through the degree of differential prioritization (DDP). Furthermore, the feature selection (FS) problem can be decomposed in several ways: all-classes-at-once, one-vs.-all (OVA), or pairwise (PW). In multiclass problems, the type of classifier aggregation must also be considered: non-aggregated (a single machine) or aggregated (OVA or PW). We first propose a systematic approach to combining the distinct problems of FS and classification. Then, using eight well-known multiclass microarray datasets, we empirically demonstrate the effectiveness of the DDP in various combinations of FS decomposition types and classifier aggregation methods. Aided by the variable DDP, feature selection yields classification performance better than that of rank-based or equal-priorities scoring methods, and accuracies higher than previously reported for benchmark datasets with a large number of classes. Finally, based on several criteria, we make general recommendations on the optimal choice of the combination of FS decomposition type and classifier aggregation method for multiclass microarray datasets.
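As a rough illustration of the OVA and PW decompositions named in the abstract, the sketch below shows how a multiclass label vector can be split into binary subproblems. It is not the authors' implementation: the function names `ova_subproblems` and `pw_subproblems` and the toy labels are assumptions made for this example, and the DDP scoring itself is not shown because the abstract does not define it.

```python
from itertools import combinations

def ova_subproblems(labels):
    """One-vs.-all: one binary relabeling per class (that class = 1, rest = 0)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def pw_subproblems(labels):
    """Pairwise: one subproblem per unordered class pair, restricted to
    the samples that belong to either class of the pair."""
    classes = sorted(set(labels))
    out = {}
    for a, b in combinations(classes, 2):
        # Indices of the samples kept for this pair.
        idx = [i for i, y in enumerate(labels) if y in (a, b)]
        # Binary targets for the kept samples (first class of the pair = 1).
        out[(a, b)] = (idx, [1 if labels[i] == a else 0 for i in idx])
    return out

# Hypothetical toy labels for five samples and three classes.
labels = ["A", "B", "C", "A", "C"]
ova = ova_subproblems(labels)
pw = pw_subproblems(labels)
```

With K classes, OVA produces K binary subproblems over all samples, while PW produces K(K-1)/2 subproblems, each over only the samples of its two classes. Either set can serve as the unit for feature selection, for classifier aggregation, or for both.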
