Journal: BioData Mining

A feature selection method based on multiple kernel learning with expression profiles of different types


Abstract

Background: With the development of high-throughput technology, researchers can acquire large amounts of expression data of different types from several public databases. Because most of these datasets have a small number of samples and hundreds or thousands of features, extracting informative features from expression data effectively and robustly with feature selection techniques is both challenging and crucial. Many feature selection approaches have been proposed and applied to analyse expression data of different types. However, most of these methods are evaluated on only a single type of expression data, using only the accuracy or error rate of classification.

Results: In this article, we propose a hybrid feature selection method based on Multiple Kernel Learning (MKL) and evaluate its performance on expression datasets of different types. First, the relevance of each feature to sample classification is measured using the objective function of MKL. In this step, an iterative gradient descent process optimizes both the parameters of the Support Vector Machine (SVM) and the kernel confidence. Then, a set of relevant features is selected by ranking the features according to their objective function values. Finally, we apply an embedded forward selection scheme to detect compact feature subsets within the relevant feature set.

Conclusions: We compare the proposed method with other methods not only on classification accuracy, but also on the stability, similarity and consistency of the different algorithms. Across these performance measures, the proposed method shows a satisfactory feature selection capability on expression datasets of different types.
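The two-stage structure described in the Results (rank features by relevance, then run forward selection over the relevant set) can be illustrated with a minimal sketch. This is not the authors' implementation: kernel-target alignment of a per-feature linear kernel stands in for the MKL objective, and a leave-one-out nearest-centroid classifier stands in for the SVM. All function names and the toy data are hypothetical.

```python
def alignment_score(column, labels):
    """Kernel-target alignment of the linear kernel built from one feature.

    ASSUMPTION: a stand-in for the MKL objective, not the paper's criterion.
    With centred feature x and labels y in {-1, +1}:
    A = <x x^T, y y^T>_F / (||x x^T||_F * ||y y^T||_F) = (x.y)^2 / ((x.x) * n)
    """
    mean = sum(column) / len(column)
    x = [v - mean for v in column]
    xy = sum(a * b for a, b in zip(x, labels))
    xx = sum(a * a for a in x)
    return 0.0 if xx == 0 else xy * xy / (xx * len(labels))


def rank_features(X, y, top_k):
    """Stage 1: keep the top_k feature indices by relevance score."""
    n_features = len(X[0])
    scores = [alignment_score([row[f] for row in X], y) for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return order[:top_k]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    ASSUMPTION: used in place of an SVM to keep the sketch dependency-free.
    Every class needs at least two samples.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
        point = [X[i][f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def forward_select(X, y, ranked):
    """Stage 2: greedily add the feature that most improves accuracy."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        candidate = None
        for f in ranked:
            if f in selected:
                continue
            acc = loo_accuracy(X, y, selected + [f])
            if acc > best:
                best, candidate, improved = acc, f, True
        if improved:
            selected.append(candidate)
    return selected, best


# Toy data (hypothetical): feature 0 separates the classes,
# feature 1 is noise, feature 2 is constant.
X_demo = [
    [1.0, 0.3, 5.0], [1.2, 0.9, 5.0], [0.9, 0.1, 5.0],
    [3.0, 0.2, 5.0], [3.1, 0.8, 5.0], [2.9, 0.4, 5.0],
]
y_demo = [-1, -1, -1, 1, 1, 1]

relevant = rank_features(X_demo, y_demo, top_k=2)
subset, accuracy = forward_select(X_demo, y_demo, relevant)
print(relevant, subset, accuracy)
```

On the toy data the ranking stage keeps features 0 and 1, and forward selection stops after feature 0 because adding the noise feature cannot raise the accuracy further. A faithful reproduction would replace both stand-ins with the jointly optimized SVM parameters and kernel weights that the abstract describes.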
