Journal: Cancer Informatics

A Jackknife and Voting Classifier Approach to Feature Selection and Classification

Abstract

With technological advances now allowing measurement of thousands of genes, proteins and metabolites, researchers are using this information to develop diagnostic and prognostic tests and to discern the biological pathways underlying diseases. Often, an investigator's objective is to develop a classification rule that predicts group membership of unknown samples from a small set of features and that could ultimately be used in a clinical setting. While common classification methods such as random forest and support vector machines are effective at separating groups, they do not directly translate into a clinically applicable classification rule based on a small number of features. We present a simple feature selection and classification method for biomarker detection that is intuitively understandable and can be directly extended for application to a clinical setting. We first use a jackknife procedure to identify important features and then, for classification, we use voting classifiers, which are simple and easy to implement. We compared our method to random forest and support vector machines using three benchmark cancer 'omics datasets with different characteristics. We found that our jackknife procedure and voting classifier performed comparably to these two methods in terms of accuracy. Further, the jackknife procedure yielded stable feature sets. Voting classifiers in combination with a robust feature selection method such as our jackknife procedure offer an effective, simple and intuitive approach to feature selection and classification with a clear extension to clinical applications.
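The two-stage idea in the abstract (jackknife feature selection, then a voting classifier) can be illustrated with a minimal sketch. The abstract does not give the ranking statistic, the retention threshold, or the voting rule, so everything specific here is an assumption made for illustration: features are ranked by an absolute Welch t-statistic, a feature is kept only if it reaches the top `top_k` in at least `min_freq` of the leave-one-out runs, and each kept feature casts one unweighted vote for the class whose mean it lies closer to. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def t_stat(a, b):
    """Absolute Welch t-statistic between two lists of values (assumed ranking score)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(X, y, top_k):
    """Return the indices of the top_k features, ranked by |t| between classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((t_stat(a, b), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:top_k]]


def jackknife_select(X, y, top_k=2, min_freq=1.0):
    """Leave one sample out at a time; keep features that reach the
    top_k in at least a fraction min_freq of the runs (assumed rule)."""
    n = len(X)
    counts = Counter()
    for i in range(n):
        counts.update(rank_features(X[:i] + X[i + 1:], y[:i] + y[i + 1:], top_k))
    return sorted(j for j, c in counts.items() if c / n >= min_freq)


def fit_voter(X, y, features):
    """Store the class-0 and class-1 means of each selected feature."""
    model = {}
    for j in features:
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        model[j] = (m0, m1)
    return model


def predict(model, sample):
    """Each feature votes for the class with the nearer mean;
    the majority wins and ties go to class 0 (assumed rule)."""
    votes_for_1 = sum(
        1 for j, (m0, m1) in model.items()
        if abs(sample[j] - m1) < abs(sample[j] - m0)
    )
    return 1 if votes_for_1 > len(model) / 2 else 0


# Hypothetical toy data: features 0 and 1 separate the classes, feature 2 is noise.
X = [
    [1.0, 5.0, 3.0], [1.2, 5.5, 2.0], [0.8, 4.8, 3.5], [1.1, 5.2, 2.4],
    [3.0, 1.0, 3.1], [3.2, 1.3, 2.2], [2.9, 0.9, 3.4], [3.1, 1.1, 2.6],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = jackknife_select(X, y, top_k=2)
model = fit_voter(X, y, selected)
print(selected, predict(model, [1.0, 5.0, 9.9]), predict(model, [3.0, 1.0, 0.0]))
```

In this sketch, raising `min_freq` toward 1.0 keeps only features that rank highly no matter which sample is left out, which is one simple way to favor a stable feature set. The resulting rule is just a handful of per-feature comparisons, which reflects the kind of simplicity the abstract argues for.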
