Journal: Artificial Intelligence in Medicine

The feature selection bias problem in relation to high-dimensional gene data


Abstract

Objective: Feature selection is a technique widely used in data mining. The aim is to select the best subset of features relevant to the problem being considered. In this paper, we consider feature selection for the classification of gene datasets. Gene data usually consist of just a few dozen objects described by thousands of features. For this kind of data, it is easy to find a model that fits the learning data. However, it is not easy to find one that performs as well on new data as it does on the learning data. This overfitting issue is well known in classification and regression, but it also applies to feature selection.
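
As an illustration of how overfitting can enter through feature selection, the following minimal sketch uses pure-noise data with a few dozen objects and a thousand features. It is not the authors' method: the data generator, the mean-difference ranking, the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example self-contained. It compares cross-validated accuracy when features are selected once on all objects (test objects included) with accuracy when selection is repeated inside each training fold.

```python
import random


def make_noise_data(n_objects=40, n_features=1000, seed=0):
    """Pure-noise data: no feature carries information about the labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_objects)]
    y = [i % 2 for i in range(n_objects)]  # balanced, arbitrary labels
    return X, y


def class_means(X, y, rows, features):
    """Per-class mean of each listed feature, computed over the given rows."""
    means = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        means[label] = [sum(X[i][j] for i in members) / len(members)
                        for j in features]
    return means


def select_features(X, y, rows, k):
    """Rank features by absolute difference of class means; keep the top k."""
    all_features = range(len(X[0]))
    m = class_means(X, y, rows, all_features)
    scores = [abs(a - b) for a, b in zip(m[0], m[1])]
    return sorted(all_features, key=lambda j: scores[j], reverse=True)[:k]


def predict(X, y, train_rows, features, i):
    """Nearest-centroid prediction for object i on the selected features."""
    m = class_means(X, y, train_rows, features)
    dist = {label: sum((X[i][j] - c) ** 2
                       for j, c in zip(features, m[label]))
            for label in (0, 1)}
    return min(dist, key=dist.get)


def cv_accuracy(X, y, k=10, n_folds=5, select_inside=True):
    """Cross-validated accuracy with selection inside or outside the folds."""
    rows = list(range(len(X)))
    folds = [rows[f::n_folds] for f in range(n_folds)]
    # Selection on all objects: the test objects influence the chosen features.
    global_features = select_features(X, y, rows, k)
    correct = 0
    for test in folds:
        train = [i for i in rows if i not in test]
        if select_inside:
            features = select_features(X, y, train, k)  # training fold only
        else:
            features = global_features
        correct += sum(predict(X, y, train, features, i) == y[i] for i in test)
    return correct / len(rows)


X, y = make_noise_data()
biased = cv_accuracy(X, y, select_inside=False)
unbiased = cv_accuracy(X, y, select_inside=True)
print(f"selection outside cross-validation: {biased:.2f}")
print(f"selection inside cross-validation:  {unbiased:.2f}")
```

Because the labels are unrelated to the features, an honest estimate should stay near chance level. Selecting features before splitting the data typically produces a much higher, and misleading, accuracy.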
