Journal: Bioinformatics

Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method.

Abstract

MOTIVATION: We recently introduced a multivariate approach that selects a subset of predictive genes jointly for sample classification based on expression data. We tested the algorithm on colon and leukemia data sets. As an extension to our earlier work, we systematically examine the sensitivity, reproducibility and stability of gene selection/sample classification to the choice of parameters of the algorithm.

METHODS: Our approach combines a Genetic Algorithm (GA) and the k-Nearest Neighbor (KNN) method to identify genes that can jointly discriminate between different classes of samples (e.g. normal versus tumor). The GA/KNN method is a stochastic supervised pattern recognition method. The genes identified are subsequently used to classify independent test set samples.

RESULTS: The GA/KNN method is capable of selecting a subset of predictive genes from a large noisy data set for sample classification. It is a multivariate approach that can capture the correlated structure in the data. We find that for a given data set gene selection is highly repeatable in independent runs using the GA/KNN method. In general, however, gene selection may be less robust than classification.

AVAILABILITY: The method is available at http://dir.niehs.nih.gov/microarray/datamining

CONTACT: LI3
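
The GA/KNN idea described under METHODS can be illustrated with a minimal sketch. The abstract does not specify the GA operators, the KNN voting rule, or any parameter values, so everything below is an assumption made for illustration: fitness is leave-one-out accuracy of a majority-vote k-NN restricted to the candidate genes, the GA keeps the better half of the population and refills it by single-gene mutation, and the data come from a hypothetical `make_toy_data` helper. It is not the authors' implementation.

```python
import random
from collections import Counter


def knn_loo_accuracy(X, y, genes, k=3):
    """Leave-one-out accuracy of a majority-vote k-NN using only `genes`."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample,
        # computed over the selected genes only.
        dists = sorted(
            (sum((X[i][g] - X[j][g]) ** 2 for g in genes), j)
            for j in range(len(X)) if j != i
        )
        votes = Counter(y[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def mutate(genes, n_genes, rng):
    """Swap one selected gene for a random unselected one (assumed operator)."""
    child = list(genes)
    candidates = [g for g in range(n_genes) if g not in child]
    child[rng.randrange(len(child))] = rng.choice(candidates)
    return tuple(sorted(child))


def ga_knn(X, y, subset_size=2, pop_size=20, generations=30, k=3, seed=0):
    """Search for a gene subset with high k-NN leave-one-out accuracy.

    Requires subset_size < number of genes so that mutation has candidates.
    """
    rng = random.Random(seed)
    n_genes = len(X[0])
    fitness = lambda genes: knn_loo_accuracy(X, y, genes, k)
    # Each chromosome is a sorted tuple of distinct gene indices.
    pop = [tuple(sorted(rng.sample(range(n_genes), subset_size)))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged, refill with mutated survivors.
        survivors = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = [mutate(rng.choice(survivors), n_genes, rng)
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    best = max(pop, key=fitness)
    return best, fitness(best)


def make_toy_data(n_per_class=10, n_genes=10, seed=0):
    """Hypothetical toy data: gene 0 separates the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] = rng.gauss(5.0 * label, 0.1)
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    genes, accuracy = ga_knn(X, y)
    print("selected genes:", genes, "leave-one-out accuracy:", accuracy)
```

Because the search is stochastic, rerunning `ga_knn` with different `seed` values and counting how often each gene is selected is one simple way to probe the repeatability of gene selection that the abstract discusses.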