Genomic Signal Processing and Statistics (GENSIPS), 2011 IEEE International Workshop on
Fast and parallelized greedy forward selection of genetic variants in Genome-wide association studies

Abstract

We present the application of a regularized least-squares based algorithm, known as greedy RLS, to perform wrapper-based feature selection on an entire genome-wide association dataset. Wrapper methods were previously thought to be computationally infeasible for studies of this type. The running time of the method grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. Moreover, we show how it can be further accelerated using parallel computation on multi-core processors. We tested the method on the Wellcome Trust Case Control Consortium's (WTCCC) Type 2 Diabetes - UK National Blood Service dataset, which consists of 3,382 subjects and 404,569 single nucleotide polymorphisms (SNPs). Our method is capable of high-speed feature selection: it selects the top 100 predictive SNPs in under five minutes on a high-end desktop, and it outperforms typical filter approaches in terms of predictive performance.
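
To make the idea of wrapper-based greedy forward selection concrete, here is a minimal sketch in Python with NumPy. It is not the authors' greedy RLS implementation. It is a naive version that refits a regularized least-squares (ridge) model for every candidate feature and scores it by leave-one-out error, so its cost is far higher than the linear running time described in the abstract. The function names `loo_error` and `greedy_forward_selection`, the regularization parameter `lam`, and the choice of leave-one-out squared error as the selection criterion are assumptions made for illustration.

```python
import numpy as np


def loo_error(X, y, lam):
    """Mean leave-one-out squared error of ridge regression (no intercept).

    Uses the standard shortcut for linear smoothers: the leave-one-out
    residual of example i equals e_i / (1 - H_ii), where H is the hat matrix.
    """
    n, d = X.shape
    # Hat matrix H = X (X^T X + lam * I)^{-1} X^T
    gram = X.T @ X + lam * np.eye(d)
    H = X @ np.linalg.solve(gram, X.T)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))


def greedy_forward_selection(X, y, k, lam=1.0):
    """Naive greedy forward selection (hypothetical helper, not greedy RLS).

    At each step, adds the feature whose inclusion gives the lowest
    leave-one-out error for a ridge model trained on the selected set.
    """
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in remaining:
            err = loo_error(X[:, selected + [j]], y, lam)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Calling `greedy_forward_selection(X, y, k=100)` on an examples-by-SNPs matrix `X` would return the indices of 100 selected columns in the order they were picked. Because each candidate is scored independently within a step, the inner loop over `remaining` is the natural place to parallelize across cores, although this sketch runs it serially.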