Variable Selection in Heterogeneous Datasets: A Truncated-rank Sparse Linear Mixed Model with Applications to Genome-wide Association Studies

Abstract

A fundamental and important challenge in modern datasets of ever increasing dimensionality is variable selection, which has taken on renewed interest recently due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by genome-wide association studies in genetics, we study the problem of variable selection for datasets arising from multiple subpopulations, when this underlying population structure is unknown to the researcher. We propose a unified framework for sparse variable selection that adaptively corrects for population structure via a low-rank linear mixed model. Most importantly, the proposed method does not require prior knowledge of individual relationships in the data and adaptively selects a covariance structure of the correct complexity. Through extensive experiments, we illustrate the effectiveness of this framework over existing methods. Further, we test our method on three different genomic datasets from plants, mice, and humans, and discuss the knowledge we discover with our model.
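
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes (correct for population structure with a low-rank covariance, then run a sparse regression), not the authors' implementation. It assumes the covariance is built from the genotype matrix itself as `K = X Xᵀ / p`, keeps only the top `rank` eigenpairs, whitens the data under `K_trunc + delta * I`, and then solves a Lasso with plain proximal gradient descent. The function names `truncated_rank_whitener` and `lasso_ista`, the fixed `delta`, and the hand-picked `rank` are all hypothetical choices for illustration; the paper states that its method selects the covariance complexity adaptively.

```python
import numpy as np


def truncated_rank_whitener(X, rank, delta=1.0):
    """Return W = diag((s_trunc + delta)^(-1/2)) @ U.T, where U, s are the
    eigenpairs of K = X X^T / p and s_trunc keeps only the top `rank`
    eigenvalues. `delta` (noise-to-signal ratio) is fixed here by assumption."""
    n, p = X.shape
    K = X @ X.T / p
    evals, evecs = np.linalg.eigh(K)           # eigenvalues in ascending order
    evals = np.clip(evals, 0.0, None)          # guard against tiny negatives
    trunc = np.zeros_like(evals)
    trunc[n - rank:] = evals[n - rank:]        # keep the `rank` largest
    scale = 1.0 / np.sqrt(trunc + delta)
    return (evecs * scale).T                   # equals diag(scale) @ evecs.T


def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 by ISTA."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / (np.linalg.norm(X, 2) ** 2)     # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return beta


# Toy usage (simulated, not data from the paper): two subpopulations with
# different allele frequencies, a group-level shift in the phenotype acting
# as a confounder, and three causal variants.
rng = np.random.default_rng(0)
n, p = 80, 200
group = np.repeat([0, 1], n // 2)
freq = np.where(group[:, None] == 0, 0.2, 0.6) * np.ones((n, p))
X = rng.binomial(2, freq).astype(float)
X -= X.mean(axis=0)
beta_true = np.zeros(p)
beta_true[:3] = 1.0
y = X @ beta_true + 2.0 * group + rng.normal(scale=0.5, size=n)
y -= y.mean()

W = truncated_rank_whitener(X, rank=1, delta=1.0)
beta_hat = lasso_ista(W @ X, W @ y, lam=0.05)
selected = np.flatnonzero(beta_hat)            # indices of selected variables
```

In this sketch the whitening step down-weights the directions of sample space dominated by subpopulation membership before the Lasso sees the data, which is the intuition behind applying a mixed-model correction ahead of sparse selection. The values of `rank`, `delta`, and `lam` would in practice need to be estimated or tuned rather than fixed as above.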