Making the Most of Clumping and Thresholding for Polygenic Scores

Abstract

Polygenic prediction has the potential to contribute to precision medicine. Clumping and thresholding (C+T) is a widely used method to derive polygenic scores. When using C+T, several p value thresholds are tested to maximize predictive ability of the derived polygenic scores. Along with this p value threshold, we propose to tune three other hyper-parameters for C+T. We implement an efficient way to derive thousands of different C+T scores corresponding to a grid over four hyper-parameters. For example, it takes a few hours to derive 123K different C+T scores for 300K individuals and 1M variants using 16 physical cores. We find that optimizing over these four hyper-parameters improves the predictive performance of C+T in both simulations and real data applications as compared to tuning only the p value threshold. A particularly large increase can be noted when predicting depression status, from an AUC of 0.557 (95% CI: [0.544–0.569]) when tuning only the p value threshold to an AUC of 0.592 (95% CI: [0.580–0.604]) when tuning all four hyper-parameters we propose for C+T. We further propose stacked clumping and thresholding (SCT), a polygenic score that results from stacking all derived C+T scores. Instead of choosing one set of hyper-parameters that maximizes prediction in some training set, SCT learns an optimal linear combination of all C+T scores by using an efficient penalized regression. We apply SCT to eight different case-control diseases in the UK biobank data and find that SCT substantially improves prediction accuracy with an average AUC increase of 0.035 over standard C+T.
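The grid search over C+T hyper-parameters described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three additional hyper-parameters, so the sketch assumes only two grid axes, a clumping squared-correlation threshold (`r2_max`) and the p value threshold (`p_max`). All function names, the toy genotypes, effect sizes, p values, and the correlation matrix are hypothetical and chosen purely for illustration.

```python
from itertools import product

def clump(order, corr, r2_max):
    """Greedy clumping: visit variants from most to least significant and keep
    a variant only if its squared correlation with every kept variant is
    below r2_max."""
    kept = []
    for j in order:
        if all(corr[j][k] ** 2 < r2_max for k in kept):
            kept.append(j)
    return kept

def ct_score(genotypes, betas, pvalues, corr, r2_max, p_max):
    """C+T score: clump first, then keep clumped variants with p <= p_max,
    then sum effect size times allele count over the retained variants."""
    order = sorted(range(len(pvalues)), key=lambda j: pvalues[j])
    kept = [j for j in clump(order, corr, r2_max) if pvalues[j] <= p_max]
    return [sum(betas[j] * row[j] for j in kept) for row in genotypes]

def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly,
    counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 6 individuals x 4 variants (allele counts 0/1/2).
genotypes = [
    [2, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 2, 1],
    [2, 2, 0, 0],
    [0, 1, 1, 2],
    [1, 0, 2, 0],
]
labels = [1, 1, 0, 1, 0, 0]           # 1 = case, 0 = control
betas = [0.5, 0.4, -0.2, 0.1]         # assumed GWAS effect sizes
pvalues = [1e-8, 1e-6, 0.03, 0.4]     # assumed GWAS p values
corr = [                              # assumed pairwise variant correlations
    [1.0, 0.9, 0.1, 0.3],
    [0.9, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.5],
    [0.3, 0.2, 0.5, 1.0],
]

# Evaluate every (r2_max, p_max) combination on the training data and keep
# the best one, which is the "tune on some training set" strategy of C+T.
grid = list(product([0.2, 0.5, 0.95], [1e-5, 0.05, 1.0]))
results = {
    (r2_max, p_max): auc(
        ct_score(genotypes, betas, pvalues, corr, r2_max, p_max), labels
    )
    for r2_max, p_max in grid
}
best = max(results, key=results.get)
print("best (r2_max, p_max):", best, "AUC:", results[best])
```

Standard C+T keeps only the single best grid point, as the last lines do. SCT, as the abstract describes it, would instead pass the scores from every grid point to a penalized regression and learn a linear combination of them; that step is omitted here.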

Bibliographic details

Journal: American Journal of Human Genetics
Authors: Florian Privé; Bjarni J. Vilhjálmsson; Hugues Aschard; Michael G.B. Blum
Year (volume), issue: 2019 (105), 6
Total pages: 9
Original format: PDF
Chinese Library Classification: Genetics
Keywords: polygenic risk scores; PRS; clumping and thresholding; C+T; complex traits; UK Biobank; stacking

Similar literature

1. Polygenic scores are an even bigger social hazard. Commentary on: Baverstock, K. (2019) Polygenic scores: Are they a public health hazard? Progress in Biophysics and Molecular Biology. Available online 6 August 2019 [J]. Progress in Biophysics and Molecular Biology: An International Review Journal. 2020
2. Comparing results of polygenic risk score and polygenic hazard score in prediction of age specific risk for developing Alzheimer’s disease [J]. Ganna Leonenko, Aura Frizzati, Rebecca Sims. Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association. 2018, issue 7
3. Potential use of clinical polygenic risk scores in psychiatry – ethical implications and communicating high polygenic risk [J]. A. C. Palk, S. Dalvie, J. de Vries. Philosophy, Ethics, and Humanities in Medicine. 2019, issue 1
4. Feature Selection for Polygenic Risk Scores using Genetic Algorithm and Network Science [C]. Zhendong Sha, Ting Hu, Yuanzhu Chen. IEEE Congress on Evolutionary Computation. 2021
5. Evaluating the Utility of Multiple Trait Methods for Estimating Polygenic Risk Scores [D]. Fu, Jingyuan. 2020
6. Using an Alzheimer’s Disease polygenic risk score to predict memory decline in black and white Americans over 14 years of follow-up. Running head: AD polygenic risk score predicting memory decline [O]. Jessica R. Marden, Elizabeth Rose Mayeda, Stefan Walter
7. Making the most of Clumping and Thresholding for polygenic scores [O]. Florian Privé, Bjarni J. Vilhjálmsson, Hugues Aschard. 2019
