Source: U.S. National Institutes of Health literature, Bioinformatics

High-dimensional pharmacogenetic prediction of a continuous trait using machine learning techniques with application to warfarin dose prediction in African Americans

Abstract

Motivation: With complex traits and diseases potentially influenced by thousands of genetic factors, and with current genotyping arrays containing millions of single nucleotide polymorphisms (SNPs), powerful high-dimensional statistical techniques are needed to model the genetic variance comprehensively. Machine learning techniques offer many advantages, including freedom from parametric assumptions, high power and flexibility.

Results: We applied three machine learning approaches, Random Forest Regression (RFR), Boosted Regression Tree (BRT) and Support Vector Regression (SVR), to the prediction of warfarin maintenance dose in a cohort of African Americans. We developed a multi-step approach that selects SNPs, builds prediction models with different subsets of the selected SNPs along with known associated genetic and environmental variables, and tests the discovered models in a cross-validation framework. Preliminary results indicate that our modeling approach gives much higher accuracy than previous models for warfarin dose prediction. A model size of 200 SNPs (in addition to the known genetic and environmental variables) gives the best accuracy. The R² between the predicted and actual square root of warfarin dose in this model was on average 66.4% for RFR, 57.8% for SVR and 56.9% for BRT. Thus RFR had the best accuracy, but all three techniques outperformed the currently published R² of 43% in a sample of mixed ethnicity and 27% in an African American sample. In summary, machine learning approaches for high-dimensional pharmacogenetic prediction, and for prediction of clinical continuous traits of interest, hold great promise and warrant further research.

Supplementary information: Supplementary information is available at Bioinformatics online.
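The multi-step approach described above (select SNPs, fit a model on the selected subset, evaluate by cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline. It makes several assumptions that are not in the abstract: the data are simulated, SNPs are ranked by absolute correlation with the response, a simple k-nearest-neighbour regressor stands in for RFR, BRT and SVR, and R² is taken to be the squared Pearson correlation between out-of-fold predictions and the actual square-root dose. All function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_snps(X, y, k):
    """Assumed selection rule: keep the k SNP columns with the largest |correlation| with y."""
    n_cols = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_cols)]
    return sorted(range(n_cols), key=lambda j: -scores[j])[:k]


def knn_predict(X_train, y_train, x, cols, n_neighbors=5):
    """Stand-in regressor (not RFR/BRT/SVR): mean response of the nearest training samples."""
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in cols), target)
        for row, target in zip(X_train, y_train)
    )
    nearest = dists[:n_neighbors]
    return sum(t for _, t in nearest) / len(nearest)


def cross_validated_r2(X, y, k_snps, n_folds=5, seed=0):
    """Squared correlation between out-of-fold predictions and the actual response."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    preds, actual = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        # Selection sees only the training fold, so held-out samples
        # cannot influence which SNPs enter the model.
        cols = select_snps(X_tr, y_tr, k_snps)
        for i in fold:
            preds.append(knn_predict(X_tr, y_tr, X[i], cols))
            actual.append(y[i])
    return pearson(preds, actual) ** 2


def simulate(n_samples=200, n_snps=60, n_causal=3, seed=1):
    """Simulated genotypes coded 0/1/2; the response plays the role of sqrt(dose)."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n_samples)]
    # Only the first n_causal SNPs affect the response; the rest are noise.
    y = [sum(row[:n_causal]) + rng.gauss(0, 0.5) for row in X]
    return X, y


X, y = simulate()
r2 = cross_validated_r2(X, y, k_snps=5)
print(f"cross-validated R^2 with 5 selected SNPs: {r2:.3f}")
```

Running the SNP selection inside each training fold, rather than once on the full data set, is the design choice that matters here: selecting on all samples first would let the held-out data leak into the model and inflate the reported R². Comparing model sizes, as the abstract does, would amount to calling `cross_validated_r2` with different values of `k_snps`.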
