Journal: Translational Psychiatry

Machine learning for effectively avoiding overfitting is a crucial strategy for the genetic prediction of polygenic psychiatric phenotypes



Abstract

The accuracy of previous genetic studies in predicting polygenic psychiatric phenotypes has been limited, mainly because of insufficient power to distinguish truly susceptible variants from null variants and the overfitting that results. A novel prediction algorithm, Smooth-Threshold Multivariate Genetic Prediction (STMGP), was applied to improve the genome-based prediction of psychiatric phenotypes by reducing overfitting through variant selection and a penalized regression model. Prediction models were trained on a cohort of 3685 subjects in Miyagi prefecture and validated on an independently recruited cohort of 3048 subjects in Iwate prefecture, Japan. Genotyping was performed using HumanOmniExpressExome BeadChip Arrays. The target phenotypes were depressive symptoms and simulated phenotypes with varying complexity and various effect-size distributions of risk alleles. The prediction accuracy and the degree of overfitting of STMGP were compared with those of state-of-the-art models (polygenic risk scores, genomic best linear-unbiased prediction, summary-data-based best linear-unbiased prediction, BayesR, and ridge regression). In the prediction of depressive symptoms, STMGP showed the highest prediction accuracy and the lowest degree of overfitting among the models compared, although the differences in prediction accuracy were not statistically significant. Simulation studies suggested that STMGP achieves better prediction accuracy for moderately polygenic phenotypes. Our investigations suggest the potential usefulness of STMGP for predicting polygenic psychiatric conditions while avoiding overfitting.
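The train-then-validate design described in the abstract can be illustrated with a small simulation. The sketch below is not STMGP and does not reproduce any of the compared models: it assumes a simple hard threshold on each variant's marginal correlation as a stand-in for variant selection, uses simulated allele counts rather than real genotypes, and takes the "degree of overfitting" to be the gap between training and validation correlation, which is an assumption about how that quantity might be measured. All function names and parameter values are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def simulate_cohort(rng, n_subjects, effects, maf=0.3, noise_sd=1.0):
    """Simulated data (assumption): 0/1/2 allele counts, phenotype = genetic value + noise."""
    genotypes = [
        [(rng.random() < maf) + (rng.random() < maf) for _ in effects]
        for _ in range(n_subjects)
    ]
    phenotype = [
        sum(g * b for g, b in zip(row, effects)) + rng.gauss(0.0, noise_sd)
        for row in genotypes
    ]
    return genotypes, phenotype


def fit_weights(genotypes, phenotype, min_abs_r=0.0):
    """Per-variant least-squares slopes; variants with |r| below min_abs_r get weight 0."""
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    weights = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        mean_x = sum(column) / n
        sxx = sum((x - mean_x) ** 2 for x in column)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(column, phenotype))
        slope = sxy / sxx if sxx > 0 else 0.0
        keep = abs(pearson(column, phenotype)) >= min_abs_r
        weights.append(slope if keep else 0.0)
    return weights


def predict(genotypes, weights):
    """Weighted sum of allele counts for each subject."""
    return [sum(g * w for g, w in zip(row, weights)) for row in genotypes]


def overfitting_gap(weights, train, validation):
    """Return (training r, validation r, training r minus validation r)."""
    r_train = pearson(predict(train[0], weights), train[1])
    r_valid = pearson(predict(validation[0], weights), validation[1])
    return r_train, r_valid, r_train - r_valid


# Illustrative values only: 10 causal variants among 200, two independent cohorts.
rng = random.Random(0)
effects = [0.3] * 10 + [0.0] * 190
train = simulate_cohort(rng, 300, effects)
validation = simulate_cohort(rng, 300, effects)
for threshold in (0.0, 0.15):
    w = fit_weights(train[0], train[1], min_abs_r=threshold)
    r_train, r_valid, gap = overfitting_gap(w, train, validation)
    print(f"threshold={threshold}: train r={r_train:.2f}, "
          f"validation r={r_valid:.2f}, gap={gap:.2f}")
```

Comparing the two thresholds shows how screening out weakly associated variants changes the gap between training and validation accuracy in this toy setting. It says nothing about the results reported for the Miyagi and Iwate cohorts.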
