Journal: Journal of computational biology

Machine Learning-Based Method for Obesity Risk Evaluation Using Single-Nucleotide Polymorphisms Derived from Next-Generation Sequencing


Abstract

Obesity is a major risk factor for many metabolic diseases. Single-nucleotide polymorphisms (SNPs) derived from next-generation sequencing (NGS) offer genome-wide insight into the genetic characteristics of obese individuals. However, interpreting these SNP data for clinical application is difficult because of the high complexity of NGS data. In this study, SNP-based obesity risk prediction models were therefore designed using three machine learning (ML) methods: support vector machine (SVM), k-nearest neighbor, and decision tree (DT). Clinicopathological features, comprising 130 SNPs, sex, and age, were obtained from 139 eligible individuals. Various feature selection methods, such as stepwise multivariate linear regression (MLR), DT, and genetic algorithms, were applied to select informative features for the prediction models. Multivariate logistic regression was used to evaluate the importance of the selected features. The predictive performance of the models trained on each feature set was evaluated by fivefold cross-validation, and three measures (accuracy, sensitivity, and specificity) were used to compare the models. Nine SNPs (rs10501087, rs17700144, rs2287019, rs534870, rs660339, rs7081678, rs718314, rs9816226, and rs984222) were selected by stepwise MLR. Given the same training features, the SVM model significantly outperformed the other classifiers, achieving 70.77% accuracy, 80.09% sensitivity, and 63.02% specificity. These results indicate that the selected SNPs were effective in detecting obesity risk and that the ML-based method provides a feasible means of conducting preliminary analyses of the genetic characteristics of obesity.
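The three evaluation measures and the fivefold split named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `classification_metrics` and `fivefold_indices`, the label coding (1 = obese, 0 = non-obese), and the contiguous, unshuffled fold layout are all assumptions made for illustration, and the example labels are invented rather than taken from the study.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumed coding (hypothetical): 1 = obese, 0 = non-obese.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


def fivefold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold. A real
    analysis would usually shuffle or stratify first; that is omitted here.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Invented toy labels, only to exercise the functions.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)

# 139 matches the number of individuals reported in the abstract.
folds = fivefold_indices(139)
```

With 139 samples the sketch yields four test folds of 28 and one of 27. In a cross-validation run, a classifier would be fitted on each `train` list, scored on the matching `test` list with `classification_metrics`, and the three measures averaged across the five folds.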
