Evaluation of Phenotype Classification Methods for Obesity Using Direct to Consumer Genetic Data

Abstract

Direct-to-consumer genetic testing services are becoming increasingly common. Consumers of such services are sharing their genetic and clinical information with the research community to facilitate the extraction of knowledge about different conditions. In this paper, we build on these services to analyse the genetic data of people with different BMI levels and determine the immediate and long-term risk factors associated with obesity. Using web scraping techniques, we created a dataset containing publicly available information about 230 participants from the Personal Genome Project. We then analysed the dataset to identify genetic variants associated with high BMI levels, following standard quality control and association analysis protocols for genome-wide association analysis. We applied a combination of a Random Forest based feature selection algorithm and a Support Vector Machine with a Radial Basis Function kernel to the filtered dataset. Using a robust data science methodology, our approach identified obesity-related genetic variants to be used as features when predicting individual obesity susceptibility. The results reveal that the subset of features obtained through the Random Forest based algorithm improves the performance of the classifier when compared to the top statistically significant genetic variants identified in logistic regression. The Support Vector Machine showed the best results, with sensitivity = 81%, specificity = 83% and area under the curve = 92%, when the model was trained with the top fifteen features selected by Boruta.
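The three figures reported in the abstract (sensitivity, specificity, area under the curve) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the label coding (1 = obese, 0 = not obese), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical): 1 = obese, 0 = not obese.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

sens, spec = sensitivity_specificity(toy_labels, toy_predictions)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} "
      f"auc={roc_auc(toy_labels, toy_scores):.2f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas the area under the curve summarises ranking quality across all thresholds. This is why a classifier can report an area under the curve noticeably higher than either of the other two figures.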

Bibliographic record

Source
International Conference on Intelligent Computing | 2017 | 841p | 13 pages
Authors
Casimiro Aday Curbelo Montanez; Paul Fergus; Abir Hussain; Dhiya Al-Jumeily; Mehmet Tevfik Dorak; Rosni Abdullah
Original format
PDF
Chinese Library Classification
TP18-53
Keywords
Bioinformatics; Data science; Machine learning; Feature selection; Genetics; Obesity SNPs

Similar literature

1. Depot- and obesity-related differences in adipogenesis [J]. Julie Lessard, André Tchernof. Clinical Lipidology, 2012, Issue 5

Abstract: Adipocyte hypertrophy and hyperplasia are known to facilitate lipid storage in adipose tissues by increasing adipocyte cell size and number, respectively. Adipogenesis is the process resulting in adipose tissue hyperplasia. Although depot-specific differences and obesity-related modulation of adipocyte size are well documented, available data on adipogenesis and adipose tissue hyperplasia are less conclusive. Most studies support a reduction of adipogenesis in the obese state. Preadipocytes of the subcutaneous fat depot appear to be more responsive to adipogenic stimulation compared with those from visceral fat compartments in most studies. A number of studies support the notion that adipose tissue expansion through hyperplasia reduces ectopic lipid excess and obesity-related complications. Several genetic variants have been identified in the genes coding for adipogenesis-regulating proteins. While some of these variants have been clearly associated with the phenotypes of obesity and obesity-related alterations, available data highlight the importance of considering gene–gene and gene–diet interactions.

2. Direct-to-Consumer Genetic Testing Data Privacy: Key Concerns and Recommendations Based on Consumer Perspectives [J]. Rachele M. Hendricks-Sturrup, Christine Y. Lu. Journal of Personalized Medicine, 2019, Issue 2

3. Determinants of Intentions in Taking a Direct-to-Consumer Genetic Test for the Obesity Gene: A Test of the Integrative Model [J]. Dong Yue, Branscum Paul. Annals of Behavioral Medicine, 2018, Suppl. 1

4. Evaluation of Phenotype Classification Methods for Obesity Using Direct to Consumer Genetic Data [C]. Casimiro Aday Curbelo Montanez, Paul Fergus, Abir Hussain, et al. International Conference on Intelligent Computing, 2017

5. Comparative Analysis of Feature Selection and Classification Methods for Epigenetic Methylation Data [D]. Kleyn, Aaron. 2021

6. Direct-to-Consumer Genetic Testing Data Privacy: Key Concerns and Recommendations Based on Consumer Perspectives [O]. Rachele M. Hendricks-Sturrup, Christine Y. Lu. 2019

7. Evaluation of Phenotype Classification Methods for Obesity using Direct to Consumer Genetic Data [O]. Curbelo Montañez CA, Fergus P, Hussain A, 100
