Suitability of random forest analysis for epidemiological research: Exploring sociodemographic and lifestyle-related risk factors of overweight in a cross-sectional design

Kanerva Noora; Kontto Jukka; Erkkola Maijaliisa; Nevalainen Jaakko; Mannisto Satu

Abstract

Aims: Factors that contribute to the development of overweight are numerous and form a complex structure with many unknown interactions and associations. We aimed to explore this structure (i.e. the mutual importance or hierarchy of sociodemographic and lifestyle-related risk factors of being overweight) using a machine-learning technique called random forest (RF). The results were compared with traditional logistic regression (LR) analysis. Methods: The cross-sectional FINRISK 2007 Study included 4757 Finns (aged 25-74 years). Information on participants' lifestyle and sociodemographic characteristics was collected with questionnaires. Diet was assessed using a validated food-frequency questionnaire. Height and weight were measured. Participants with a body mass index (BMI) ≥ 25 kg/m² were classified as overweight. The R statistical software was used to run the RF analysis (the 'randomForest' package) to derive estimates for variable importance and out-of-bag error, which were compared to those of an LR model. Results: In total, 704 (32%) men and 1119 (44%) women had normal BMI, whereas 1502 (69%) men and 1432 (57%) women had a BMI ≥ 25 kg/m². Estimated error rates for the models were similar (RF vs. LR: 42% vs. 40% for men, 38% vs. 35% for women). Both models ranked age, education and physical activity as the most important risk factors for being overweight, but RF ranked macronutrients (carbohydrates and protein) as more important than LR did. Conclusions: RF did not demonstrate higher power in variable selection compared to LR in our study. The features of RF are more likely to appear beneficial in settings with a larger number of predictors.
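The out-of-bag (OOB) error mentioned in the abstract can be illustrated with a small sketch. This is not the study's analysis, which used the R 'randomForest' package on FINRISK data. It is a simplified Python illustration on synthetic data, using bagged one-split trees (decision stumps) in place of full trees. The function names (`fit_stump`, `predict_stump`, `oob_error`, `make_data`), the parameter values and the data-generating rule are all assumptions made for illustration.

```python
import random


def fit_stump(rows, labels, features):
    """Return the (feature, threshold, label_if_above) split with the fewest errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            for label_if_above in (0, 1):
                errors = sum(
                    (label_if_above if row[f] >= threshold else 1 - label_if_above) != y
                    for row, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, label_if_above)
    return best[1:]


def predict_stump(stump, row):
    """Predict class 0 or 1 for one row from a fitted stump."""
    f, threshold, label_if_above = stump
    return label_if_above if row[f] >= threshold else 1 - label_if_above


def oob_error(rows, labels, n_trees=30, mtry=2, seed=0):
    """Bagged stumps: each row is scored only by stumps that never saw it."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    votes = [[0, 0] for _ in range(n)]  # votes[i][c] = OOB votes for class c
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        features = rng.sample(range(p), mtry)          # random feature subset
        stump = fit_stump(
            [rows[i] for i in sample], [labels[i] for i in sample], features
        )
        for i in set(range(n)) - set(sample):          # rows left out of the sample
            votes[i][predict_stump(stump, rows[i])] += 1
    # Majority vote over OOB stumps; rows that were never OOB are skipped.
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    wrong = sum((0 if v[0] >= v[1] else 1) != y for v, y in scored)
    return wrong / len(scored)


def make_data(n=150, seed=1):
    """Synthetic data (assumption): feature 0 drives the label, 10% label noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(4)]
        y = int(row[0] > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        rows.append(row)
        labels.append(y)
    return rows, labels


rows, labels = make_data()
print(f"OOB error estimate: {oob_error(rows, labels):.2f}")
```

Because every observation is predicted only by trees whose bootstrap sample excluded it, the OOB error acts as a built-in estimate of out-of-sample error without a separate test set, which is what makes it comparable to an error rate estimated for a regression model.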

Bibliographic details

Source
Scandinavian Journal of Public Health | 2018, Issue 5 | 8 pages
Authors
Kanerva Noora; Kontto Jukka; Erkkola Maijaliisa; Nevalainen Jaakko; Mannisto Satu
Author affiliations

Univ Helsinki, Dept Publ Hlth, POB 20, Helsinki 00140, Finland;
Natl Inst Hlth & Welf, Dept Publ Hlth Solut, Helsinki, Finland;
Univ Helsinki, Nutr Unit, Helsinki, Finland;
Univ Tampere, Sch Hlth Sci, Tampere, Finland;
Natl Inst Hlth & Welf, Dept Publ Hlth Solut, Helsinki, Finland

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Urban residential hygiene
Keywords
Machine learning; mutual importance; obesity; random forest; risk factor;
