Journal of Science Education and Technology

A Meta-Analysis of Machine Learning-Based Science Assessments: Factors Impacting Machine-Human Score Agreements


Abstract

Machine learning (ML) has been increasingly employed in science assessment to facilitate automatic scoring, although with varying degrees of success (i.e., varying magnitudes of machine-human score agreement [MHA]). Little work has empirically examined the factors driving MHA disparities in this growing field, which constrains the improvement of machine scoring capacity and its wider application in science education. We performed a meta-analysis of 110 studies of MHAs in order to identify the factors most strongly contributing to scoring success (i.e., high Cohen's kappa [kappa]). We empirically examined six factors proposed as contributors to MHA magnitudes: algorithm, subject domain, assessment format, construct, school level, and machine supervision type. Our analyses of the 110 MHAs revealed substantial heterogeneity in kappa (weighted mean = .64; range = .09-.97). Using three-level random-effects modeling, MHA heterogeneity was explained by variability both within publications (i.e., the assessment task level: 82.6%) and between publications (i.e., the individual study level: 16.7%). Our results also suggest that all six factors have significant moderator effects on scoring success magnitudes. Among these, algorithm and subject domain had significantly larger effects than the other factors, suggesting that technical features and assessment-external features might be primary targets for improving MHAs and ML-based science assessments.
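The agreement metric at the center of this meta-analysis, Cohen's kappa, corrects raw machine-human agreement for agreement expected by chance. A minimal sketch of the computation is below; the machine and human score vectors are invented purely for illustration and are not from the studies analyzed.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters: (p_o - p_e) / (1 - p_e)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement under independence, from each rater's marginals.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical machine vs. human scores on a three-level rubric (0/1/2).
human   = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
machine = [0, 1, 2, 0, 0, 2, 1, 2, 0, 2]
print(round(cohen_kappa(human, machine), 2))  # raw agreement is .80; kappa is lower
```

Because kappa discounts chance agreement, it is a stricter criterion than percent agreement, which is why the meta-analysis reports kappa rather than raw MHA percentages.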
