Journal: Medical Teacher
Building reliable and generalizable clerkship competency assessments: Impact of 'hawk-dove' correction

Abstract

Purpose: Systematic differences in how raters approach student assessment may produce lenient or stringent assessment scores. This study examines the generalizability of workplace-based competency assessments of medical students, including the impact of adjusting scores for rater leniency and stringency.

Methods: Data were collected from summative clerkship assessments completed for 204 students during the 2017-2018 clerkship year at a single institution. Generalizability theory was used to explore the variance attributed to different facets (rater, learner, item, and competency domain) through three unbalanced random-effects models fitted by clerkship, one of which applied assessor stringency-leniency adjustments.

Results: In the original assessments, only 4-8% of the variance was attributed to the student, with the remainder being rater variance and error. Aggregating items to create a composite score increased the variability attributable to the student (5-13% of variance). Applying a stringency-leniency ('hawk-dove') correction substantially increased both the variance attributed to the student (14.8-17.8%) and reliability. Controlling for assessor leniency/stringency reduced measurement error, decreasing the number of assessments required for generalizability from 16-50 to 11-14.

Conclusions: Consistent with prior research, most of the variance in competency assessment scores was attributable to raters, with only a small proportion attributed to the student. Making stringency-leniency corrections using rater-adjusted scores improved the psychometric characteristics of assessment scores.
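The link between the student's share of variance and the number of assessments needed can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not give the reliability threshold or the form of the correction. The sketch assumes a target coefficient of 0.7, treats all non-student variance as single-observation error that averages out over n assessments, and uses a simple mean-centering form of a hawk-dove adjustment. The function names `assessments_needed` and `hawk_dove_adjust` are hypothetical.

```python
import math
from collections import defaultdict


def assessments_needed(student_share, target=0.7):
    """Smallest n for which the reliability of a mean of n assessments
    reaches `target`, assuming (a simplification) that every source of
    variance other than the student is error that shrinks as 1/n:

        G(n) = p / (p + (1 - p) / n),  p = student share of variance.
    """
    if not 0 < student_share < 1 or not 0 < target < 1:
        raise ValueError("student_share and target must lie strictly between 0 and 1")
    n = (target / (1 - target)) * ((1 - student_share) / student_share)
    return math.ceil(n)


def hawk_dove_adjust(ratings):
    """Illustrative stringency-leniency adjustment (assumed form).

    `ratings` is a list of (rater, student, score) tuples. Each score is
    shifted by the gap between that rater's mean and the grand mean, so
    a stringent rater ('hawk') is shifted up and a lenient one ('dove')
    is shifted down.
    """
    grand_mean = sum(score for _, _, score in ratings) / len(ratings)
    scores_by_rater = defaultdict(list)
    for rater, _, score in ratings:
        scores_by_rater[rater].append(score)
    offset = {
        rater: sum(scores) / len(scores) - grand_mean
        for rater, scores in scores_by_rater.items()
    }
    return [(rater, student, score - offset[rater]) for rater, student, score in ratings]


if __name__ == "__main__":
    # Student shares of variance reported after the hawk-dove correction.
    for share in (0.148, 0.178):
        print(share, assessments_needed(share))
```

Under these assumptions, student shares of 14.8% and 17.8% give 14 and 11 assessments respectively, which falls within the 11-14 range reported above. The agreement may be coincidental, since the study's actual decision-study settings are not stated in the abstract.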
