Advances in health sciences education: theory and practice

Inter-rater reliability and generalizability of patient note scores using a scoring rubric based on the USMLE Step-2 CS format

Abstract

Recent changes to the patient note (PN) format of the United States Medical Licensing Examination have challenged medical schools to improve the instruction and assessment of students taking the Step-2 clinical skills examination. The purpose of this study was to gather validity evidence regarding response process and internal structure, focusing on inter-rater reliability and generalizability, to determine whether a locally developed PN scoring rubric and scoring guidelines could yield reproducible PN scores. A randomly selected subsample of historical data (post-encounter PNs from 55 of 177 medical students) was rescored by six trained faculty raters in November-December 2014. Inter-rater reliability (percent exact agreement and kappa) was calculated for five standardized patient cases administered in a local graduation competency examination. Generalizability studies were conducted to examine the overall reliability. Qualitative data were collected through surveys and a rater-debriefing meeting. The overall inter-rater reliability (weighted kappa) was .79 (Documentation = .63, Differential Diagnosis = .90, Justification = .48, and Workup = .54). The majority of score variance was due to case specificity (13%) and case-task specificity (31%), indicating differences in student performance by case and by case-task interactions. Variance associated with raters and their interactions was modest (<5%). Raters felt that Justification was the most difficult task to score and that having case- and level-specific scoring guidelines during training was most helpful for calibration. The overall inter-rater reliability indicates a high level of confidence in the consistency of note scores. Designs for scoring notes may optimize reliability by balancing the number of raters and cases.
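As a point of reference for the statistics reported above, the following is a minimal sketch of how percent exact agreement and a weighted Cohen's kappa can be computed for one scoring task from paired rater scores. The rater data, the assumed 1-4 rubric scale, and the quadratic weighting scheme are illustrative assumptions only; the study's own scoring pipeline and weighting choice are not described in the abstract.

# Illustrative only: paired rubric scores (assumed 1-4 scale) from two
# raters scoring the same set of patient notes on a single task.
from sklearn.metrics import cohen_kappa_score

rater_a = [3, 4, 2, 4, 1, 3, 2, 4, 3, 2]
rater_b = [3, 4, 3, 4, 1, 3, 2, 3, 3, 2]

# Percent exact agreement: proportion of notes on which both raters
# assigned the identical rubric level.
exact_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Weighted kappa: chance-corrected agreement that penalizes distant
# disagreements more heavily than adjacent ones (quadratic weights
# assumed here; the abstract does not state which weighting was used).
weighted_kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

print(f"Exact agreement: {exact_agreement:.2f}")
print(f"Weighted kappa:  {weighted_kappa:.2f}")

The generalizability percentages in the abstract (13% for cases, 31% for case-task interactions, under 5% for rater effects) are shares of total score variance estimated in the G-studies. Because rater-related variance is small relative to case-related variance, adding cases improves score reliability more than adding raters, which is the trade-off the final sentence of the abstract points to.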
