Advances in health sciences education: theory and practice

Inter-rater reliability and generalizability of patient note scores using a scoring rubric based on the USMLE Step-2 CS format

Abstract

Recent changes to the patient note (PN) format of the United States Medical Licensing Examination have challenged medical schools to improve the instruction and assessment of students taking the Step-2 clinical skills examination. The purpose of this study was to gather validity evidence regarding response process and internal structure, focusing on inter-rater reliability and generalizability, to determine whether a locally developed PN scoring rubric and scoring guidelines could yield reproducible PN scores. A randomly selected subsample of historical data (post-encounter PNs from 55 of 177 medical students) was rescored by six trained faculty raters in November-December 2014. Inter-rater reliability (percent exact agreement and kappa) was calculated for five standardized patient cases administered in a local graduation competency examination. Generalizability studies were conducted to examine the overall reliability. Qualitative data were collected through surveys and a rater-debriefing meeting. The overall inter-rater reliability (weighted kappa) was .79 (Documentation = .63, Differential Diagnosis = .90, Justification = .48, and Workup = .54). The majority of score variance was due to case specificity (13%) and case-task specificity (31%), indicating differences in student performance by case and by case-task interactions. Variance associated with raters and their interactions was modest (<5%). Raters felt that Justification was the most difficult task to score and that having case- and level-specific scoring guidelines during training was most helpful for calibration. The overall inter-rater reliability indicates a high level of confidence in the consistency of note scores. Designs for scoring notes may optimize reliability by balancing the number of raters and cases.
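As a point of reference for the statistics reported above, the following is a minimal sketch of how percent exact agreement and a weighted Cohen's kappa can be computed for one scoring task from paired rater scores. The rater data, the assumed 1-4 rubric scale, and the quadratic weighting scheme are illustrative assumptions only; the study's own scoring pipeline and weighting choice are not described in the abstract.

# Illustrative only: paired rubric scores (assumed 1-4 scale) from two
# raters scoring the same set of patient notes on a single task.
from sklearn.metrics import cohen_kappa_score

rater_a = [3, 4, 2, 4, 1, 3, 2, 4, 3, 2]
rater_b = [3, 4, 3, 4, 1, 3, 2, 3, 3, 2]

# Percent exact agreement: proportion of notes on which both raters
# assigned the identical rubric level.
exact_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Weighted kappa: chance-corrected agreement that penalizes distant
# disagreements more heavily than adjacent ones (quadratic weights
# assumed here; the abstract does not state which weighting was used).
weighted_kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

print(f"Exact agreement: {exact_agreement:.2f}")
print(f"Weighted kappa:  {weighted_kappa:.2f}")

The generalizability percentages in the abstract (13% for cases, 31% for case-task interactions, under 5% for rater effects) are shares of total score variance estimated in the G-studies. Because rater-related variance is small relative to case-related variance, adding cases improves score reliability more than adding raters, which is the trade-off the final sentence of the abstract points to.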
