Evaluating the clinical competence of trainees remains a challenging task. For the most part, evaluations still rely largely on faculty members' observations of residents' performance, such as on the ubiquitous in-training evaluation report (ITER). However, traditional instruments are notoriously unreliable for many reasons, beginning with a lack of direct observations on which to base assessments. Even when performance is directly observed, assessment is often compromised by two factors: construct irrelevance, in which issues unrelated to the construct under study unduly affect the assessment, and construct under-representation, whereby we assess only what we can observe and then extrapolate to the entire construct.