A Comparison of Two Methods for Computing IRT Scores From the Number-Correct Score
Abstract: Two estimates of item response theory latent trait scores (θ) based on the summed, number-correct score, X, were compared: (a) the test characteristic curve (TCC) estimates, θTCC, in which the TCC is inverted so that a value of θ can be estimated directly from X, and (b) the expected a posteriori (Bayesian posterior mean) estimates, θEAP. Using data from tenth-grade English and math tests, the conditional expected values of θTCC and θEAP (the latter under both normal N(0, 1) and N(0, 10) priors), along with their conditional standard errors, were computed and plotted against a grid of actual θ values. Under the N(0, 1) prior, the θEAPs showed considerably smaller standard errors of measurement than the θTCCs, especially in the tails of the θ distribution. However, the bias of the θEAPs based on the N(0, 1) prior was substantial at the extremes of the θ distribution. The N(0, 10) prior reduced the bias of the θEAPs but increased their standard errors. These results were not unexpected, given the nearly universal trade-off between bias and standard error. The choice among the three summed-score θ estimates examined here depends largely on which of the two major sources of distortion, bias or standard error, is the more harmful.
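To make the two summed-score estimators concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 2PL model with made-up item parameters (`ITEMS` is hypothetical and not taken from the tests in the study), inverts the TCC by bisection, and computes the summed-score EAP using the Lord-Wingersky recursion with a simple grid quadrature. It also assumes that the second argument of N(0, 10) is a variance; the abstract does not say whether it is a variance or a standard deviation.

```python
import math

# Hypothetical 2PL item parameters (discrimination a, difficulty b).
# These are illustrative values only, not parameters from the study.
ITEMS = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0)]


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def tcc(theta, items=ITEMS):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(p_correct(theta, a, b) for a, b in items)


def theta_tcc(x, items=ITEMS, lo=-6.0, hi=6.0, tol=1e-8):
    """Invert the TCC by bisection.

    Scores at or beyond the TCC values at the bounds (e.g. zero or
    perfect scores) have no finite solution and are clamped to lo or hi.
    """
    if x <= tcc(lo, items):
        return lo
    if x >= tcc(hi, items):
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tcc(mid, items) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def score_likelihood(theta, items=ITEMS):
    """Lord-Wingersky recursion: list of P(X = x | theta) for x = 0..n."""
    dist = [1.0]
    for a, b in items:
        p = p_correct(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p)   # item answered incorrectly
            new[x + 1] += prob * p       # item answered correctly
        dist = new
    return dist


def theta_eap(x, prior_var=1.0, items=ITEMS, bound=12.0, n_points=241):
    """Posterior mean of theta given summed score x under a N(0, prior_var) prior.

    Uses an evenly spaced grid on [-bound, bound]; the prior is truncated there.
    """
    step = 2.0 * bound / (n_points - 1)
    grid = [-bound + i * step for i in range(n_points)]
    weights = [
        math.exp(-0.5 * t * t / prior_var) * score_likelihood(t, items)[x]
        for t in grid
    ]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


if __name__ == "__main__":
    for score in range(len(ITEMS) + 1):
        print(score, theta_tcc(score), theta_eap(score, 1.0), theta_eap(score, 10.0))
```

In this toy setup, the EAP under the tighter prior is pulled toward the prior mean of 0, while the wider prior and the TCC inversion give more extreme estimates at high and low scores. This is the bias side of the trade-off the abstract describes; the conditional standard errors would require an additional step of averaging over the score distribution at each θ, which is not shown here.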