Journal: Psychometrika

High-Stakes Testing Case Study: A Latent Variable Approach for Assessing Measurement and Prediction Invariance

Abstract

Differences across demographic groups in prediction systems involving test scores remain a thorny and unresolved scientific, professional, and societal concern. Our case study uses a two-stage least squares (2SLS) estimator to jointly assess measurement invariance and prediction invariance in high-stakes testing. We therefore examined differences across groups based on latent rather than observed scores, using data from The College Board for 176 colleges and universities. Measurement invariance was rejected for the SAT mathematics (SAT-M) subtest at the 0.01 level for 74.5% and 29.9% of cohorts in the Black versus White and Hispanic versus White comparisons, respectively. On average, Black students with the same standing on a common factor had observed SAT-M scores nearly a third of a standard deviation lower than comparable White students. We also found evidence that group differences in SAT-M measurement intercepts may partly explain the well-known finding of observed differences in prediction intercepts. In addition, results indicated that nearly a quarter of the statistically significant observed intercept differences were no longer statistically significant at the 0.05 level once predictor measurement error was accounted for using the 2SLS procedure. Our joint measurement and prediction invariance approach based on latent scores opens the door to a new high-stakes testing research agenda whose goal is not simply to assess whether observed group-based differences exist, how large they are, and in which direction they run. Rather, the goal is to assess the causal chain starting with underlying theoretical mechanisms (e.g., contextual factors, differences in latent predictor scores) that affect the size and direction of any observed differences.
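The role of predictor measurement error in observed intercept differences can be illustrated with a small simulation. The sketch below is not the authors' procedure or data. It is a generic instrumental-variable 2SLS example on synthetic data, and every name and parameter value in it (the group mean gap of 0.8, the error standard deviations, the second indicator `x2` used as an instrument) is a hypothetical assumption chosen for illustration. The outcome is generated with identical intercept and slope for both groups at the latent level, so any group intercept difference in the observed-score regression is an artifact of measurement error.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients for y ~ X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(X, Z, y):
    """2SLS: project regressors X onto instruments Z, then regress y on the fit."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # stage 1
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]    # stage 2

def simulate(n=20000, seed=0):
    """Synthetic data: all values here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    g = rng.integers(0, 2, n).astype(float)       # group indicator (0 or 1)
    xi = rng.normal(0.0, 1.0, n) - 0.8 * g        # latent predictor, lower mean in group 1
    x1 = xi + rng.normal(0.0, 0.7, n)             # observed score with measurement error
    x2 = xi + rng.normal(0.0, 0.7, n)             # second indicator, independent error
    y = 1.0 + 0.5 * xi + rng.normal(0.0, 0.5, n)  # same intercept and slope in both groups
    return g, x1, x2, y

g, x1, x2, y = simulate()
ones = np.ones_like(g)

X = np.column_stack([ones, g, x1])  # regressors: intercept, group, observed score
Z = np.column_stack([ones, g, x2])  # instruments: x2 stands in for error-prone x1

b_ols = ols(X, y)
b_2sls = two_stage_least_squares(X, Z, y)

print("OLS  [intercept, group, slope]:", np.round(b_ols, 3))
print("2SLS [intercept, group, slope]:", np.round(b_2sls, 3))
```

Under these assumptions, the observed-score regression attenuates the slope and produces a nonzero group coefficient even though none exists at the latent level, while the 2SLS estimate of the group coefficient should sit close to zero. Whether the same pattern holds in real admissions data depends on the validity of the instruments, which this sketch simply assumes.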
