
A comparison of item selection procedures using different ability estimation methods in computerized adaptive testing based on the generalized partial credit model.


Abstract

Computerized adaptive testing (CAT) provides a highly efficient alternative to the paper-and-pencil test. By selecting items that match examinees' ability levels, CAT can not only shorten test length and administration time but also increase measurement precision and reduce measurement error.

In CAT, maximum information (MI) is the most widely used item selection procedure. However, the major challenge with MI is the attenuation paradox, which arises because the MI algorithm may select items that are not well targeted at an examinee's true ability level, resulting in more error in subsequent ability estimates. The solution is to find an alternative item selection procedure or an appropriate ability estimation method. CAT studies have not investigated the association between these two components of a CAT system based on polytomous IRT models.

The present study compared the performance of four item selection procedures (MI, MPWI, MEI, and MEPV) across four ability estimation methods (MLE, WLE, EAP-N, and EAP-PS) under a mixed-format CAT based on the generalized partial credit model (GPCM). The test-unit pool and generated responses were based on test units calibrated from an operational national test that included both independent dichotomous items and testlets. Several test conditions were manipulated: an unconstrained CAT and a constrained CAT in which the CCAT procedure was used for content balancing and the progressive-restricted procedure with a maximum exposure rate of 0.19 (PR19) served as the exposure control. The performance of the various CAT conditions was evaluated in terms of measurement precision, exposure control properties, and the extent of selected-test-unit overlap.

Results suggested that all item selection procedures, regardless of ability estimation method, performed equally well on all evaluation indices across the two CAT conditions. The MEPV procedure, however, was favorable in terms of a slightly lower maximum exposure rate, better pool utilization, and reduced test and selected-test-unit overlap compared with the other three item selection procedures when both the CCAT and PR19 procedures were implemented. It is not necessary to implement the sophisticated and computing-intensive Bayesian item selection procedures across ability estimation methods under the GPCM-based CAT.

In terms of the ability estimation methods, MLE, WLE, and the two EAP methods, regardless of item selection procedure, did not produce practical differences on any evaluation index across the two CAT conditions. The WLE method, however, generated significantly fewer non-convergent cases than did the MLE method. It was concluded that the WLE method, instead of MLE, should be considered, because non-convergence is less of an issue. The EAP estimation method, on the other hand, should be used with caution unless an appropriate prior theta distribution is specified.
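
To make the two components discussed in the abstract concrete, here is a minimal Python sketch of MI item selection and EAP ability estimation under the GPCM. It is not the dissertation's implementation. The function names, the pool layout (a list of `(a, b)` pairs), the omission of the D = 1.7 scaling constant, the standard normal prior, and the quadrature grid on [-4, 4] are all assumptions made for illustration. Content balancing (CCAT) and exposure control (PR19) are not modeled.

```python
import math

def gpcm_probs(theta, a, b):
    """Category probabilities for one GPCM item.

    a: discrimination; b: step difficulties b_1..b_m, giving categories 0..m.
    (Hypothetical parameter layout; no D = 1.7 scaling is applied.)
    """
    # z_k = sum_{v=1..k} a * (theta - b_v), with z_0 = 0
    z = [0.0]
    for step in b:
        z.append(z[-1] + a * (theta - step))
    z_max = max(z)  # subtract the max before exponentiating, for stability
    ez = [math.exp(v - z_max) for v in z]
    total = sum(ez)
    return [e / total for e in ez]

def gpcm_info(theta, a, b):
    """Fisher information of a GPCM item: a^2 times the score variance."""
    p = gpcm_probs(theta, a, b)
    mean = sum(k * pk for k, pk in enumerate(p))
    second = sum(k * k * pk for k, pk in enumerate(p))
    return a * a * (second - mean * mean)

def select_max_info(theta_hat, pool, administered):
    """MI rule: pick the unused item with the largest information at theta_hat."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: gpcm_info(theta_hat, *pool[i]))

def eap_estimate(responses, items, prior_mean=0.0, prior_sd=1.0, n_points=81):
    """EAP estimate: posterior mean of theta on an evenly spaced grid.

    The normal prior and the [-4, 4] grid are assumptions of this sketch.
    """
    grid = [-4.0 + 8.0 * j / (n_points - 1) for j in range(n_points)]
    num = den = 0.0
    for t in grid:
        # unnormalized prior weight times the likelihood of the responses
        w = math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2)
        for x, (a, b) in zip(responses, items):
            w *= gpcm_probs(t, a, b)[x]
        num += t * w
        den += w
    return num / den

# Usage: one adaptive step with a tiny hypothetical pool.
pool = [(1.0, [-0.5]), (1.2, [0.0, 1.0]), (0.8, [1.5, 2.5])]
first = select_max_info(0.0, pool, set())
theta_hat = eap_estimate([1], [pool[first]])
second = select_max_info(theta_hat, pool, {first})
```

A dichotomous item is the special case with a single step difficulty, so the same functions cover a mixed-format pool. The sensitivity of `eap_estimate` to `prior_mean` and `prior_sd` is one way to see why the abstract cautions about the choice of prior.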

Bibliographic record

  • Author: Ho, Tsung-Han
  • Author affiliation: The University of Texas at Austin
  • Degree-granting institution: The University of Texas at Austin
  • Subject: Education, Tests and Measurements; Education, Educational Psychology; Education, Technology of
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 197 p.
  • Total pages: 197
  • Original format: PDF
  • Language of text: English
  • Date indexed: 2022-08-17 11:37:24
