
An empirical examination of the impact of item parameters on IRT information functions in mixed format tests.



Abstract

Item response theory (IRT), also referred to as "modern test theory", offers many advantages over CTT-based methods in test development. In particular, IRT information functions make it possible to build a test with the desired precision of measurement at any point on a defined proficiency scale, provided a sufficient number of test items is available. This feature is extremely useful when test scores are used for decision making, for instance, deciding whether an examinee has attained a certain mastery level. Computerized adaptive testing (CAT) is one of many applications of IRT information functions in test construction.

The purposes of this study were as follows: (1) to examine the consequences of improving test quality through the addition of more discriminating items of different item formats; (2) to examine the effect of a test whose difficulty does not align with the ability level of the intended population; (3) to investigate the resulting changes in decision consistency and decision accuracy; and (4) to understand changes in expected information when test quality is either improved or degraded, using both empirical and simulated data.

Main findings from the study were as follows:
(1) Increasing the discriminating power of any type of item generally raised the level of information; however, it could sometimes have an adverse effect at the extreme ends of the ability continuum.
(2) It was important to have more items targeted at the population of interest; no matter how good the items were, they were of little value in test development if they were not targeted at the distribution of candidate ability or at the cutscores.
(3) Decision consistency (DC), the Kappa statistic, and decision accuracy (DA) increased with better-quality items.
(4) DC and Kappa were negatively affected when the difficulty of the test did not match the ability of the intended population; the effect was less severe, however, when the test was easier than needed.
(5) Tests with more high-quality items lowered the false positive (FP) and false negative (FN) rates at the cutscores.
(6) When test difficulty did not match the ability of the target examinees, both FP and FN rates generally increased.
(7) Polytomous items tended to yield more information than dichotomously scored items, regardless of an item's discrimination parameter and difficulty.
(8) The more score categories an item had, the more information it could provide.

Findings from this thesis should help testing agencies and practitioners better understand the impact of item parameters on item and test information functions. This understanding is crucial for improving item bank quality and, ultimately, for building better tests that provide more accurate proficiency classifications. At the same time, item writers should bear in mind that the item information function is merely a statistical tool for building a good test; other criteria, such as content balancing and content validity, should also be considered.
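To make the role of the item parameters concrete, the following minimal sketch (not taken from the dissertation) computes item information under the standard two-parameter logistic (2PL) model and Samejima's graded response model (GRM). It illustrates findings (1), (7), and (8): raising the discrimination parameter increases information near the item's difficulty but can reduce it at the extremes, and a polytomous item with more score categories spreads more information across the ability scale. All parameter values below are arbitrary and chosen for illustration only.

```python
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information of a dichotomous 2PL item: I(theta) = a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + np.exp(-a * (np.asarray(theta, float) - b)))
    return a ** 2 * p * (1.0 - p)

def info_grm(theta, a, thresholds):
    """Fisher information of a polytomous graded response (Samejima) item.

    thresholds: ordered boundary difficulties b_1 < ... < b_m (m + 1 score categories).
    Uses I(theta) = sum_k P'_k(theta)^2 / P_k(theta), where category probabilities
    are differences of adjacent cumulative (boundary) response curves.
    """
    theta = np.asarray(theta, float)
    # Cumulative probabilities P*_k of scoring in category k or above,
    # padded with P*_0 = 1 and P*_{m+1} = 0.
    cum = [np.ones_like(theta)]
    cum += [1.0 / (1.0 + np.exp(-a * (theta - b))) for b in thresholds]
    cum += [np.zeros_like(theta)]
    info = np.zeros_like(theta)
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]                         # category probability
        dp_k = a * (cum[k] * (1 - cum[k])                 # derivative of p_k w.r.t. theta
                    - cum[k + 1] * (1 - cum[k + 1]))
        info += dp_k ** 2 / np.clip(p_k, 1e-12, None)
    return info

theta = np.linspace(-3.0, 3.0, 7)
# Doubling discrimination quadruples peak information near b, but leaves the
# extremes of the ability continuum with less information.
print(info_2pl(theta, a=0.8, b=0.0))
print(info_2pl(theta, a=1.6, b=0.0))
# A 4-category polytomous item with the same discrimination spreads more
# information across the theta range than a single dichotomous item.
print(info_grm(theta, a=1.6, thresholds=[-1.0, 0.0, 1.0]))
```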
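The classification indices in findings (3) through (6) can likewise be illustrated with a small simulation-style helper. This is again an illustrative sketch rather than the author's code, assuming a single cutscore, known true mastery status (available only in simulation), and pass/fail decisions from two parallel test forms.

```python
import numpy as np

def decision_stats(true_master, form1_pass, form2_pass):
    """Illustrative decision-quality indices at a single cutscore.

    true_master : boolean array, true mastery status (known only in simulation).
    form1_pass, form2_pass : boolean arrays of pass/fail decisions from two
    parallel administrations of the test.
    """
    t = np.asarray(true_master, bool)
    f1 = np.asarray(form1_pass, bool)
    f2 = np.asarray(form2_pass, bool)

    dc = np.mean(f1 == f2)                        # decision consistency (DC)
    # Cohen's kappa: agreement between the two forms corrected for chance.
    p_chance = f1.mean() * f2.mean() + (1 - f1.mean()) * (1 - f2.mean())
    kappa = (dc - p_chance) / (1 - p_chance)

    da = np.mean(f1 == t)                         # decision accuracy (DA)
    fp = np.mean(f1 & ~t)                         # passed despite non-mastery (FP)
    fn = np.mean(~f1 & t)                         # failed despite mastery (FN)
    return {"DC": dc, "kappa": kappa, "DA": da, "FP": fp, "FN": fn}

rng = np.random.default_rng(0)
theta = rng.normal(size=5000)                     # simulated true abilities
cut = 0.0
true_master = theta >= cut
# Noisier measurement (i.e., lower test information) degrades DC, kappa, and DA
# and raises FP/FN rates at the cutscore.
obs1 = theta + rng.normal(scale=0.5, size=theta.size)
obs2 = theta + rng.normal(scale=0.5, size=theta.size)
print(decision_stats(true_master, obs1 >= cut, obs2 >= cut))
```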

Record details

  • Author

    Lam, Wai Yan Wendy

  • Author affiliation

    University of Massachusetts Amherst

  • Degree-granting institution: University of Massachusetts Amherst
  • Subject: Education Tests and Measurements
  • Degree: Ed.D.
  • Year: 2012
  • Pages: 198 p.
  • Total pages: 198
  • Format: PDF
  • Language: eng
  • Classification:
  • Keywords:

