
Predicting differential item functioning in cross-lingual testing: The case of a high stakes test in the Kyrgyz Republic.



Abstract

Cross-lingual tests are assessment instruments created in one language and adapted for use with another language group. Practitioners and researchers use cross-lingual tests for various descriptive, analytical, and selection purposes, both in comparative studies across nations and within countries marked by linguistic diversity (Hambleton, 2005). Because of cultural, contextual, psychological, and linguistic differences between populations, adapting test items for use across groups is a challenging endeavor. The validity of inferences based on cross-lingual tests can only be assured if the content, meaning, and difficulty of test items are similar in the different language versions of the test (Ercikan, 2002).

Of paramount importance in the test adaptation process is the proven ability of test developers to adapt test items across groups in meaningful ways. One way investigators assess the level of item equivalence on a cross-lingual assessment is to analyze items for differential item functioning, or DIF. DIF is present when examinees from different language groups do not have the same probability of responding correctly to a given item after controlling for examinee ability (Camilli & Shepard, 1994). To detect and minimize DIF, test developers employ both statistical methods and substantive (judgmental) reviews of cross-lingual items. In the Kyrgyz Republic, item developers rely on substantive review of items by bilingual professionals. In situations where statistical DIF detection methods are not typically used, the accuracy of such professionals in discerning differences in content, meaning, and difficulty between items is especially important.

In this study, the accuracy of bilingual evaluators' predictions about whether differences between Kyrgyz- and Russian-language test items would lead to DIF was evaluated. The items came from a cross-lingual university scholarship test in the Kyrgyz Republic.
Evaluators' predictions were compared to a statistical test of "no difference" in response patterns by group using the logistic regression (LR) DIF detection method (Swaminathan & Rogers, 1990). A small number of test items were estimated to have "practical statistical DIF." There was a modest, positive correlation between evaluators' predictions and statistical DIF levels. However, with the exception of one item type, sentence completion, evaluators were unable to consistently predict which language group was favored by the differences. Plausible explanations for this finding, as well as ways to improve the accuracy of substantive review, are offered.

Data were also collected to determine the primary sources of DIF in order to inform the test development and adaptation process in the republic. Most of the causes of DIF were attributed to highly contextual (within-item) sources of difference related to overt adaptation problems. However, inherent language differences were also noted: syntax issues with the sentence completion items made adapting this item type from Russian into Kyrgyz problematic. Statistical and substantive data indicated that the reading comprehension items were less problematic to adapt than the analogy and sentence completion items. I analyze these findings, interpret their implications for key stakeholders, provide recommendations for improving the process of adapting items from Russian into Kyrgyz, and highlight cautions in interpreting the data collected in this study.
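The logistic regression DIF procedure cited above (Swaminathan & Rogers, 1990) can be sketched in code. The sketch below is an illustrative implementation on simulated data, not the dissertation's analysis: it fits a baseline model predicting item correctness from ability alone, then an augmented model adding group membership and an ability-by-group interaction, and flags the item if the likelihood-ratio statistic is large. All variable names and simulation parameters are assumptions; in operational practice the matching variable is usually the examinee's total test score rather than a known ability value.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Fit logistic regression by Newton-Raphson; return coefficients and log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                                  # observation weights
        H = X.T @ (X * W[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(H, X.T @ (y - p))          # Newton ascent step
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    ll = np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return beta, ll

def lr_dif_statistic(ability, group, correct):
    """Likelihood-ratio test combining uniform and non-uniform DIF (2 df)."""
    ones = np.ones(len(ability))
    X0 = np.column_stack([ones, ability])                           # ability only
    X1 = np.column_stack([ones, ability, group, ability * group])   # + group, interaction
    _, ll0 = fit_logistic(X0, correct)
    _, ll1 = fit_logistic(X1, correct)
    return 2.0 * (ll1 - ll0)   # compare to a chi-square with 2 df (5.99 at alpha = .05)

# Simulated item that is uniformly harder for group 1 at equal ability levels.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n).astype(float)
ability = rng.normal(0.0, 1.0, n)
logit = 1.2 * ability - 0.8 * group        # the -0.8 term is uniform DIF against group 1
correct = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

chi2 = lr_dif_statistic(ability, group, correct)
print("LR chi-square =", round(chi2, 1),
      "-> flag item for substantive review" if chi2 > 5.99 else "-> no DIF flagged")
```

A significant group coefficient with a non-significant interaction indicates uniform DIF (one group consistently favored), while a significant interaction indicates non-uniform DIF, where the favored group changes across ability levels.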

Bibliographic record

  • Author: Drummond, Todd W.
  • Affiliation: Michigan State University.
  • Degree-granting institution: Michigan State University.
  • Subjects: Education Tests and Measurements; Education Policy; Slavic Studies.
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 314 p.
  • Format: PDF
  • Language: English (eng)
