Critical Evaluation of the 'Matura' Test: CEFR Alignment Project for the Austrian National Examination in English (B2 Level)

Minzi Li; Yongqiang Zeng; Ligerui Chen

Abstract

This study set out to evaluate the usefulness of the ‘Matura’ listening test, part of a CEFR (Common European Framework of Reference) alignment project, with respect to item analysis, validity (i.e., content validity), and reliability (i.e., internal consistency reliability and scoring reliability). Ninety-three students, randomly selected from six secondary schools of different school levels across three regions in Austria, completed the listening test. SPSS was employed for the statistical analysis. Facility values, discrimination indices, and other descriptive statistics were reported for the item analysis. Content validity, as judged by a panel of experts, and the reliability of the listening test were further examined. Findings showed that: 1) the test paper was of average to high difficulty, with its peak score located around the mid-point, which allows higher education institutions to set an appropriate cut-score for decision making. The relatively wide spread of test scores further indicated the test's potential to discriminate efficiently among learners of varied listening proficiency; 2) the majority of items in the listening test performed well and were of reasonably good quality. However, several problematic items were identified, with possible causes including construct irrelevance, too many possible answers, mismatched signpost words, fast audio speed, testing of background knowledge alone, and the heavy cognitive load required by the “Not Given” option; 3) regarding validity, careful content analysis revealed construct under-representation and less authentic listening materials. Although the statistical output indicated relatively high reliability, the results should be interpreted with caution. Based on these findings, a general evaluation of the test and implications for further improvement are presented.
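
The item statistics named in the abstract can be illustrated with a minimal sketch. The study itself used SPSS; the Python below is only an illustration under stated assumptions: items are scored dichotomously (1 = correct, 0 = incorrect), the discrimination index is computed with the common upper/lower 27% group method, internal consistency is taken to be Cronbach's alpha, and the `responses` matrix is hypothetical toy data, not data from the study.

```python
from statistics import pvariance

# Hypothetical toy data: one row per test taker, one column per item (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def facility_value(responses, item):
    """Proportion of test takers who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)

def discrimination_index(responses, item, fraction=0.27):
    """Difference in proportion correct between top and bottom scorers.

    Assumes the upper/lower-group method; the 27% fraction is a convention,
    not something stated in the abstract.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / k

def cronbach_alpha(responses):
    """Internal consistency estimate from item and total-score variances."""
    n_items = len(responses[0])
    item_vars = [pvariance([r[i] for r in responses]) for i in range(n_items)]
    total_var = pvariance([sum(r) for r in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)

for i in range(len(responses[0])):
    fv = facility_value(responses, i)
    di = discrimination_index(responses, i)
    print(f"item {i}: facility={fv:.2f}, discrimination={di:.2f}")
print(f"Cronbach's alpha: {cronbach_alpha(responses):.3f}")
```

Under these assumptions, a facility value near 0.5 marks an item of middling difficulty, and a low or negative discrimination index flags an item that fails to separate stronger from weaker listeners, which is the kind of item the abstract describes as problematic.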
