
An Application of Generalizability Theory on Writing Assessment: Effects of Marking Components Weighting.


Abstract

In writing assessment, many factors influence marking stability and the reliability of the assessment, such as markers' attitudes towards marking and their consistency, the physical environment, the design of the items, and the marking rubrics. Even the methods used to train markers affect the reliability of the assessment. This research used Generalizability Theory to analyze the Chinese writing assessment of the Territory-wide System Assessment (TSA), with the aim of improving the reliability of the assessment.

TSA is a standardized test administered centrally by the Hong Kong Examinations and Assessment Authority every year. The target groups are students in Primary 3, Primary 6 and Secondary 3. TSA focuses on assessing students' basic competency in the three core subjects: Chinese Language, English Language and Mathematics. In contrast to traditional Chinese writing assessment, there was no requirement on the minimum number of words produced by the student. An analytical approach was adopted to assess students' writing tasks. As a result, students who did well on some particular marking criteria would end up with a good overall performance.

This study was a post-mortem analysis of the raw scores from a sample of 6,000 students who participated in TSA 2006. As there were three sub-papers, the sample consisted of 2,000 students from each sub-paper. Brennan's GENOVA program (1983) was used to calculate the reliabilities of the assessments.

In the assessment, each student was marked by two raters, assigned at random to the student from a pool of 200 raters. These raters had undergone a series of instructional programs and training before marking. Each of the two raters gave seven scores to the script. Because the writing assessment required no minimum number of words, it was commonly believed that a student with insufficient content (as evidenced by a low score in "content") and poor organization (a low score in "organization") would have written so few words that the chance of making mistakes in "vocabulary" (the 6th score) and in "punctuation" (the 7th score) would be relatively small. To rectify this deficiency in marking, the study used three different methods to apply weights to the "vocabulary" score and the "punctuation" score. For each method, the GENOVA program was used to calculate the reliability of the assessments. On comparison, each of the methods was found to raise the reliabilities of the assessments under investigation, and the most recommended method was to use students' scores in "content" and "structure" as weights.

On the one hand, the study has examined the present mode of marking of the writing assessment in the TSA. This gives an opportunity to improve the item-setting and script-marking procedures of the assessment, with a view to raising its reliability and giving valuable feedback to teaching and learning. On the other hand, the favourable results of applying weights to sub-scores provide a good example of how marking rubrics can be improved in large-scale standardized tests of writing in Chinese Language.
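The reliability calculation described above can be illustrated with a minimal sketch. The abstract does not state the exact G-study design or the GENOVA control settings, so the code below assumes the simplest reading of the rater assignment: raters nested within students, one score per rater, and variance components estimated from a one-way ANOVA. The function names and the demo data are hypothetical and are not taken from the thesis.

```python
import numpy as np


def g_study_nested(scores):
    """Estimate variance components for an assumed raters-within-persons design.

    scores: array of shape (n_persons, n_raters); each row holds the scores
    one student received from that student's own raters.
    Returns (var_person, var_within), where var_within is the rater effect
    confounded with residual error.
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_r = scores.shape
    person_means = scores.mean(axis=1)
    grand_mean = scores.mean()

    # Mean squares from a one-way ANOVA with persons as the grouping factor.
    ms_person = n_r * np.sum((person_means - grand_mean) ** 2) / (n_p - 1)
    ms_within = np.sum((scores - person_means[:, None]) ** 2) / (n_p * (n_r - 1))

    # Expected mean squares: E[MS_person] = n_r * var_person + var_within.
    var_within = ms_within
    var_person = max((ms_person - ms_within) / n_r, 0.0)  # clip negative estimates
    return var_person, var_within


def g_coefficient(var_person, var_within, n_raters):
    """Generalizability coefficient for the mean over n_raters raters."""
    total = var_person + var_within / n_raters
    return var_person / total if total > 0 else 0.0


# Hypothetical demo data: three students, two raters each.
demo = [[1, 3], [3, 5], [5, 7]]
var_p, var_w = g_study_nested(demo)
print(g_coefficient(var_p, var_w, n_raters=2))
```

Under these assumptions, a weighting scheme of the kind the study compares would be evaluated by recomputing the weighted sub-scores and running the same estimate on them; the three weighting methods themselves are not specified in the abstract and are not reproduced here.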

Bibliographic record

  • Author: Lam, Ling Chi Tenny
  • Author affiliation: The Chinese University of Hong Kong (Hong Kong)
  • Degree-granting institution: The Chinese University of Hong Kong (Hong Kong)
  • Subjects: Education, Language and Literature; Education, Tests and Measurements; Education, Educational Psychology; Psychology, Psychometrics
  • Degree: Ed.D.
  • Year: 2010
  • Pages: 126
  • Original format: PDF
  • Language of text: English
