
Effects of scoring method and rater experience on ESL essay rating processes and outcomes.



Abstract

This study examined the effects of scoring method and rater experience on ESL essay rater performance. Each of 31 novice and 29 experienced raters rated 24 essays using a holistic and a multiple-trait scale. Interviews and think-aloud protocols provided data about the participants' decision-making behaviors and the aspects of writing they attended to. Essay scores were analyzed to estimate rater severity and self-consistency and the relationships between the multiple-trait and holistic scores.

Novices exhibited greater intra- and inter-rater variability and tended to refer more frequently to the rating scale, to focus on local aspects of writing more often, and to spend more time interpreting and/or editing text than the experienced raters did. Experienced raters tended to refer more frequently to other criteria than those in the rubrics, to report more judgment strategies and rhetorical and ideational focus, to spend more time reading and assessing the essays overall, and to be more efficient, confident, self-consistent, and homogeneous in their ratings than were the novices.

Scoring methods seem to have a greater effect on the severity of experienced raters and the self-consistency of novices. In addition, multiple-trait scoring focused the novices' attention on the criteria in the scale and led them to organize these criteria coherently and to employ more judgment strategies, thus making the rating task manageable and improving their self-consistency. The effects of scoring methods on experienced raters' performance were less pronounced.

Overall, these findings suggest that multiple-trait scoring is most appropriate for assessing L2 writing. However, the two scoring methods might be useful for different assessment purposes, contexts, raters, and examinee populations. The thesis also has implications for test validation research.

The findings indicated that both scoring methods measured the same construct, but the multiple-trait method allowed finer distinctions among examinees in terms of writing ability. Holistic scoring resulted in higher inter-rater reliability, while multiple-trait scoring led to higher rater self-consistency, particularly for novices. Multiple-trait scoring prompted more judgment and self-monitoring strategies, while holistic scoring elicited more interpretation strategies and language focus. Furthermore, multiple-trait scoring reduced the complexity of the rating task and prompted raters to attend to all rating criteria in the scale.

Bibliographic record

  • Author

    Barkaoui, Khaled

  • Author affiliation

    University of Toronto (Canada)

  • Degree-granting institution University of Toronto (Canada)
  • Subjects Education Language and Literature; Education Bilingual and Multicultural; Education Tests and Measurements
  • Degree Ph.D.
  • Year 2008
  • Pages 319 p.
  • Total pages 319
  • Original format PDF
  • Language of text eng
