ERRANT: Assessing and Improving Grammatical Error Type Classification



Abstract

Grammatical Error Correction (GEC) is the task of correcting different types of errors in written texts. To manage this task, large amounts of annotated data that contain erroneous sentences are required. This data, however, is usually annotated according to each annotator's standards, making it difficult to manage multiple sets of data at the same time. The recently introduced Error Annotation Toolkit (ERRANT) tackled this problem by presenting a way to automatically annotate data that contain grammatical errors, while also providing a standardisation for annotation. ERRANT extracts the errors and classifies them into error types, in the form of an edit that can be used in the creation of GEC systems, as well as for grammatical error analysis. However, we observe that certain errors are falsely or ambiguously classified. This could obstruct any qualitative or quantitative grammatical error type analysis, as the results would be inaccurate. In this work, we use a sample of the FCE corpus (Yannakoudakis et al., 2011) for secondary error type annotation and we show that up to 39% of the annotations of the most frequent type should be re-classified. Our corrections will be publicly released, so that they can serve as the starting point of a broader, collaborative, ongoing correction process.
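As a rough illustration of the kind of measurement the abstract describes, the sketch below reads edits in the M2 annotation format, finds the most frequent automatically assigned error type, and computes the share of those edits that a secondary annotation labels differently. It does not call ERRANT itself. The sample sentences, the automatic labels, the secondary labels, and the helper names `parse_m2_edits` and `reclassification_rate` are all invented for illustration and are not taken from the paper or its released corrections.

```python
from collections import Counter

# Toy M2-formatted annotations. The sentences and the automatic labels are
# invented for this sketch; they are NOT actual ERRANT output on the FCE corpus.
AUTO_M2 = """\
S I am agree with you .
A 1 3|||R:OTHER|||agree|||REQUIRED|||-NONE-|||0

S He go to school by feet .
A 1 2|||R:VERB:SVA|||goes|||REQUIRED|||-NONE-|||0
A 4 6|||R:OTHER|||on foot|||REQUIRED|||-NONE-|||0

S She gived me a advice .
A 1 2|||R:OTHER|||gave|||REQUIRED|||-NONE-|||0
"""

# Hypothetical secondary (human) labels, one per edit, in the same order.
SECONDARY_TYPES = ["R:VERB:FORM", "R:VERB:SVA", "R:OTHER", "R:VERB:INFL"]


def parse_m2_edits(m2_text):
    """Return (start, end, error_type, correction) for every edit line."""
    edits = []
    for line in m2_text.splitlines():
        if not line.startswith("A "):
            continue  # skip sentence lines ("S ...") and blank separators
        span, error_type, correction = line[2:].split("|||")[:3]
        if error_type == "noop":
            continue  # "noop" marks a sentence that has no edits
        start, end = map(int, span.split())
        edits.append((start, end, error_type, correction))
    return edits


def reclassification_rate(automatic, secondary, error_type):
    """Share of edits auto-labelled `error_type` whose secondary label differs."""
    pairs = [(a, s) for a, s in zip(automatic, secondary) if a == error_type]
    if not pairs:
        return 0.0
    return sum(1 for a, s in pairs if a != s) / len(pairs)


edits = parse_m2_edits(AUTO_M2)
auto_types = [error_type for _, _, error_type, _ in edits]
most_frequent, count = Counter(auto_types).most_common(1)[0]
rate = reclassification_rate(auto_types, SECONDARY_TYPES, most_frequent)
print(f"{most_frequent}: {count} edits, {rate:.0%} re-classified")
```

On real data, the secondary labels would come from the manual annotation pass rather than a hard-coded list, and the edits would need to be aligned by sentence and token span instead of by position in a list.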


Bibliographic details

Source
Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature | 2020 | pp. 85-89 | 5 pages
Authors
Katerina Korre; John Pavlopoulos
Original format: PDF

