Annual Meeting of the Association for Computational Linguistics (ACL 2011)

Creating a manually error-tagged and shallow-parsed learner corpus

Abstract

The availability of learner corpora, especially those that have been manually error-tagged or shallow-parsed, is still limited. As a result, researchers lack a common development and test set for natural language processing of learner English, for tasks such as grammatical error detection. Given this background, we created a novel learner corpus that was manually error-tagged and shallow-parsed. This corpus is available on the web for research and educational purposes. In this paper, we describe it in detail together with its data-collection method and annotation schemes. Another contribution of this paper is that we take the first step toward evaluating the performance of existing POS-tagging/chunking techniques on learner corpora using the created corpus. These contributions will facilitate further research in related areas such as grammatical error detection and automated essay scoring.
