Conference: Annual Meeting of the Association for Computational Linguistics

Grammatical Error Correction Using Pseudo Learner Corpus Considering Error Tendency of Learners

Abstract

Recently, several studies have focused on improving the performance of grammatical error correction (GEC) tasks using pseudo data. However, a large amount of pseudo data is required to train an accurate GEC model. To address the limitations of language and computational resources, we assume that introducing pseudo errors into sentences similar to those written by language learners is more efficient than incorporating random pseudo errors into monolingual data. In this regard, we study the effect of pseudo data on GEC task performance using two approaches. First, we extract sentences that are similar to the learners' sentences from monolingual data. Second, we generate realistic pseudo errors by considering the error types that learners often make. Based on our comparative results, we observe that F0.5 scores for the Russian GEC task are significantly improved.
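
The two approaches named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not say which similarity measure or error model the authors use, so this sketch substitutes token-set Jaccard overlap for sentence selection and a weighted choice over three toy error types (delete, swap, replace) for error generation. The function names, the error-type weights, and the confusion set are all hypothetical.

```python
import random


def jaccard(a, b):
    """Token-set overlap between two tokenized sentences (assumed similarity measure)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_similar(monolingual, learner, k):
    """Keep the k monolingual sentences that are closest to any learner sentence."""
    scored = [
        (max(jaccard(m, l) for l in learner), i)
        for i, m in enumerate(monolingual)
    ]
    # Highest similarity first; ties broken by original order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [monolingual[i] for _, i in scored[:k]]


def inject_error(tokens, error_weights, confusion, rng):
    """Apply one pseudo error; its type is sampled in proportion to error_weights.

    error_weights: hypothetical mapping from error type to relative frequency,
                   standing in for an error distribution observed in learner text.
    confusion:     hypothetical mapping from a token to plausible wrong substitutes.
    """
    tokens = list(tokens)
    if not tokens:
        return tokens
    types = list(error_weights)
    etype = rng.choices(types, weights=[error_weights[t] for t in types], k=1)[0]
    if etype == "delete" and len(tokens) > 1:
        del tokens[rng.randrange(len(tokens))]
    elif etype == "swap" and len(tokens) > 1:
        i = rng.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    elif etype == "replace":
        candidates = [i for i, t in enumerate(tokens) if t in confusion]
        if candidates:
            i = rng.choice(candidates)
            tokens[i] = rng.choice(confusion[tokens[i]])
    return tokens
```

In use, the selected sentences serve as the correct side of each training pair and the output of `inject_error` as the erroneous side. Passing a seeded `random.Random` instance as `rng` keeps the generated pseudo corpus reproducible.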