Conference: International conference on computational linguistics

Predicting proficiency levels in learner writings by transferring a linguistic complexity model from expert-written coursebooks

Abstract

The lack of a sufficient amount of data tailored for a task is a well-recognized problem for many statistical NLP methods. In this paper, we explore whether data sparsity can be successfully tackled when classifying language proficiency levels in the domain of learner-written output texts. We aim at overcoming data sparsity by incorporating knowledge in the trained model from another domain consisting of input texts written by teaching professionals for learners. We compare different domain adaptation techniques and find that a weighted combination of the two types of data performs best, which can even rival systems based on considerably larger amounts of in-domain data. Moreover, we show that normalizing errors in learners' texts can substantially improve classification when in-domain data with annotated proficiency levels is not available.
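
The "weighted combination of the two types of data" mentioned above can be illustrated with a small instance-weighting sketch. The abstract does not state the features, the classifier, or the weighting scheme, so everything below is an assumption made for illustration: a nearest-centroid classifier, two made-up features (mean sentence length, mean word length), the function names `weighted_centroids` and `predict`, and a coursebook weight of 0.25. It shows the general idea of down-weighting out-of-domain examples and is not the authors' method.

```python
from collections import defaultdict
from math import dist


def weighted_centroids(examples, domain_weights):
    """Compute one weighted mean feature vector per proficiency level.

    examples: iterable of (features, level, domain) triples.
    domain_weights: positive weight for each domain name.
    """
    sums = {}
    totals = defaultdict(float)
    for features, level, domain in examples:
        w = domain_weights[domain]
        if level not in sums:
            sums[level] = [0.0] * len(features)
        for i, x in enumerate(features):
            sums[level][i] += w * x
        totals[level] += w
    return {
        level: [s / totals[level] for s in vec]
        for level, vec in sums.items()
    }


def predict(centroids, features):
    """Return the level whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda level: dist(centroids[level], features))


# Toy, invented data: (mean sentence length, mean word length).
coursebook = [
    ((8.0, 4.0), "A1", "coursebook"),
    ((18.0, 5.5), "B1", "coursebook"),
]
learner = [
    ((6.0, 3.8), "A1", "learner"),
    ((12.0, 4.6), "B1", "learner"),
]

# Hypothetical weights: expert-written texts count a quarter as much
# as in-domain learner texts.
weights = {"coursebook": 0.25, "learner": 1.0}
centroids = weighted_centroids(coursebook + learner, weights)
print(predict(centroids, (7.0, 4.0)))
```

In a setting like this, the domain weight would normally be tuned on held-out learner texts; a weight of 1.0 for both domains reduces to plain concatenation of the two data sets.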