Published in: Conference on Machine Translation; Annual Meeting of the Association for Computational Linguistics

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions


Abstract

Following the WMT 2018 Shared Task on Parallel Corpus Filtering (Koehn et al., 2018), we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low resource condition of Nepali-English and Sinhala-English. Eleven participants from companies, national research labs, and universities participated in this task.
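
The sub-selection step described in the abstract can be illustrated with a minimal sketch. It assumes each sentence pair already carries a quality score (higher is better) and that the 2% and 10% subsets are measured in number of sentence pairs; the abstract does not say how subset size is measured, so that choice is an assumption. The function name `select_top_fraction` and the data layout are hypothetical, not part of the shared task's tooling.

```python
from typing import List, Tuple

# Hypothetical layout: (quality_score, (source_sentence, target_sentence))
ScoredPair = Tuple[float, Tuple[str, str]]


def select_top_fraction(scored_pairs: List[ScoredPair],
                        fraction: float) -> List[Tuple[str, str]]:
    """Return the highest-scoring `fraction` of sentence pairs.

    Assumption: subset size is counted in sentence pairs, which the
    abstract does not specify.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Number of pairs to keep, rounded to the nearest whole pair.
    keep = round(len(scored_pairs) * fraction)
    # Sort by score, best first; ties keep their original order.
    ranked = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    return [pair for _, pair in ranked[:keep]]


# Toy data: 50 pairs with scores 0.00, 0.01, ..., 0.49.
toy_corpus = [(i / 100, (f"src {i}", f"tgt {i}")) for i in range(50)]
top_2_percent = select_top_fraction(toy_corpus, 0.02)
top_10_percent = select_top_fraction(toy_corpus, 0.10)
```

In this sketch the scoring itself is left out entirely; producing the per-pair scores is the challenge that the task posed to participants.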
