Conference: Second Conference on Machine Translation

CHRF++: words helping character n-grams



Abstract

Character n-gram F-score (CHRF) has been shown to correlate very well with human relative rankings of different machine translation outputs, especially for morphologically rich target languages. However, its relation to direct human assessments is not yet clear. In this work, Pearson's correlation coefficients with direct assessments are investigated for the two currently available target languages, English and Russian. First, different β parameters (in the range from 1 to 3) are re-investigated with direct assessments, and it is confirmed that β = 2 is the optimal option. Then character and word n-grams are investigated separately, and the main finding is that, apart from character n-grams, word 1-grams and 2-grams also correlate rather well with direct assessments. Further experiments show that adding word unigrams and bigrams to the standard CHRF score improves the correlations with direct assessments, though it is still not clear which option is better: unigrams only (CHRF+) or unigrams and bigrams (CHRF++). This should be investigated in future work on more target languages.
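
To make the scoring scheme in the abstract concrete, the following is a minimal sketch of a sentence-level CHRF-style score, not the authors' reference implementation. It rests on several assumptions that the abstract does not state: character n-grams up to order 6, spaces removed before extracting character n-grams, precision and recall averaged arithmetically over all n-gram orders, and the standard F-score form F_β = (1 + β²)·P·R / (β²·P + R). The names `ngram_counts` and `chrf_sketch` are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Count n-grams of order n in a sequence (a string or a list of words)."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def chrf_sketch(hypothesis, reference, char_order=6, word_order=2, beta=2.0):
    """Illustrative CHRF-style score in [0, 1].

    word_order=0 gives a CHRF-like score, 1 a CHRF+-like score and
    2 a CHRF++-like score. char_order=6 is an assumption of this sketch.
    """
    # Assumption: character n-grams are taken from the text without spaces.
    streams = (
        (hypothesis.replace(" ", ""), reference.replace(" ", ""), char_order),
        (hypothesis.split(), reference.split(), word_order),
    )

    precisions, recalls = [], []
    for hyp_seq, ref_seq, max_n in streams:
        for n in range(1, max_n + 1):
            hyp = ngram_counts(hyp_seq, n)
            ref = ngram_counts(ref_seq, n)
            if not hyp or not ref:
                # Simplification: skip orders longer than either text.
                continue
            # Clipped overlap: each n-gram counts at most as often as it
            # occurs in the other text.
            overlap = sum((hyp & ref).values())
            precisions.append(overlap / sum(hyp.values()))
            recalls.append(overlap / sum(ref.values()))

    if not precisions:
        return 0.0
    # Assumption: arithmetic mean over all character and word orders.
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)
```

With β = 2, recall is weighted more heavily than precision, so a hypothesis that is correct but incomplete is penalised more than under β = 1. Calling `chrf_sketch(hyp, ref, word_order=1)` and `chrf_sketch(hyp, ref, word_order=2)` gives the unigram-only and unigram-plus-bigram variants that the abstract compares.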
