Journal: Entropy
Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models

Abstract

Reordering is one of the most important factors affecting output quality in statistical machine translation (SMT). A considerable number of the approaches proposed to address the reordering problem are discriminative reordering models (DRMs). The core component of a DRM is a classifier that tries to predict the correct word order of the sentence. Unfortunately, the relationship between classification quality and ultimate SMT performance has not been investigated to date. Understanding this relationship will allow researchers to select the classifier that results in the best possible MT quality. It might be assumed that the relationship is monotonic, i.e., that any improvement in classification performance will be reflected in overall SMT quality. In this paper, we show experimentally that this assumption does not always hold: an improvement in classification performance might actually degrade the quality of an SMT system as measured by automatic MT evaluation metrics. However, we show that if the improvement in classification performance is large enough, the SMT quality can be expected to improve as well. In addition, we show that there is a negative relationship between classification accuracy and SMT performance in imbalanced parallel corpora. For these types of corpora, we provide evidence that macro-averaged metrics such as the macro-averaged F-measure are better suited to evaluating the classifier than accuracy, the metric commonly used to date.
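
The contrast between accuracy and the macro-averaged F-measure on imbalanced data can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two orientation labels ("monotone" and "swap"), the 90/10 class split, and both sets of predictions are invented toy data, not the paper's corpora, classifiers, or results. It only shows how the two metrics can rank the same pair of classifiers in opposite order; it says nothing about which classifier would yield better SMT output.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes present in gold."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced label set: 90 "monotone", 10 "swap".
gold = ["monotone"] * 90 + ["swap"] * 10

# Classifier A (hypothetical): always predicts the majority class.
pred_a = ["monotone"] * 100

# Classifier B (hypothetical): 80 of 90 monotone correct, 8 of 10 swap correct.
pred_b = ["monotone"] * 80 + ["swap"] * 10 + ["swap"] * 8 + ["monotone"] * 2

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, round(accuracy(gold, pred), 3), round(macro_f1(gold, pred), 3))
```

In this toy setting the majority-class classifier A has the higher accuracy, while classifier B, which actually detects the minority class, has the higher macro-averaged F-measure. Because the macro average weights every class equally, it is sensitive to minority-class errors that accuracy largely hides.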