Journal of Computer and Communications
Intrinsic and Extrinsic Automatic Evaluation Strategies for Paraphrase Generation Systems


Abstract

A paraphrase expresses a text with alternative words and word order to achieve better clarity. Paraphrases have been found vital for augmenting training datasets, which helps enhance the performance of machine learning models intended for various natural language processing (NLP) tasks. Thus, automatic paraphrase generation has recently received increasing attention. However, evaluating the quality of generated paraphrases is technically challenging. In the literature, the value of generated paraphrases tends to be determined by their impact on the performance of other NLP tasks. This kind of evaluation is referred to as extrinsic evaluation, and it requires high computational resources to train and test the models. So far, very little attention has been paid to the role of intrinsic evaluation, in which the quality of a generated paraphrase is judged against a predefined ground truth (reference paraphrases). In fact, it is also very challenging to find ideal and complete reference paraphrases. Therefore, in this study, we propose a semantic (meaning-oriented) automatic evaluation metric that evaluates the quality of generated paraphrases against the original text, which is an intrinsic evaluation approach. Further, we evaluate the quality of the paraphrases by assessing their impact on other NLP tasks, which is an extrinsic evaluation method. The goal is to explore the relationship between intrinsic and extrinsic evaluation methods. To ensure the effectiveness of the proposed evaluation methods, extensive experiments are done on different publicly available datasets. The experimental results demonstrate that our proposed intrinsic and extrinsic evaluation strategies are promising. The results further reveal that there is a significant correlation between the intrinsic and extrinsic evaluation approaches.
