
Continuous Model Improvement for Language Understanding with Machine Translation



Abstract

Scaling conversational personal assistants to a multitude of languages puts high demands on collecting and labeling data, a setting in which cross-lingual learning techniques can help to reconcile the need for well-performing natural language understanding (NLU) with the desideratum to support many languages without incurring unacceptable cost. In this paper, we show that automatically annotating unlabeled utterances using machine translation in an offline fashion and adding them to the training data can improve performance for existing NLU features in low-resource languages, where a straightforward translate-test approach, as considered in existing literature, would fail the latency requirements of a live environment. We demonstrate the effectiveness of our method with intrinsic and extrinsic evaluation using a real-world commercial dialog system in German. We show that 56% of the resulting automatically labeled utterances had a perfect match with ground-truth labels. Moreover, we see significant performance improvements in an extrinsic evaluation setting when manually labeled data is available in small quantities.
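The abstract leaves the exact pipeline unspecified, but a minimal sketch of the offline auto-annotation idea, under the assumption that labels are predicted on the high-resource (English) side and projected back onto the original utterance, might look like the following Python. The names translate, english_nlu and project_slots are hypothetical placeholders, not components published by the authors.

# Rough sketch of offline MT-based auto-annotation for low-resource NLU.
# translate, english_nlu and project_slots are hypothetical placeholders;
# the paper does not publish its exact pipeline.

from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LabeledUtterance:
    text: str              # original (e.g. German) utterance
    intent: str            # intent label taken from the translated side
    slots: Dict[str, str]  # slot labels projected back onto the original text


def auto_annotate(
    unlabeled_de: List[str],
    translate: Callable[[str], str],                            # de -> en MT, run offline
    english_nlu: Callable[[str], Tuple[str, Dict[str, str]]],   # existing high-resource NLU model
    project_slots: Callable[[str, str, Dict[str, str]], Dict[str, str]],  # label projection
) -> List[LabeledUtterance]:
    """Label unlabeled German utterances offline, producing silver training data."""
    silver = []
    for utterance in unlabeled_de:
        en_text = translate(utterance)            # offline, so MT latency is not an issue
        intent, en_slots = english_nlu(en_text)   # predict intent/slots on the English side
        de_slots = project_slots(utterance, en_text, en_slots)
        silver.append(LabeledUtterance(utterance, intent, de_slots))
    return silver

Under this reading, the silver-labeled utterances are mixed with the small manually labeled German set and the German NLU model is retrained, so that, unlike translate-test, no translation call sits on the live request path.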
