Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Pretrain-Finetune Based Training of Task-Oriented Dialogue Systems in a Real-World Setting



Abstract

One main challenge in building task-oriented dialogue systems is the limited amount of supervised training data available. In this work, we present a method for training retrieval-based dialogue systems using a small amount of high-quality, annotated data and a larger, unlabeled dataset. We show that pretraining on the unlabeled data brings better model performance, with a 31% boost in Recall@1 compared with no pretraining. The proposed finetuning technique, based on the small amount of high-quality, annotated data, yields a further 26% offline and 33% online improvement in Recall@1 over the pretrained model. The model is deployed in an agent-support application and evaluated on live customer service contacts, providing insights into real-world performance that most other publications in the domain, which often use asynchronous transcripts (e.g. Reddit data), do not offer. The high performance of 74% Recall@1 on the customer service example demonstrates the effectiveness of this pretrain-finetune approach in dealing with the limited supervised data challenge.
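
The headline metric throughout, Recall@1, is the fraction of test contexts for which the model's top-ranked candidate response is the annotated correct one. Below is a minimal sketch of how such a metric is typically computed for a retrieval-based system; the scoring function, data layout, and candidate pools are illustrative assumptions, not the authors' released implementation.

    # Minimal sketch of Recall@k evaluation for a retrieval-based dialogue
    # system. The scoring function and data format are assumed for
    # illustration; the paper does not specify its implementation.
    from typing import Callable, List, Sequence

    def recall_at_k(
        score_fn: Callable[[str, str], float],  # scores a (context, candidate) pair
        contexts: Sequence[str],                # dialogue contexts to evaluate
        candidate_pools: Sequence[List[str]],   # candidate responses per context
        gold_indices: Sequence[int],            # index of the correct response in each pool
        k: int = 1,
    ) -> float:
        """Fraction of contexts whose gold response ranks in the top k."""
        hits = 0
        for context, pool, gold in zip(contexts, candidate_pools, gold_indices):
            scores = [score_fn(context, cand) for cand in pool]
            # Rank candidates by descending score and check the gold position.
            top_k = sorted(range(len(pool)), key=lambda i: -scores[i])[:k]
            hits += int(gold in top_k)
        return hits / len(contexts)

Under this reading, the reported 74% Recall@1 means the system's top-ranked response suggestion matched the annotated gold response for roughly three out of four live customer service contacts.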
