International Conference on Computational Linguistics

Federated Learning for Spoken Language Understanding

Abstract

Recently, spoken language understanding (SLU) has attracted extensive research interest, and various SLU datasets have been proposed to promote its development. However, most existing methods focus on a single dataset; efforts to improve model robustness and obtain better performance by combining the merits of multiple datasets remain understudied. In this paper, we argue that if these SLU datasets are considered together, different knowledge from different datasets can be learned jointly, with a high chance of improving performance on each dataset. At the same time, we further attempt to prevent data leakage when unifying multiple datasets, which, arguably, is more useful in an industry setting. To this end, we propose a federated learning framework that can unify various types of datasets and tasks to learn and fuse various types of knowledge, i.e., text representations, from different datasets and tasks, without sharing downstream task data. The fused text representations merge useful features from different SLU datasets and tasks and are thus much more powerful on individual tasks than the original text representations alone. Finally, to provide multi-granularity text representations for our framework, we propose a novel Multi-view Encoder (MV-Encoder) as the backbone of our federated learning framework. Experiments on two SLU benchmark datasets, covering two tasks (intent detection and slot filling) and three federated learning settings (horizontal federated learning, vertical federated learning, and federated transfer learning), demonstrate the effectiveness and universality of our approach. Specifically, we obtain a 1.53% improvement in intent detection accuracy and boost the performance of a strong baseline by up to 5.29% on the slot filling F1 metric.
Furthermore, by leveraging BERT as an additional encoder, we establish new state-of-the-art results on the SNIPS and ATIS datasets, achieving 99.33% and 98.28% accuracy on the intent detection task and 97.20% and 96.41% F1 score on the slot filling task, respectively.
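As a rough illustration of the federated setting described above, the sketch below shows how a central server might combine encoder updates from clients that each hold a private SLU dataset (e.g. one ATIS-style, one SNIPS-style) without sharing the raw task data. This is a minimal FedAvg-style weighted average under illustrative assumptions; the paper's actual framework fuses multi-granularity text representations from the MV-Encoder rather than simply averaging parameters.

```python
def federated_average(client_weights, client_sizes):
    """FedAvg-style weighted average of per-client parameters.

    client_weights: list of dicts mapping parameter name -> float (or array)
    client_sizes:   number of local training examples held by each client
    """
    total = sum(client_sizes)
    return {
        name: sum(w[name] * n / total for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }

# Two hypothetical clients, each training locally on its own SLU dataset;
# only parameter updates (not data) are sent to the server.
clients = [{"w": 1.0}, {"w": 3.0}]
sizes = [100, 300]  # local dataset sizes

global_w = federated_average(clients, sizes)
# Weighted mean: 1.0 * 100/400 + 3.0 * 300/400 = 2.5
```

Larger clients pull the global model toward their local optimum, which is why the average is weighted by dataset size rather than taken uniformly.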
