How to Tame Your Data: Data Augmentation for Dialog State Tracking

Abstract

Dialog State Tracking (DST) is a problem space in which the effective vocabulary is practically limitless. For example, the domain of possible movie titles or restaurant names is bound only by the limits of language. As such, DST systems often encounter out-of-vocabulary words at inference time that were never encountered during training. To combat this issue, we present a targeted data augmentation process, by which a practitioner observes the types of errors made on held-out evaluation data, and then modifies the training data with additional corpora to increase the vocabulary size at training time. Using this with a RoBERTa-based Transformer architecture, we achieve state-of-the-art results in comparison to systems that only mask trouble slots with special tokens. Additionally, we present a data-representation scheme for seamlessly retargeting DST architectures to new domains.
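
The augmentation loop described in the abstract can be illustrated with a minimal sketch. It assumes one simple form of augmentation, slot-value substitution: for slots that showed errors on held-out data, training utterances are copied with the slot value swapped for a value drawn from an additional corpus. The abstract does not specify the mechanism, so this is not presented as the authors' implementation. The `Example` format, the `augment` function, the slot names, and the restaurant names are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    """One training utterance with slot annotations (hypothetical format)."""
    text: str
    slots: dict  # slot name -> value string that appears verbatim in text


def augment(examples, trouble_slots, extra_values, copies=2, seed=0):
    """Return the original examples plus copies with trouble-slot values swapped.

    trouble_slots: slot names that showed errors on held-out data (assumed given).
    extra_values:  slot name -> replacement values taken from an extra corpus.
    """
    rng = random.Random(seed)
    augmented = list(examples)
    for ex in examples:
        # Only substitute slots that are flagged, have replacements available,
        # and whose current value can actually be located in the utterance.
        targets = [
            name for name, value in ex.slots.items()
            if name in trouble_slots and extra_values.get(name) and value in ex.text
        ]
        if not targets:
            continue
        for _ in range(copies):
            text, slots = ex.text, dict(ex.slots)
            for name in targets:
                new_value = rng.choice(extra_values[name])
                # Keep the utterance and its label consistent with each other.
                text = text.replace(slots[name], new_value, 1)
                slots[name] = new_value
            augmented.append(Example(text, slots))
    return augmented


if __name__ == "__main__":
    train = [Example("book a table at Nandos for two",
                     {"restaurant-name": "Nandos", "people": "two"})]
    extra = {"restaurant-name": ["The Golden Wok", "Chez Panisse"]}
    for ex in augment(train, {"restaurant-name"}, extra, copies=3):
        print(ex.text, ex.slots)
```

In this sketch the practitioner would fill `trouble_slots` by inspecting errors on the held-out set and fill `extra_values` from whatever external corpus covers those slots (for example, a list of restaurant names). Each augmented copy exposes the model to vocabulary it would otherwise first meet at inference time.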

Bibliographic record

Source: Workshop on NLP for Conversational AI, 2020, pp. 32-37 (6 pages)
Authors: Adam Summerville; Jordan Hashemi; James Ryan; William Ferguson
Original format: PDF
