Data Augmentation for Transformer-based G2P

Abstract

The Transformer model has been shown to outperform other neural seq2seq models in several character-level tasks. It is unclear, however, if the Transformer would benefit as much as other seq2seq models from data augmentation strategies in the low-resource setting. In this paper we explore methods for data augmentation in the g2p task together with the Transformer model. Our results show that a relatively simple alignment-based approach of identifying consistent input-output subsequences in grapheme-phoneme data combined with a subsequent splicing together of such pieces to generate hallucinated data works well in the low-resource setting, often delivering substantial performance improvement over a standard Transformer model.
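The splicing idea described above can be illustrated with a minimal sketch. The chunk inventory, the function names (hallucinate_pair, hallucinate_corpus), and the random sampling scheme below are illustrative assumptions, not the paper's exact procedure; the sketch only shows how aligned grapheme-phoneme pieces harvested from real dictionary entries might be recombined into synthetic training pairs for the low-resource setting.

```python
import random

# Illustrative sketch (not the paper's exact method): splice aligned
# grapheme->phoneme chunks into "hallucinated" training pairs.
# Each chunk is a (grapheme substring, phoneme subsequence) pair that an
# aligner judged to be a consistent correspondence in the real data.
CHUNKS = [
    ("sh", ["SH"]),
    ("ee", ["IY"]),
    ("p", ["P"]),
    ("ing", ["IH", "NG"]),
    ("tion", ["SH", "AH", "N"]),
]

def hallucinate_pair(min_chunks=2, max_chunks=4, rng=random):
    """Splice randomly chosen aligned chunks into one synthetic word/pronunciation pair."""
    n = rng.randint(min_chunks, max_chunks)
    graphemes, phonemes = [], []
    for _ in range(n):
        g, p = rng.choice(CHUNKS)
        graphemes.append(g)
        phonemes.extend(p)
    return "".join(graphemes), phonemes

def hallucinate_corpus(size, seed=0):
    """Generate `size` synthetic pairs to mix with the real low-resource training data."""
    rng = random.Random(seed)
    return [hallucinate_pair(rng=rng) for _ in range(size)]

if __name__ == "__main__":
    for word, pron in hallucinate_corpus(5):
        print(word, " ".join(pron))
```

In a setup like this, the synthetic pairs would simply be concatenated with the real grapheme-phoneme lexicon before training the Transformer; the alignment step that produces the chunk inventory is assumed to have been run beforehand.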