Combating Word-level Adversarial Text with Robust Adversarial Training


Abstract

NLP models perform well on many tasks, but they are easily fooled by adversarial examples. A small perturbation can change the output of a deep neural network model. Such perturbations are hard for humans to perceive, especially in adversarial examples generated by word-level adversarial attacks. Character-level adversarial attacks can be defended against with grammar detection and word recognition. Existing word-level textual adversarial attacks, however, are based on synonym replacement, so the adversarial texts usually have correct grammar and semantics, which makes defending against word-level attacks more challenging. In this paper, we propose a framework called Robust Adversarial Training (RAT) to defend against word-level adversarial attacks. RAT enhances the model by combining adversarial training and data perturbation during training. Our experiments on two datasets show that models trained with our framework can effectively defend against word-level adversarial attacks. Compared with existing defense methods, the model trained under RAT has a higher defense success rate on 1000 adversarial examples. In addition, the accuracy of our model on the standard test set is better than that of existing defense methods, and is very close to or even higher than that of the standard model.
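The abstract does not spell out how RAT perturbs data, so the following is only a minimal sketch of the general idea it mentions: word-level perturbation by synonym replacement, applied to training examples. The synonym table `SYNONYMS`, the functions `perturb` and `augment_batch`, and the `rate` parameter are all hypothetical names introduced for illustration; this is not the authors' implementation.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration only);
# a real system would draw candidates from a lexical resource or embeddings.
SYNONYMS = {
    "good": ["great", "fine"],
    "bad": ["poor", "awful"],
    "movie": ["film"],
}


def perturb(tokens, rate=0.3, rng=None):
    """Replace each token that has known synonyms with probability `rate`."""
    if rng is None:
        rng = random.Random()
    out = []
    for tok in tokens:
        candidates = SYNONYMS.get(tok.lower())
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def augment_batch(batch, rate=0.3, rng=None):
    """Return the original (tokens, label) pairs plus one perturbed copy of each.

    Labels are kept unchanged, on the assumption that synonym replacement
    preserves the meaning of the text.
    """
    if rng is None:
        rng = random.Random()
    augmented = list(batch)
    for tokens, label in batch:
        augmented.append((perturb(tokens, rate, rng), label))
    return augmented


if __name__ == "__main__":
    batch = [(["a", "good", "movie"], 1), (["a", "bad", "movie"], 0)]
    for tokens, label in augment_batch(batch, rate=0.5, rng=random.Random(0)):
        print(label, " ".join(tokens))
```

In a training loop, a function like `augment_batch` would be called on each mini-batch before the forward pass, so the model sees both clean and synonym-perturbed versions of the same labeled text.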

Bibliographic details

Source: International Joint Conference on Neural Networks, 2021, pp. 1-8 (8 pages)
Authors: Xiaohu Du; Jie Yu; Shasha Li; Zibo Yi; Hai Liu; Jun Ma
Original format: PDF
Keywords: Training; Deep learning; Perturbation methods; Semantics; Neural networks; Radio access technologies; Grammar

