
Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots


Abstract

Multilingual models have demonstrated impressive cross-lingual transfer performance. However, test sets like XNLI are monolingual at the example level. In multilingual communities, it is common for polyglots to code-mix when conversing with each other. Inspired by this phenomenon, we present two strong black-box adversarial attacks (one word-level, one phrase-level) for multilingual models that push their ability to handle code-mixed sentences to the limit. The former uses bilingual dictionaries to propose perturbations, with translations of the clean example used for word-sense disambiguation. The latter directly aligns the clean example with its translations before extracting phrases as perturbations. Our phrase-level attack has a success rate of 89.75% against XLM-R_large, bringing its average accuracy on XNLI down from 79.85 to 8.18. Finally, we propose an efficient adversarial training scheme that trains in the same number of steps as the original model, and show that it improves model accuracy.
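To make the word-level mechanism concrete, below is a minimal Python sketch of a greedy bilingual-dictionary attack against a black-box classifier. The `predict_proba` callback and the toy `BILINGUAL_DICT` are illustrative assumptions (not the authors' released implementation or resources), and the sketch omits the sense-disambiguation step the paper performs using translations of the clean example.

```python
# Minimal sketch of the word-level code-mixing attack idea: greedily swap
# words for bilingual-dictionary translations, keeping any substitution
# that lowers the victim model's confidence in the gold label.
# `predict_proba` and the toy BILINGUAL_DICT are illustrative assumptions.

from typing import Callable, Dict, List

# Toy English->Spanish entries; the paper draws candidates from real
# bilingual dictionaries spanning many languages.
BILINGUAL_DICT: Dict[str, List[str]] = {
    "movie": ["película"],
    "good": ["buena", "bueno"],
    "very": ["muy"],
}

def code_mix_attack(
    tokens: List[str],
    gold_label: int,
    predict_proba: Callable[[str], List[float]],
) -> List[str]:
    """Greedy black-box search over single-word translation swaps."""
    best = list(tokens)
    best_score = predict_proba(" ".join(best))[gold_label]
    for i, tok in enumerate(tokens):
        for candidate in BILINGUAL_DICT.get(tok.lower(), []):
            trial = best[:i] + [candidate] + best[i + 1:]
            score = predict_proba(" ".join(trial))[gold_label]
            if score < best_score:  # keep swaps that hurt the model
                best, best_score = trial, score
    return best

# Example (with a hypothetical NLI model's probability function):
#   adv = code_mix_attack("the movie was very good".split(),
#                         gold_label=0, predict_proba=model_fn)
```

Because the search only reads output probabilities, it matches the black-box setting described in the abstract; the phrase-level attack works analogously but substitutes aligned multi-word spans from full translations instead of single dictionary entries.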
