
Robust Multilingual Part-of-Speech Tagging via Adversarial Training



Abstract

Adversarial training (AT) is a powerful regularization method for neural networks, aimed at achieving robustness to input perturbations. Yet the specific effects of the robustness obtained from AT remain unclear in the context of natural language processing. In this paper, we propose and analyze a neural POS tagging model that exploits AT. In experiments on the Penn Treebank WSJ corpus and the Universal Dependencies (UD) dataset (27 languages), we find that AT not only improves the overall tagging accuracy, but also 1) effectively prevents overfitting in low-resource languages and 2) boosts tagging accuracy for rare/unseen words. We further demonstrate that 3) the tagging improvements from AT carry over to the downstream task of dependency parsing, 4) AT helps the model learn cleaner word representations, and 5) the proposed AT model is generally effective across different sequence labeling tasks. These positive results motivate further use of AT for natural language tasks.
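To make the core idea concrete: adversarial training perturbs the model input in the direction that most increases the loss, then trains on the perturbed input as well. Below is a minimal, self-contained sketch of that perturbation step. It uses a toy binary logistic classifier in place of the paper's neural tagger, with the gradient w.r.t. the input embedding computed analytically; the weights, embedding values, and epsilon are all illustrative, not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def loss(w, x, y):
    """Binary cross-entropy of a logistic classifier on input embedding x."""
    p = sigmoid(dot(w, x))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, x, y):
    """Analytic gradient of the loss w.r.t. the input x: (p - y) * w."""
    p = sigmoid(dot(w, x))
    return [(p - y) * wi for wi in w]

def adversarial_input(w, x, y, eps):
    """Shift x by eps along the L2-normalized loss gradient --
    the worst-case perturbation direction to first order."""
    g = input_gradient(w, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [xi + eps * gi / norm for xi, gi in zip(x, g)]

# Toy numbers: w stands in for the tagger, x for a word embedding.
w, x, y, eps = [1.0, -2.0, 0.5], [0.3, 0.1, -0.4], 1, 0.1
x_adv = adversarial_input(w, x, y, eps)
clean_loss, adv_loss = loss(w, x, y), loss(w, x_adv, y)
# AT minimizes clean_loss + adv_loss; the perturbation strictly
# raises the loss relative to the clean input.
print(clean_loss < adv_loss)  # True
```

In the paper's setting the same perturbation is applied to the concatenated word/character embeddings of a sentence, with the gradient obtained by backpropagation rather than in closed form.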
