Syntax-augmented Multilingual BERT for Cross-lingual Transfer

Abstract

In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning. However, due to typological differences across languages, cross-lingual transfer is challenging. Nevertheless, language syntax, e.g., syntactic dependencies, can bridge the typological gap. Previous work has shown that pre-trained multilingual encoders, such as mBERT (Devlin et al., 2019), capture language syntax, helping cross-lingual transfer. This work shows that explicitly providing language syntax and training mBERT with an auxiliary objective to encode the universal dependency tree structure helps cross-lingual transfer. We perform rigorous experiments on four NLP tasks: text classification, question answering, named entity recognition, and task-oriented semantic parsing. The experimental results show that syntax-augmented mBERT improves cross-lingual transfer on popular benchmarks, such as PAWS-X and MLQA, by 1.4 and 1.6 points on average across all languages. In the generalized transfer setting, performance is boosted significantly, by 3.9 and 3.1 points on average on PAWS-X and MLQA.
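To make the auxiliary-objective idea more concrete, below is a minimal illustrative sketch of one way dependency structure could be injected during mBERT fine-tuning: token representations score every candidate head for each token (a bilinear arc scorer), and the resulting head-prediction cross-entropy is added to the main task loss. This is an assumption for illustration only, not the authors' implementation; the class name, the head_ids input, and the 0.5 auxiliary-loss weight are placeholders.

# Illustrative sketch (not the paper's released code): mBERT fine-tuning with an
# auxiliary dependency-head prediction loss, assuming gold UD head indices are
# available for the source-language training data.
import torch
import torch.nn as nn
from transformers import BertModel

class SyntaxAugmentedMBert(nn.Module):
    def __init__(self, num_labels, model_name="bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_labels)   # main task head ([CLS] classification)
        self.arc_query = nn.Linear(hidden, hidden)         # auxiliary head: dependent representation
        self.arc_key = nn.Linear(hidden, hidden)           # auxiliary head: candidate-head representation

    def forward(self, input_ids, attention_mask, labels=None, head_ids=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        hidden = out.last_hidden_state                     # (batch, seq, dim)
        task_logits = self.classifier(hidden[:, 0])        # [CLS] vector for the main task

        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(task_logits, labels)
        if head_ids is not None:
            # Bilinear arc scores: each token scores every position as its dependency head.
            q = self.arc_query(hidden)                      # (batch, seq, dim)
            k = self.arc_key(hidden)                        # (batch, seq, dim)
            arc_scores = torch.matmul(q, k.transpose(1, 2)) # (batch, seq, seq)
            aux_loss = nn.functional.cross_entropy(
                arc_scores.view(-1, arc_scores.size(-1)),
                head_ids.view(-1),
                ignore_index=-100,                          # mask padding / non-first subwords
            )
            # 0.5 is an assumed weighting between task and auxiliary objectives.
            loss = aux_loss if loss is None else loss + 0.5 * aux_loss
        return task_logits, loss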
