Conference on Empirical Methods in Natural Language Processing

X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset



Abstract

Even though SRL is researched for many languages, major improvements have mostly been obtained for English, for which more resources are available. In fact, existing multilingual SRL datasets contain disparate annotation styles or come from different domains, hampering generalization in multilingual learning. In this work, we propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish, with unified predicate and role annotations that are fully comparable across languages. We apply high-quality machine translation to the English CoNLL-09 dataset and use multilingual BERT to project its high-quality annotations to the target languages. We include human-validated test sets that we use to measure the projection quality, and show that projection is denser and more precise than a strong baseline. Finally, we train different SOTA models on our novel corpus for mono- and multilingual SRL, showing that the multilingual annotations improve performance especially for the weaker languages.
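To make the projection step concrete, the following Python sketch illustrates cross-lingual annotation projection with multilingual BERT in the spirit of the abstract: the English source sentence and its machine translation are encoded with mBERT, each labeled source word is greedily aligned to its most similar target word, and the SRL label is copied along that link. The model name, the cosine-similarity threshold, and the greedy one-to-one alignment are illustrative assumptions, not the authors' exact pipeline.

# Minimal sketch (assumptions noted above), not the X-SRL authors' exact method.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def word_embeddings(words):
    """Encode a pre-tokenized sentence and mean-pool subword vectors per word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]            # (num_subwords, dim)
    word_ids = enc.word_ids()
    vecs = []
    for i in range(len(words)):
        idx = [j for j, w in enumerate(word_ids) if w == i]   # subwords of word i
        vecs.append(hidden[idx].mean(dim=0))
    return torch.stack(vecs)                                   # (num_words, dim)

def project_labels(src_words, src_labels, tgt_words, threshold=0.5):
    """Copy each labeled source word's SRL tag to its most similar target word."""
    src_vecs = torch.nn.functional.normalize(word_embeddings(src_words), dim=-1)
    tgt_vecs = torch.nn.functional.normalize(word_embeddings(tgt_words), dim=-1)
    sim = src_vecs @ tgt_vecs.T                                # cosine similarities
    tgt_labels = ["O"] * len(tgt_words)
    for i, label in enumerate(src_labels):
        if label == "O":
            continue
        j = int(sim[i].argmax())
        if sim[i, j] >= threshold:                             # drop low-confidence links
            tgt_labels[j] = label
    return tgt_labels

# Example: project predicate/argument labels from English onto a German translation.
en = ["The", "cat", "chased", "the", "mouse"]
en_labels = ["O", "B-A0", "B-V", "O", "B-A1"]
de = ["Die", "Katze", "jagte", "die", "Maus"]
print(project_labels(en, en_labels, de))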
