Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation

Abstract

Pretrained contextual and non-contextual subword embeddings have become available in over 250 languages, allowing massively multilingual NLP. However, while there is no dearth of pretrained embeddings, the distinct lack of systematic evaluations makes it difficult for practitioners to choose between them. In this work, we conduct an extensive evaluation comparing non-contextual subword embeddings, namely FastText and BPEmb, and a contextual representation method, namely BERT, on multilingual named entity recognition and part-of-speech tagging. We find that overall, a combination of BERT, BPEmb, and character representations works well across languages and tasks. A more detailed analysis reveals different strengths and weaknesses: Multilingual BERT performs well in medium- to high-resource languages, but is outperformed by non-contextual subword embeddings in a low-resource setting.

Bibliographic record

Source: Annual Meeting of the Association for Computational Linguistics, 2019, pp. 273-291 (19 pages)
Authors: Benjamin Heinzerling; Michael Strube
Original format: PDF
Date indexed: 2022-08-26 13:54:07

