Context Sensitive Neural Lemmatization with Lematus


Abstract

The main motivation for developing context-sensitive lemmatizers is to improve performance on unseen and ambiguous words. Yet previous systems have not carefully evaluated whether the use of context actually helps in these cases. We introduce Lematus, a lemmatizer based on a standard encoder-decoder architecture, which incorporates character-level sentence context. We evaluate its lemmatization accuracy across 20 languages in both a full data setting and a lower-resource setting with 10k training examples in each language. In both settings, we show that including context significantly improves results against a context-free version of the model. Context helps more for ambiguous words than for unseen words, though the latter has a greater effect on overall performance differences between languages. We also compare to three previous context-sensitive lemmatization systems, which all use pre-extracted edit trees as well as hand-selected features and/or additional sources of information such as tagged training data. Without using any of these, our context-sensitive model outperforms the best competitor system (Lemming) in the full-data setting, and performs on par in the lower-resource setting.
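
To make "character-level sentence context" concrete, the following is a minimal sketch of how a target word and its neighbouring characters might be turned into one input sequence for an encoder-decoder model. It is an illustration only, not the authors' preprocessing: the function name `encode_with_context`, the marker symbols `<lc>`, `<rc>` and `<s>`, and the default window of 10 characters are all assumptions, since the abstract does not specify them.

```python
from typing import List

# Hypothetical marker symbols; the abstract does not name any.
LEFT, RIGHT, SPACE = "<lc>", "<rc>", "<s>"


def chars(text: str) -> List[str]:
    """Split text into character symbols, mapping spaces to a visible token."""
    return [SPACE if c == " " else c for c in text]


def encode_with_context(tokens: List[str], index: int, window: int = 10) -> List[str]:
    """Build a character sequence for tokens[index], surrounded by up to
    `window` characters of sentence context on each side.

    With window=0 the result contains only the target word, which
    corresponds to a context-free input.
    """
    left = " ".join(tokens[:index])
    right = " ".join(tokens[index + 1:])
    # Guard window == 0 explicitly: left[-0:] would return the whole string.
    left_chars = chars(left[-window:]) if window > 0 else []
    right_chars = chars(right[:window]) if window > 0 else []
    return left_chars + [LEFT] + list(tokens[index]) + [RIGHT] + right_chars


if __name__ == "__main__":
    sentence = ["she", "saw", "the", "dogs"]
    print(" ".join(encode_with_context(sentence, 1, window=4)))
    print(" ".join(encode_with_context(sentence, 1, window=0)))
```

Setting `window=0` gives a context-free input of the kind the abstract uses as its baseline, so the same function can produce both variants for a comparison.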


Bibliographic record

Source
Annual conference of the North American Chapter of the Association for Computational Linguistics: human language technologies | 2018 | pp. 1391-1400 | 10 pages
Authors
Toms Bergmanis; Sharon Goldwater
Original format
PDF
Date indexed
2022-08-26 13:51:17

Similar literature

1. Assessment of Mixed Sward Using Context Sensitive Convolutional Neural Networks [J]. Christopher J. Bateman, Jaco Fourie, Jeffrey Hsiao. Frontiers in Plant Science. 2020, Issue 12
2. Context-sensitive normalization of social media text in bahasa Indonesia based on neural word embeddings [J]. Renny Pradina Kusumawardani, Stezar Priansya, Faizal Johan Atletiko. Procedia Computer Science. 2018, Issue 22
3. Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks [J]. Martin Woellmer, Bjoern Schuller. Neurocomputing. 2014, Issue maya20
4. Context Sensitive Neural Lemmatization with Lematus [C]. Toms Bergmanis, Sharon Goldwater. Annual conference of the North American Chapter of the Association for Computational Linguistics: human language technologies. 2018
5. A neural model of attention, perceptual grouping, and context-sensitive processing in the laminar circuits of visual cortex [D]. Raizada, Rajeev David Samir. 2001
6. Assessment of Mixed Sward Using Context Sensitive Convolutional Neural Networks [O]. Christopher J. Bateman, Jaco Fourie, Jeffrey Hsiao. 2020
7. Context Sensitive Lemmatization Using Two Successive Bidirectional Gated Recurrent Networks [O]. Abhisek Chakrabarty, Onkar Arun Pandit, Utpal Garain. 2017
