Eighth Joint Conference on Lexical and Computational Semantics

Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: a Detailed Analysis

Abstract

Recent studies have shown that pre-trained contextual word embeddings, which assign the same word different vectors in different contexts, improve performance in many tasks. But while contextual embeddings can also be trained at the character level, the effectiveness of such embeddings has not been studied. We derive character-level contextual embeddings from Flair (Akbik et al., 2018), and apply them to a time normalization task, yielding major performance improvements over the previous state-of-the-art: 51% error reduction in news and 33% in clinical notes. We analyze the sources of these improvements, and find that pre-trained contextual character embeddings are more robust to term variations, infrequent terms, and cross-domain changes. We also quantify the size of context that pre-trained contextual character embeddings take advantage of, and show that such embeddings capture features like part-of-speech and capitalization.
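For readers who want to see how such embeddings are obtained in practice, the sketch below uses the flair library that the paper builds on. The `Sentence`/`FlairEmbeddings` usage follows flair's documented API; the final character-level step goes through flair internals (`lm.get_representation`), whose exact signature varies across flair versions, so treat that call as an assumption about how per-character hidden states can be read out, not a stable public interface.

```python
# A minimal sketch, assuming the flair library (Akbik et al., 2018).
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings

# Load a forward character-level language model pre-trained on news text.
embedding = FlairEmbeddings('news-forward')

# Documented word-level usage: the character LM is run over the raw
# string, and its hidden states at word boundaries become word embeddings.
sentence = Sentence('The patient returned three weeks later.')
embedding.embed(sentence)
for token in sentence:
    # Each token now carries a context-dependent vector.
    print(token.text, token.embedding.shape)

# The paper instead keeps a hidden state for *every character*.
# `lm` and `get_representation` are internal, version-dependent parts
# of flair (an assumption here, not a guaranteed API); they expose the
# LM's hidden state at each character position of the input string.
char_states = embedding.lm.get_representation(['three weeks later'])
```

Because the language model reads raw character streams, the per-character states above are available for any input string, including the unusual spellings and rare time expressions that the paper's analysis shows word-level embeddings handle poorly.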
