Eighth Joint Conference on Lexical and Computational Semantics

Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: a Detailed Analysis

Abstract

Recent studies have shown that pre-trained contextual word embeddings, which assign the same word different vectors in different contexts, improve performance in many tasks. But while contextual embeddings can also be trained at the character level, the effectiveness of such embeddings has not been studied. We derive character-level contextual embeddings from Flair (Akbik et al., 2018), and apply them to a time normalization task, yielding major performance improvements over the previous state-of-the-art: 51% error reduction in news and 33% in clinical notes. We analyze the sources of these improvements, and find that pre-trained contextual character embeddings are more robust to term variations, infrequent terms, and cross-domain changes. We also quantify the size of context that pre-trained contextual character embeddings take advantage of, and show that such embeddings capture features like part-of-speech and capitalization.
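For readers who want to see how such embeddings are obtained in practice, the sketch below uses the flair library that the paper builds on. The `Sentence`/`FlairEmbeddings` usage follows flair's documented API; the final character-level step goes through flair internals (`lm.get_representation`), whose exact signature varies across flair versions, so treat that call as an assumption about how per-character hidden states can be read out, not a stable public interface.

```python
# A minimal sketch, assuming the flair library (Akbik et al., 2018).
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings

# Load a forward character-level language model pre-trained on news text.
embedding = FlairEmbeddings('news-forward')

# Documented word-level usage: the character LM is run over the raw
# string, and its hidden states at word boundaries become word embeddings.
sentence = Sentence('The patient returned three weeks later.')
embedding.embed(sentence)
for token in sentence:
    # Each token now carries a context-dependent vector.
    print(token.text, token.embedding.shape)

# The paper instead keeps a hidden state for *every character*.
# `lm` and `get_representation` are internal, version-dependent parts
# of flair (an assumption here, not a guaranteed API); they expose the
# LM's hidden state at each character position of the input string.
char_states = embedding.lm.get_representation(['three weeks later'])
```

Because the language model reads raw character streams, the per-character states above are available for any input string, including the unusual spellings and rare time expressions that the paper's analysis shows word-level embeddings handle poorly.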
