Cross-lingual Semantic Specialization via Lexical Relation Induction

Abstract

Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) inducing noisy constraints in the target language through automatic word translation; and 2) filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We demonstrate the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages.
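The two transfer steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a plain dictionary lookup stands in for automatic word translation, a hand-written score table stands in for the trained relation prediction model, and the names `induce_constraints`, `filter_constraints`, and the `threshold` value are hypothetical. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# A constraint is (word1, word2, relation), e.g. ("hot", "cold", "antonym").
Constraint = Tuple[str, str, str]


def induce_constraints(
    source_constraints: List[Constraint],
    translate: Dict[str, str],
) -> List[Constraint]:
    """Step 1: project source-language constraints into the target language.

    Assumption: `translate` is a toy word-to-word dictionary standing in
    for automatic word translation. Pairs with an untranslatable word, or
    whose two words collapse to the same translation, are dropped (this
    dropping rule is also an assumption of the sketch).
    """
    noisy = []
    for w1, w2, rel in source_constraints:
        t1, t2 = translate.get(w1), translate.get(w2)
        if t1 is None or t2 is None or t1 == t2:
            continue
        noisy.append((t1, t2, rel))
    return noisy


def filter_constraints(
    noisy: List[Constraint],
    score: Callable[[Constraint], float],
    threshold: float = 0.5,  # hypothetical cut-off
) -> List[Constraint]:
    """Step 2: keep only constraints the relation scorer finds plausible.

    Assumption: `score` stands in for a relation prediction model trained
    on source-language constraints; here it is any callable returning a
    confidence in [0, 1].
    """
    return [c for c in noisy if score(c) >= threshold]


# Toy English -> German example with made-up scores.
source = [
    ("big", "large", "synonym"),
    ("hot", "cold", "antonym"),
    ("car", "automobile", "synonym"),
]
en_de = {
    "big": "groß", "large": "groß",
    "hot": "heiß", "cold": "kalt",
    "car": "Auto", "automobile": "Wagen",
}
toy_scores = {
    ("heiß", "kalt", "antonym"): 0.9,
    ("Auto", "Wagen", "synonym"): 0.3,
}

noisy = induce_constraints(source, en_de)
refined = filter_constraints(noisy, lambda c: toy_scores.get(c, 0.0))
print(refined)
```

In this toy run the pair ("big", "large") is dropped in step 1 because both words map to the same target word, and the low-scoring ("Auto", "Wagen") pair is removed in step 2. The refined list is what would then be passed to a specialization procedure for the target-language vectors.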