International Joint Conference on Natural Language Processing

Cross-lingual Semantic Specialization via Lexical Relation Induction


Abstract

Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) inducing noisy constraints in the target language through automatic word translation; and 2) filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We prove the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages.
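
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy English-German dictionary, the confidence table, and the 0.5 threshold are all hypothetical. A plain dictionary lookup stands in for automatic word translation, and a lookup table stands in for the relation prediction model trained on source-language constraints.

```python
from typing import Callable, Iterable, List, Tuple

# A constraint is (word1, word2, relation), e.g. ("old", "new", "antonym").
Constraint = Tuple[str, str, str]


def induce_noisy_constraints(
    source_constraints: Iterable[Constraint],
    translate: Callable[[str], List[str]],
) -> List[Constraint]:
    """Step 1: project source-language constraints into the target language.

    Every combination of candidate translations is kept, so polysemous
    source words introduce wrong (noisy) target-language constraints.
    """
    noisy = []
    for w1, w2, rel in source_constraints:
        for t1 in translate(w1):
            for t2 in translate(w2):
                if t1 != t2:
                    noisy.append((t1, t2, rel))
    return noisy


def filter_constraints(
    noisy: Iterable[Constraint],
    score: Callable[[str, str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Constraint]:
    """Step 2: keep only constraints the relation scorer considers plausible."""
    return [c for c in noisy if score(*c) >= threshold]


# Toy stand-in for automatic word translation (English -> German).
# "fast" is ambiguous: "schnell" (quick) or "fest" (firmly fixed).
TOY_DICTIONARY = {
    "fast": ["schnell", "fest"],
    "quick": ["rasch"],
    "old": ["alt"],
    "new": ["neu"],
}

# Toy stand-in for a trained relation prediction model: fixed confidences.
TOY_SCORES = {
    ("schnell", "rasch", "synonym"): 0.90,
    ("fest", "rasch", "synonym"): 0.10,
    ("alt", "neu", "antonym"): 0.95,
}


def toy_translate(word: str) -> List[str]:
    return TOY_DICTIONARY.get(word, [])


def toy_score(w1: str, w2: str, rel: str) -> float:
    return TOY_SCORES.get((w1, w2, rel), 0.0)


source = [("fast", "quick", "synonym"), ("old", "new", "antonym")]
noisy = induce_noisy_constraints(source, toy_translate)
refined = filter_constraints(noisy, toy_score)
```

Here `noisy` contains three German constraints, including the spurious synonym pair ("fest", "rasch") that arises from the ambiguity of "fast"; `refined` drops that pair and keeps the other two. The refined list is what would then be passed to a specialization procedure for the target-language vectors.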
