
Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations

Abstract

Using a dictionary to map independently trained word embeddings to a shared space has been shown to be an effective approach for learning bilingual word embeddings. In this work, we propose a multi-step framework of linear transformations that generalizes a substantial body of previous work. The core step of the framework is an orthogonal transformation, and existing methods can be explained in terms of the additional normalization, whitening, re-weighting, de-whitening, and dimensionality reduction steps. This allows us to gain new insights into the behavior of existing methods, including the effectiveness of inverse regression, and to design a novel variant that obtains the best published results in zero-shot bilingual lexicon extraction. The corresponding software is released as an open source project.
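
As an illustration of the core step named in the abstract, the following is a minimal sketch of an orthogonal mapping between two embedding matrices, preceded by length normalization. It is not the released software: the function names `length_normalize` and `orthogonal_map` are hypothetical, the data is synthetic, and the sketch assumes that row i of `x` and row i of `z` hold the embeddings of a dictionary pair. The orthogonal map is computed with the standard closed-form solution to the orthogonal Procrustes problem; the whitening, re-weighting, de-whitening, and dimensionality reduction steps are omitted.

```python
import numpy as np


def length_normalize(m):
    """Scale every row (word vector) to unit Euclidean length."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged
    return m / norms


def orthogonal_map(x, z):
    """Return the orthogonal W that minimizes ||x @ W - z||_F.

    Assumes x and z are (n_pairs, dim) matrices whose rows are aligned
    by a bilingual dictionary. Closed-form Procrustes solution:
    W = U V^T, where U S V^T is the SVD of x^T z.
    """
    u, _, vt = np.linalg.svd(x.T @ z)
    return u @ vt


# Synthetic check: build "target" embeddings as a random rotation of the
# "source" embeddings, then verify that the rotation is recovered.
rng = np.random.default_rng(0)
x = length_normalize(rng.standard_normal((50, 8)))
q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal matrix
z = x @ q

w = orthogonal_map(x, z)
is_orthogonal = np.allclose(w @ w.T, np.eye(8))
maps_source_to_target = np.allclose(x @ w, z)
```

Because `w` is orthogonal, it preserves dot products and vector lengths in the source space, so the length normalization applied beforehand is not undone by the mapping.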
