Cross-lingual document similarity estimation and dictionary generation with comparable corpora

Stajner Tadej; Mladenic Dunja


Abstract

This paper proposes an approach for performing bilingual dictionary generation even when trained on widely available comparable bilingual corpora. We also show its capability to provide cross-lingual similarity estimates that correlate well with human judgments. We implement an approach using a nonlinear bilingual translation model that we train using comparable corpora. We propose a method using word embeddings and kernel approximation to train scalable nonlinear transformations. We demonstrate that this novel method works better on a majority of evaluated language pairs.
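The abstract does not say which kernel approximation or training objective the authors use. As a minimal sketch of the general idea only, the example below assumes random Fourier features (a standard approximation of an RBF kernel) followed by closed-form ridge regression from a source-language embedding space to a target-language one. The data is synthetic, and every name and parameter (`random_fourier_features`, `fit_ridge`, `gamma`, `reg`, the dimensions) is hypothetical rather than taken from the paper.

```python
import numpy as np


def random_fourier_features(X, W, b):
    """Approximate RBF feature map: z(x) = sqrt(2/D) * cos(xW + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)


def fit_ridge(Z, Y, reg=1e-3):
    """Closed-form ridge regression from features Z to target embeddings Y."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + reg * np.eye(D), Z.T @ Y)


def cosine_similarity(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T


rng = np.random.default_rng(0)
n_pairs, dim_src, dim_tgt, n_features = 200, 8, 6, 256

# Synthetic stand-ins for aligned source/target embeddings (assumed data,
# not real word vectors): the target is a nonlinear function of the source.
X_src = rng.normal(size=(n_pairs, dim_src))
M = rng.normal(size=(dim_src, dim_tgt)) / np.sqrt(dim_src)
Y_tgt = np.tanh(X_src @ M)

# Random projection for exp(-gamma * ||x - y||^2); gamma is an assumed bandwidth.
gamma = 0.05
W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim_src, n_features))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

# Nonlinear map = fixed random features + learned linear projection P.
Z = random_fourier_features(X_src, W, b)
P = fit_ridge(Z, Y_tgt)
Y_pred = Z @ P

# Training error of the learned map versus a trivial all-zeros prediction.
mse = float(np.mean((Y_pred - Y_tgt) ** 2))
baseline_mse = float(np.mean(Y_tgt ** 2))

# Dictionary-style lookup: nearest target vector for each mapped source vector.
sims = cosine_similarity(Y_pred, Y_tgt)
nearest = sims.argmax(axis=1)
accuracy = float((nearest == np.arange(n_pairs)).mean())
```

Because the random features are fixed, only a linear solve is needed for training, which is one reason kernel approximations of this kind scale better than exact kernel methods. The same cosine similarity could in principle be applied to aggregated document vectors for cross-lingual similarity estimation, though how the paper builds document representations is not stated here.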


Bibliographic details

Source
Knowledge and Information Systems | 2019, Issue 3 | 15 pages
Authors
Stajner Tadej; Mladenic Dunja
Affiliations
Jozef Stefan Institute, Jozef Stefan International Postgraduate School, Jamova Ulica 39, Ljubljana 1000, Slovenia (listed for both authors)

Original format: PDF
Language: English
Chinese Library Classification: Automatic information theory
Keywords
Cross-lingual text analysis; Vector space machine translation; Representation learning; Comparable corpora; Similarity learning; Dictionary generation

