Venue: Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web

Learning the Optimal use of Dependency-parsing Information for Finding Translations with Comparable Corpora


Abstract

Using comparable corpora to find new word translations is a promising approach for extending bilingual dictionaries (semi-)automatically. The basic idea rests on the assumption that similar words have similar contexts across languages. The context of a word is often summarized by the bag of words in the sentence, or by the words in certain dependency positions, e.g. the predecessors and successors. These different context positions are then combined into one context vector and compared across languages. However, previous research makes the (implicit) assumption that these different context positions should be weighted as equally important. Furthermore, only the same context positions are compared with each other; for example, the successor position in Spanish is compared with the successor position in English. This is not necessarily appropriate for language pairs like Japanese and English. To overcome these limitations, we propose performing a linear transformation of the context vectors, defined by a matrix. We define the optimal transformation matrix using a Bayesian probabilistic model, and show that it is feasible to find an approximate solution using Markov chain Monte Carlo methods. Our experiments demonstrate that the proposed method consistently improves translation accuracy.
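
To illustrate the idea of transforming context vectors with a matrix, here is a minimal sketch in Python. It is not the authors' model. It assumes each word has one count vector per context position, that all vectors already share the same dimensions (for example via a seed dictionary), and that the matrix mixes positions as out[i] = sum_j matrix[i][j] * vectors[j]. The position names, the toy counts, and the function names are hypothetical. Learning the matrix with a Bayesian model and Markov chain Monte Carlo is not shown.

```python
import math

# Hypothetical context positions (illustrative assumption, not from the paper).
POSITIONS = ["predecessor", "successor"]


def transform(vectors, matrix):
    """Mix per-position vectors: out[i] = sum_j matrix[i][j] * vectors[j]."""
    dim = len(vectors[0])
    return [
        [sum(row[j] * vectors[j][k] for j in range(len(vectors))) for k in range(dim)]
        for row in matrix
    ]


def flatten(vectors):
    """Concatenate the per-position vectors into one context vector."""
    return [value for vec in vectors for value in vec]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity(source_vectors, target_vectors, matrix):
    """Compare a transformed source word with a target word."""
    return cosine(flatten(transform(source_vectors, matrix)), flatten(target_vectors))


# Toy counts over three shared dimensions, one vector per position.
# The source word sees its context words as predecessors,
# the target word sees the same context words as successors.
source = [[2.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
target = [[0.0, 0.0, 0.0], [2.0, 0.0, 1.0]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # baseline: same positions, equal weights
SWAP = [[0.0, 1.0], [1.0, 0.0]]      # compares predecessor with successor

print("identity:", similarity(source, target, IDENTITY))
print("swap:", similarity(source, target, SWAP))
```

With the identity matrix the sketch reduces to the baseline described above, where only the same positions are compared and all are weighted equally. Off-diagonal entries let one position in the source language be matched against a different position in the target language, and unequal diagonal entries weight positions differently. The swap matrix is an extreme hand-picked case; a learned matrix would presumably lie somewhere in between.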
