IJCNLP 2011
It Takes Two to Tango: A Bilingual Unsupervised Approach for Estimating Sense Distributions Using Expectation Maximization

Abstract

Several bilingual word sense disambiguation (WSD) algorithms that exploit translation correspondences in parallel corpora have been proposed. However, obtaining such parallel corpora is itself a tall order for some of the world's resource-constrained languages. We propose an unsupervised, bilingual, EM-based algorithm that relies on translation counts to estimate sense distributions. No parallel or sense-annotated corpora are needed. The algorithm requires only a synset-aligned bilingual dictionary and in-domain corpora from the two languages. It uses a symmetric generalized Expectation Maximization formulation in which the sense distributions of words in one language are estimated from the raw counts of the words in the aligned synset in the other language. When tested on four language-domain pairs, the overall performance of our algorithm is better than that of current state-of-the-art knowledge-based and bilingual unsupervised approaches.
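The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that each word's sense distribution is re-estimated from the corpus counts of the other language's words in the same aligned synset, weighted by those words' current sense probabilities, and that the two languages are updated alternately. The function names, the toy synset dictionary, and the counts are all hypothetical.

```python
from collections import defaultdict


def words_to_synsets(synsets, lang):
    """Invert the dictionary: word -> list of synset ids that contain it."""
    index = defaultdict(list)
    for sid, members in synsets.items():
        for word in members[lang]:
            index[word].append(sid)
    return index


def uniform_init(index):
    """Start every word with a uniform distribution over its synsets."""
    return {w: {s: 1.0 / len(sids) for s in sids} for w, sids in index.items()}


def update(synsets, index, other_lang, other_dist, other_counts, previous):
    """Re-estimate P(synset | word) for one language from the other language.

    Assumed scoring rule (not taken from the paper): the score of a synset is
    the sum, over the other language's words in that synset, of
    P(synset | other_word) * raw_count(other_word).
    """
    new = {}
    for word, sids in index.items():
        scores = {
            sid: sum(
                other_dist[v][sid] * other_counts.get(v, 0)
                for v in synsets[sid][other_lang]
            )
            for sid in sids
        }
        total = sum(scores.values())
        # Keep the previous estimate if no aligned word was ever observed.
        new[word] = (
            {s: x / total for s, x in scores.items()} if total > 0 else previous[word]
        )
    return new


def estimate_sense_distributions(synsets, counts, iterations=10):
    """Alternate the two symmetric updates for a fixed number of iterations."""
    langs = ("a", "b")
    index = {lang: words_to_synsets(synsets, lang) for lang in langs}
    dist = {lang: uniform_init(index[lang]) for lang in langs}
    for _ in range(iterations):
        dist["a"] = update(synsets, index["a"], "b", dist["b"], counts["b"], dist["a"])
        dist["b"] = update(synsets, index["b"], "a", dist["a"], counts["a"], dist["b"])
    return dist


# Hypothetical toy data: one ambiguous word in language "a", two synsets.
TOY_SYNSETS = {
    "finance": {"a": ["bank"], "b": ["banque"]},
    "river": {"a": ["bank"], "b": ["rive"]},
}
TOY_COUNTS = {"a": {"bank": 100}, "b": {"banque": 90, "rive": 10}}

if __name__ == "__main__":
    result = estimate_sense_distributions(TOY_SYNSETS, TOY_COUNTS)
    print(result["a"]["bank"])
```

On this toy input the two words in language "b" are unambiguous, so the estimate for "bank" settles after the first iteration at roughly 0.9 for the "finance" synset and 0.1 for the "river" synset, mirroring the 90:10 ratio of the raw counts. No parallel or sense-annotated data is used, only the aligned dictionary and per-language counts.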
