Conference of the European Chapter of the Association for Computational Linguistics

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes



Abstract

We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table without any parallel text or seed lexicon. First, we solve the memory bottleneck and enforce the sparsity with a simple thresholding scheme for the lexicon. Second, we initialize the lexicon training with word classes, which efficiently boosts the performance. Our methods produced promising results on two large-scale unsupervised translation tasks.
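The lexicon-thresholding idea mentioned in the abstract can be illustrated with a minimal sketch (not the paper's implementation): after each EM M-step, lexicon entries p(f|e) below a probability threshold are dropped, keeping the translation table sparse enough to fit in memory. The function name, threshold value, and toy counts below are hypothetical.

```python
from collections import defaultdict

def m_step_with_threshold(counts, threshold=1e-3):
    """counts: dict mapping (e, f) -> expected count from the E-step.
    Returns a sparse lexicon dict (e, f) -> p(f|e), dropping small entries."""
    totals = defaultdict(float)
    for (e, f), c in counts.items():
        totals[e] += c
    lexicon = {}
    for (e, f), c in counts.items():
        p = c / totals[e]
        if p >= threshold:  # sparsity: discard low-probability pairs
            lexicon[(e, f)] = p
    # renormalize surviving entries so each conditional p(.|e) sums to 1
    surv = defaultdict(float)
    for (e, f), p in lexicon.items():
        surv[e] += p
    return {(e, f): p / surv[e] for (e, f), p in lexicon.items()}

# toy usage with hypothetical expected counts
counts = {("haus", "house"): 9.0, ("haus", "home"): 0.9, ("haus", "cat"): 0.001}
print(m_step_with_threshold(counts))
```

In this sketch the low-probability pair is pruned before renormalization, so only the surviving entries need to be stored between EM iterations.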

