Conference: International Conference on Asian Language Processing

Domain adaption based on lda and word embedding in SMT



Abstract

Current methods for domain adaptation in statistical machine translation (SMT) mostly assume that a small in-domain sample is available at training time. However, the actual target domain may not be known at training time, so the resulting system may not suit the actual translation task or may be far from what users need. We instead propose a method that avoids this situation. It has two stages: (1) we use word embeddings and an LDA model to divide the training corpus into semantically similar subdomains; (2) for an actual source sentence, we select a more suitable translation system using semantic clues. We run experiments on two language pairs and observe consistent improvements over three baselines.
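The abstract does not say how the embeddings and the LDA model are combined, so the two stages can only be illustrated loosely. The sketch below is an assumption-laden toy, not the authors' method: it uses hand-made 2-d word vectors (`EMB`, hypothetical values), averages them per sentence, and substitutes plain k-means for the LDA-based partitioning. The helper names `sentence_vector`, `kmeans`, and `route` are invented for illustration.

```python
import math

# Toy 2-d word embeddings (hypothetical values, for illustration only).
EMB = {
    "court": (1.0, 0.1), "judge": (0.9, 0.0), "law": (0.95, 0.2),
    "goal": (0.1, 1.0), "match": (0.0, 0.9), "team": (0.2, 0.95),
}

def sentence_vector(sentence):
    """Average the embeddings of known words; None if no word is known."""
    vecs = [EMB[w] for w in sentence.lower().split() if w in EMB]
    if not vecs:
        return None
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def nearest(vec, centroids):
    """Index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))

def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors serve as initial centroids.

    Stand-in for the paper's LDA + embedding partitioning (assumption).
    """
    centroids = list(vectors[:k])
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

# Stage 1: split a toy training corpus into two semantic subdomains.
corpus = [
    "the judge cited the law",
    "the team won the match",
    "court and judge",
    "goal for the team",
]
vectors = [sentence_vector(s) for s in corpus]
centroids = kmeans(vectors, k=2)
assignments = [nearest(v, centroids) for v in vectors]

# Stage 2: route a new source sentence to the closest subdomain, whose
# dedicated translation system would then be used (systems not modeled here).
def route(sentence):
    vec = sentence_vector(sentence)
    return None if vec is None else nearest(vec, centroids)
```

Here `route("a new law")` lands in the same subdomain as the legal training sentences, and a sentence with no known words returns `None`, which a real system would presumably map to a general-purpose fallback model.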
