A Hierarchical Clustering Approach to Fuzzy Semantic Representation of Rare Words in Neural Machine Translation


Abstract

In current encoder-decoder neural machine translation, rare words are usually replaced with a single token, which obscures their context and makes translation modeling harder. In this article, we propose a fuzzy semantic representation (FSR) method for rare words that groups them through hierarchical clustering, and we integrate it into the encoder-decoder framework. The hierarchical structure compensates for the missing semantic information on both the source and target sides and provides fuzzy context information that captures the semantics of rare words. The FSR also alleviates data sparseness, which is the bottleneck in handling rare words in neural machine translation. In particular, our method extends easily to the transformer-based neural machine translation model, where it learns FSRs for all in-vocabulary words, not only rare words, to enhance sentence representations. Our experiments on Chinese-to-English translation tasks confirm a significant improvement in translation quality from the proposed method.
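
The abstract does not state the clustering criterion, the embeddings used, or how the FSR enters the model, so the following is only a minimal sketch of the general idea of grouping rare words by hierarchical clustering instead of mapping them all to one token. Everything in it is an assumption made for illustration: the toy corpus and two-dimensional vectors, the average-linkage cosine clustering, the `<rare_k>` class tokens, the centroid used as a shared class vector, and all function names. It should not be read as the authors' implementation.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative_clusters(vectors, num_clusters):
    """Average-linkage agglomerative clustering (assumed criterion)."""
    clusters = [[word] for word in sorted(vectors)]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def centroid(words, vectors):
    """Mean vector of a word group, used here as a shared class vector."""
    dim = len(vectors[words[0]])
    return [sum(vectors[w][d] for w in words) / len(words) for d in range(dim)]


def build_rare_word_classes(corpus, vectors, min_count, num_clusters):
    """Map each rare word to a class token and give each class a vector."""
    counts = Counter(tok for sent in corpus for tok in sent)
    rare = {w: vectors[w] for w, c in counts.items()
            if c < min_count and w in vectors}
    word_to_class, class_vectors = {}, {}
    for idx, group in enumerate(agglomerative_clusters(rare, num_clusters)):
        token = f"<rare_{idx}>"  # hypothetical class token
        class_vectors[token] = centroid(group, rare)
        for word in group:
            word_to_class[word] = token
    return word_to_class, class_vectors


def replace_rare(sentence, word_to_class):
    """Replace rare words by their class token; keep other tokens."""
    return [word_to_class.get(tok, tok) for tok in sentence]


# Toy data (hypothetical): four rare words with made-up 2-d embeddings.
corpus = [["the", "ocelot", "sat"], ["the", "lynx", "sat"],
          ["the", "dingo", "sat"], ["the", "jackal", "sat"]]
vectors = {"ocelot": [0.9, 0.1], "lynx": [0.8, 0.2],
           "dingo": [0.1, 0.9], "jackal": [0.2, 0.8]}

word_to_class, class_vectors = build_rare_word_classes(
    corpus, vectors, min_count=2, num_clusters=2)
print(replace_rare(["the", "ocelot", "sat"], word_to_class))
```

In this toy setup the two cat-like words share one class token and the two dog-like words share another, so a downstream model would see a coarse semantic hint rather than one undifferentiated unknown-word token. A multi-level hierarchy, as the abstract suggests, could be approximated by recording the merges at several cluster counts.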

Bibliographic details

  • Source
    IEEE Transactions on Fuzzy Systems | 2020, No. 5 | pp. 992-1002 | 11 pages
  • Author affiliations

    Harbin Institute of Technology, Machine Intelligence & Translation Lab, School of Computer Science, Harbin 150001, People's Republic of China;

    Microsoft Research Asia, Beijing 100080, People's Republic of China;

    Harbin Institute of Technology, Machine Intelligence & Translation Lab, School of Computer Science, Harbin 150001, People's Republic of China;

    Harbin Institute of Technology, Machine Intelligence & Translation Lab, School of Computer Science, Harbin 150001, People's Republic of China;

    Harbin Institute of Technology, Machine Intelligence & Translation Lab, School of Computer Science, Harbin 150001, People's Republic of China;

    Harbin Institute of Technology, Machine Intelligence & Translation Lab, School of Computer Science, Harbin 150001, People's Republic of China;

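  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of full text: English
  • Chinese Library Classification
  • Keywords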

    Fuzzy semantic representation (FSR); hierarchical clustering; neural network; neural machine translation (NMT);
