Journal: Computational Intelligence and Neuroscience

Extracting Parallel Sentences from Nonparallel Corpora Using Parallel Hierarchical Attention Network



Abstract

Collecting parallel sentences from nonparallel data is a long-standing research problem in natural language processing. In particular, parallel training sentences are very important for the quality of machine translation systems. While many existing methods have shown encouraging results, they cannot learn the various alignment weights in parallel sentences. To address this issue, we propose a novel parallel hierarchical attention neural network that encodes monolingual sentences versus bilingual sentences and constructs a classifier to extract parallel sentences. In particular, the structure of our attention mechanism can learn different alignment weights for the words in parallel sentences. Experimental results show that our model obtains state-of-the-art performance on the English-French, English-German, and English-Chinese datasets of the BUCC 2017 shared task on parallel sentence extraction.
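
The abstract does not give the network's equations, so the following is only a minimal sketch of the general idea: word-level attention assigns each word a weight, the weighted word states are pooled into a sentence vector, and a classifier scores a sentence pair as parallel or not. Everything here is an assumption made for illustration, not the authors' architecture: the additive attention form, the feature combination, the names `attention_pool` and `parallel_score`, and the random parameters standing in for trained ones.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(H, W, b, u):
    """Pool word states H (n_words, d) into one sentence vector.

    Hypothetical additive attention: score_i = tanh(H_i W + b) . u
    Returns (alpha, s), where alpha are per-word weights summing to 1.
    """
    U = np.tanh(H @ W + b)   # (n_words, a) hidden representation per word
    alpha = softmax(U @ u)   # (n_words,) attention weight per word
    s = alpha @ H            # (d,) weighted sum of word states
    return alpha, s

def parallel_score(s_src, s_tgt, w, c):
    """Hypothetical logistic classifier over a pair of sentence vectors."""
    feats = np.concatenate([s_src, s_tgt, np.abs(s_src - s_tgt), s_src * s_tgt])
    return 1.0 / (1.0 + np.exp(-(feats @ w + c)))

# Random stand-ins for trained parameters and encoder outputs (assumptions).
rng = np.random.default_rng(0)
d, a = 8, 4
W, b, u = rng.normal(size=(d, a)), np.zeros(a), rng.normal(size=a)
H_src = rng.normal(size=(5, d))   # 5 source-language word states
H_tgt = rng.normal(size=(6, d))   # 6 target-language word states

alpha_src, s_src = attention_pool(H_src, W, b, u)
alpha_tgt, s_tgt = attention_pool(H_tgt, W, b, u)
p = parallel_score(s_src, s_tgt, rng.normal(size=4 * d), 0.0)
print(alpha_src, p)
```

In a trained model, sentence pairs whose score exceeds a chosen threshold would be kept as extracted parallel sentences; with the random parameters above the score carries no meaning.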
