Conference: IEEE Symposium Series on Computational Intelligence

Towards Contradiction Detection in German: a Translation-Driven Approach


Abstract

Despite recent advances in machine-learning-based Natural Language Processing (NLP), language dependency remains a limiting factor for the majority of NLP applications. Models are typically trained for English because very large labeled and unlabeled datasets are available, which also make it possible to fine-tune models for that language. Contradiction Detection is one such problem: it has found many practical applications in NLP but has so far been studied only for English. This paper examines a set of baseline methods for the Contradiction Detection task on German text. For this purpose, the well-known Stanford Natural Language Inference (SNLI) data set (110,000 sentence pairs) is machine-translated from English to German. We train and evaluate four classifiers on both the original and the translated data, using state-of-the-art textual data representations. Our main contributions are the first large-scale assessment of this problem in German and a validation of machine translation as a data generation method. We also present a novel approach to learning sentence embeddings by exploiting the hidden states of an encoder-decoder sequence-to-sequence RNN trained for autoencoding or translation.
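The idea of using an encoder's hidden state as a sentence embedding can be illustrated with a minimal sketch. The code below is not the authors' model: it uses a plain tanh RNN with random, untrained weights as a stand-in for the encoder half of a sequence-to-sequence network, and toy random word vectors. The names `encode`, `pair_features`, and `embed_tokens` are hypothetical, and combining the two sentence vectors by concatenation, absolute difference, and element-wise product is a common convention for sentence-pair classification that is assumed here, not taken from the abstract.

```python
import math
import random


def init_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def encode(token_vectors, w_in, w_rec):
    """Run a plain tanh RNN over the tokens and return the final hidden state.

    The final hidden state serves as the sentence embedding in this sketch.
    """
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    for x in token_vectors:
        # The comprehension reads the previous h before h is reassigned.
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
    return h


def pair_features(u, v):
    """Combine two sentence embeddings into one classifier input vector.

    Assumption: concatenation of u, v, |u - v| and u * v.
    """
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return u + v + diff + prod


EMBED_DIM, HIDDEN_DIM = 4, 3
rng = random.Random(0)
w_in = init_matrix(HIDDEN_DIM, EMBED_DIM, rng)
w_rec = init_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)

word_vectors = {}  # toy lookup table standing in for pretrained embeddings


def embed_tokens(sentence):
    """Map each lowercased token to a fixed random vector."""
    vectors = []
    for token in sentence.lower().split():
        if token not in word_vectors:
            word_vectors[token] = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
        vectors.append(word_vectors[token])
    return vectors


premise = encode(embed_tokens("Ein Mann spielt Gitarre"), w_in, w_rec)
hypothesis = encode(embed_tokens("Ein Mann schläft"), w_in, w_rec)
features = pair_features(premise, hypothesis)
print(len(features))
```

In a real setup the encoder weights would come from training the full encoder-decoder on autoencoding or translation, and `features` would be passed to a classifier that predicts whether the pair is contradictory.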
