PARANMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations

Abstract

We describe PARANMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs. We generated the pairs automatically by using neural machine translation to translate the non-English side of a large parallel corpus, following Wieting et al. (2017). Our hope is that PARANMT-50M can be a valuable resource for paraphrase generation and can provide a rich source of semantic knowledge to improve downstream natural language understanding tasks. To show its utility, we use PARANMT-50M to train paraphrastic sentence embeddings that outperform all supervised systems on every SemEval semantic textual similarity competition, in addition to showing how it can be used for paraphrase generation.
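
The pair-generation idea in the abstract (translate the non-English side of a parallel corpus into English and pair the result with the existing English reference) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function name `build_paraphrase_pairs`, the `translate_to_english` callable, the toy corpus, and the identical-pair filter are all assumptions introduced here, and the dictionary lookup merely stands in for a real neural machine translation system.

```python
from typing import Callable, Iterable, List, Tuple


def build_paraphrase_pairs(
    parallel_corpus: Iterable[Tuple[str, str]],
    translate_to_english: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each English reference with an English translation of its
    non-English counterpart (hypothetical helper, for illustration only)."""
    pairs = []
    for english_reference, foreign_sentence in parallel_corpus:
        back_translation = translate_to_english(foreign_sentence)
        # Assumption (not from the abstract): drop pairs whose two sides are
        # identical up to case and whitespace, since they carry no paraphrase.
        if back_translation.strip().lower() == english_reference.strip().lower():
            continue
        pairs.append((english_reference, back_translation))
    return pairs


# Toy stand-in for an NMT system: a fixed lookup table (hypothetical data).
toy_translations = {
    "Le chat dort.": "The cat is sleeping.",
    "Bonjour.": "Hello.",
}
toy_corpus = [
    ("The cat sleeps.", "Le chat dort."),
    ("Hello.", "Bonjour."),
]

pairs = build_paraphrase_pairs(toy_corpus, lambda s: toy_translations[s])
print(pairs)
```

In this toy run the second sentence is filtered out because the stand-in translation reproduces the reference exactly; a real system would need its own filtering criteria, which the abstract does not specify.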

Bibliographic details

Source
Annual Meeting of the Association for Computational Linguistics | 2018 | pp. 451-462 | 12 pages
Authors
John Wieting; Kevin Gimpel
Original format: PDF

Similar literature

1. Machine Translation Evaluation: Unveiling the Role of Dense Sentence Vector Embedding for Morphologically Rich Language [J]. Tripathi Samiksha, Kansal Vineet. International Journal of Pattern Recognition and Artificial Intelligence. 2020, No. 1
2. Sentence-Level Combination of Machine Translation Outputs with Syntactically Hybridized Translations [J]. Bo WANG, Yuanyuan ZHANG, Qian XU. IEICE Transactions on Information and Systems. 2014, No. 1
3. "This sentence is wrong." Detecting errors in machine-translated sentences [J]. Sylvain Raybaud, David Langlois, Kamel Smaili. Machine Translation. 2011, No. 1
4. PARANMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations [C]. John Wieting, Kevin Gimpel. Annual Meeting of the Association for Computational Linguistics. 2018
5. Hybrid System Combination for Machine Translation: An Integration of Phrase-level and Sentence-level Combination Approaches [D]. Ma, Wei-Yun. 2014
6. Pushing the Limits: Machine Preservation of the Liver as a Tool to Recondition High-Risk Grafts [O]. Yuri L. Boteon, Simon C. Afford, Hynek Mergental
7. ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations [O]. John Wieting, Kevin Gimpel. 2018
