Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings

Abstract

Ellipsis is a natural language phenomenon where part of a sentence is missing and its information must be recovered from the surrounding context, as in "Cats chase dogs and so do foxes." Formal semantics has different methods for resolving ellipsis and recovering the missing information, but the problem has not been considered for distributional semantics, where words have vector embeddings and combinations thereof provide embeddings for sentences. In elliptical sentences these combinations go beyond linear, as copying of elided information is necessary. In this paper, we develop different models for embedding VP-elliptical sentences. We extend existing verb disambiguation and sentence similarity datasets to ones containing elliptical phrases and evaluate our models on these datasets for a variety of non-linear combinations and their linear counterparts. We compare the results of these compositional models to state-of-the-art holistic sentence encoders. Our results show that non-linear addition and a non-linear tensor-based composition outperform the naive non-compositional baselines and the linear models, and that sentence encoders perform well on sentence similarity, but not on verb disambiguation.
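To illustrate the contrast the abstract draws between linear composition and non-linear, copying-based composition, the sketch below builds toy sentence embeddings for "Cats chase dogs and so do foxes." It is a minimal illustration under stated assumptions, not the paper's actual models: word vectors are random stand-ins, element-wise multiplication is used as one possible non-linear operator, and the verb is represented as a random matrix for the tensor-based variant; all names and parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 50

# Toy word vectors (random stand-ins for trained embeddings).
vec = {w: rng.standard_normal(DIM) for w in ["cats", "chase", "dogs", "foxes"]}

def linear_addition(tokens):
    """Linear composition: sum the vectors of the overt words only.
    The elided VP ("chase dogs") contributes nothing for the second subject."""
    return sum(vec[t] for t in tokens)

def copying_addition(subj1, verb, obj, subj2):
    """Non-linear variant: resolve the ellipsis by copying the elided VP
    embedding and re-using it for the second subject; element-wise
    multiplication serves as one possible non-linear combination."""
    vp = vec[verb] + vec[obj]        # embedding of the shared VP
    clause1 = vec[subj1] * vp        # "cats chase dogs"
    clause2 = vec[subj2] * vp        # "foxes (chase dogs)" -- copied VP
    return clause1 + clause2

# Tensor-based variant: the verb is a matrix acting on its object,
# a common choice in categorical compositional distributional semantics.
verb_matrix = rng.standard_normal((DIM, DIM))

def copying_tensor(subj1, obj, subj2):
    vp = verb_matrix @ vec[obj]                 # VP = verb tensor applied to object
    return vec[subj1] * vp + vec[subj2] * vp    # copy the VP into the elided clause

# "Cats chase dogs and so do foxes."
s_linear = linear_addition(["cats", "chase", "dogs", "foxes"])
s_copy = copying_addition("cats", "chase", "dogs", "foxes")
s_tensor = copying_tensor("cats", "dogs", "foxes")
print(s_linear.shape, s_copy.shape, s_tensor.shape)
```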
