Conference: Annual Meeting of the Association for Computational Linguistics

Vector space semantics with frequency-driven motifs



Abstract

Traditional models of distributional semantics suffer from computational issues such as data sparsity for individual lexemes and complexities of modeling semantic composition when dealing with structures larger than single lexical items. In this work, we present a frequency-driven paradigm for robust distributional semantics in terms of semantically cohesive lineal constituents, or motifs. The framework subsumes issues such as differential compositional as well as non-compositional behavior of phrasal constituents, and circumvents some problems of data sparsity by design. We design a segmentation model to optimally partition a sentence into lineal constituents, which can be used to define distributional contexts that are less noisy, semantically more interpretable, and linguistically disambiguated. Hellinger PCA embeddings learnt using the framework show competitive results on empirical tasks.
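The abstract names Hellinger PCA embeddings without giving details, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It assumes a small co-occurrence count matrix whose rows are motifs and whose columns are context motifs. The function name `hellinger_pca`, the toy counts, and the choice of two dimensions are all illustrative assumptions. Each row is normalized to a probability distribution, the square root is taken so that Euclidean distance between rows is proportional to Hellinger distance, and ordinary PCA is applied via SVD.

```python
import numpy as np


def hellinger_pca(counts, dim):
    """Embed the rows of a co-occurrence count matrix with Hellinger PCA.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    # Turn each row into a probability distribution over contexts
    # (rows that sum to zero are left as all zeros).
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = counts / np.where(row_sums == 0, 1.0, row_sums)
    # Square-root map: Euclidean distance between the resulting rows
    # is proportional to the Hellinger distance between distributions.
    roots = np.sqrt(probs)
    # PCA: centre the columns, then project onto the top right singular vectors.
    centred = roots - roots.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:dim].T


# Toy counts (assumed data): rows are motifs, columns are context motifs.
toy_counts = np.array([
    [8, 1, 0, 1],
    [7, 2, 0, 1],
    [0, 1, 9, 2],
    [1, 0, 8, 3],
])
embeddings = hellinger_pca(toy_counts, dim=2)
print(embeddings.shape)
```

In this toy setup, rows with similar context distributions (the first two, and the last two) end up close together in the embedded space, which is the property a distributional model relies on.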
