Conference: LREC-2012

Measuring the compositionality of NV expressions in Basque by means of distributional similarity techniques

Abstract

We present several experiments aimed at measuring the semantic compositionality of NV expressions in Basque. Our approach is based on the hypothesis that compositionality can be related to distributional similarity. The contexts of each NV expression are compared with the contexts of its corresponding components by means of different techniques, such as similarity measures usually used with the Vector Space Model (VSM), Latent Semantic Analysis (LSA), and some measures implemented in the Lemur Toolkit, such as the Indri index, tf-idf, the Okapi index and Kullback-Leibler divergence. Taking our previous work with cooccurrence techniques as a baseline, the results point to improvements using the Indri index or Kullback-Leibler divergence, and a slight further improvement when these are combined with cooccurrence measures such as t-score via rank aggregation. This work is part of a project for multiword expression (MWE) extraction and characterization using different techniques aimed at measuring the properties related to idiomaticity, such as institutionalization, non-compositionality and lexico-syntactic fixedness.
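
The core comparison in the abstract, contexts of an expression against contexts of its components, can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use the Lemur Toolkit; it assumes plain bag-of-words context vectors, and the function names, the add-alpha smoothing, and the toy English tokens are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def context_vector(contexts):
    """Bag-of-words counts over a list of tokenized context windows."""
    counts = Counter()
    for window in contexts:
        counts.update(window)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def kl_divergence(p_counts, q_counts, alpha=0.5):
    """KL(P || Q) over the joint vocabulary.

    Add-alpha smoothing is an assumption of this sketch; it keeps every
    probability non-zero so the logarithm is always defined.
    """
    vocab = p_counts.keys() | q_counts.keys()
    p_total = sum(p_counts.values()) + alpha * len(vocab)
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for w in vocab:
        p = (p_counts[w] + alpha) / p_total
        q = (q_counts[w] + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


# Invented toy contexts (placeholders, not corpus data).
expression = context_vector([["vote", "meeting", "plan"], ["plan", "agree"]])
component = context_vector([["plan", "meeting", "agenda"], ["vote", "count"]])

similarity = cosine(expression, component)
divergence = kl_divergence(expression, component)
```

Under the hypothesis stated in the abstract, a high similarity (or low divergence) between an expression and its components would be read as a sign of compositionality, and a low similarity as a sign of idiomaticity.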
