Language Resources and Evaluation

Cross level semantic similarity: an evaluation framework for universal measures of similarity


Abstract

Semantic similarity has typically been measured across items of approximately similar sizes. As a result, similarity measures have largely ignored the fact that different types of linguistic item can potentially have similar or even identical meanings, and are therefore designed to compare only one type of linguistic item. Furthermore, nearly all current similarity benchmarks within NLP contain pairs of approximately the same size, such as word or sentence pairs, preventing the evaluation of methods that are capable of comparing items of different sizes. To address this, we introduce a new semantic evaluation called cross-level semantic similarity (CLSS), which measures the degree to which the meaning of a larger linguistic item, such as a paragraph, is captured by a smaller item, such as a sentence. Our pilot CLSS task was presented as part of SemEval-2014, where it attracted 19 teams who submitted 38 systems. The CLSS data contains a rich mixture of pairs, spanning from paragraphs down to word senses, in order to fully evaluate similarity measures that are capable of comparing items of any type. Moreover, the data were drawn from diverse corpora beyond just newswire, including domain-specific texts and social media. We describe the annotation process and its challenges, including a comparison with crowdsourcing, and identify the factors that make the dataset a rigorous assessment of a method's quality. We also examine in detail the systems participating in the SemEval task to identify the common factors associated with high performance and the aspects that proved difficult for all systems. Our findings demonstrate that CLSS poses a significant challenge for similarity methods and provides clear directions for future work on universal similarity methods that can compare any pair of items.
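To make the task setup concrete, the sketch below illustrates the input-output contract a CLSS system must satisfy: given a larger linguistic item (e.g. a paragraph) and a smaller one (e.g. a sentence), it returns a graded similarity score. The bag-of-words cosine used here is only a minimal illustrative baseline, not a method from the paper or from any participating system; the function name clss_score is hypothetical.

import math
from collections import Counter

def clss_score(larger: str, smaller: str) -> float:
    """Hypothetical CLSS baseline: graded similarity between a larger
    linguistic item (e.g. a paragraph) and a smaller one (e.g. a sentence),
    approximated here with a simple bag-of-words cosine in [0, 1]."""
    a = Counter(larger.lower().split())
    b = Counter(smaller.lower().split())
    # Dot product over the shared vocabulary of the two items
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

paragraph = ("Semantic similarity has typically been measured across items "
             "of approximately similar sizes, such as pairs of words or "
             "pairs of sentences.")
sentence = "Similarity is usually measured between items of the same size."
print(f"{clss_score(paragraph, sentence):.3f}")  # graded score, not binary

A real system from the SemEval task would replace this crude lexical overlap with richer lexical, syntactic, and knowledge-based representations; the sketch only fixes the shape of the problem, namely that the two inputs may be of very different sizes and the output is a graded score rather than a binary judgment.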