Venue: International Workshop on Database and Expert Systems Applications

Consensus Similarity Measure for Short Text Clustering

Abstract

Measuring semantic similarity between short texts is challenging: because the texts are so short, changing even a few words can alter their meaning dramatically. In this paper, we propose a novel term similarity measure that yields better clustering performance than a state-of-the-art method. To achieve this, we combine knowledge-based and corpus-based term similarity measures so as to exploit the advantages of both approaches. We apply our method to a dialog-utterance dataset consisting of short dialog texts. An empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering.
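
The abstract does not say how the knowledge-based and corpus-based measures are combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical toy taxonomy (`PARENT`) as the knowledge source, co-occurrence cosine similarity as the corpus-based measure, a weighted average as the combination rule (`weight` is an invented parameter), and best-match averaging to lift term similarity to short texts.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy taxonomy (child -> parent) standing in for a knowledge base.
PARENT = {
    "flight": "travel", "ticket": "travel", "hotel": "travel",
    "pizza": "food", "pasta": "food",
    "travel": "entity", "food": "entity",
}

def ancestors(term):
    """Return the chain term, parent, grandparent, ... up to the root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def knowledge_sim(a, b):
    """Path-based similarity: 1 / (1 + edges between a and b), 0 if unrelated."""
    up_a, up_b = ancestors(a), ancestors(b)
    for dist_a, node in enumerate(up_a):
        if node in up_b:
            return 1.0 / (1.0 + dist_a + up_b.index(node))
    return 0.0

def cooccurrence_vectors(corpus):
    """For each term, count the terms that share a text with it."""
    vectors = {}
    for text in corpus:
        tokens = sorted(set(text.lower().split()))
        for a, b in combinations(tokens, 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors

def corpus_sim(a, b, vectors):
    """Cosine similarity between the co-occurrence vectors of a and b."""
    if a == b:
        return 1.0
    va, vb = vectors.get(a, {}), vectors.get(b, {})
    dot = sum(va[k] * vb[k] for k in va if k in vb)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def consensus_sim(a, b, vectors, weight=0.5):
    """Assumed combination rule: weighted average of the two measures."""
    return weight * knowledge_sim(a, b) + (1 - weight) * corpus_sim(a, b, vectors)

def text_sim(text_a, text_b, vectors, weight=0.5):
    """Symmetrised average of each term's best match in the other text."""
    ta, tb = text_a.lower().split(), text_b.lower().split()
    if not ta or not tb:
        return 0.0

    def directed(xs, ys):
        best = (max(consensus_sim(x, y, vectors, weight) for y in ys) for x in xs)
        return sum(best) / len(xs)

    return 0.5 * (directed(ta, tb) + directed(tb, ta))

# Made-up dialog utterances, used only to build the co-occurrence vectors.
corpus = [
    "book a flight ticket",
    "cancel my flight ticket",
    "order a pizza",
    "order pasta and pizza",
]
vectors = cooccurrence_vectors(corpus)
print(consensus_sim("flight", "ticket", vectors))
print(text_sim("book a flight", "cancel my ticket", vectors))
```

The resulting pairwise `text_sim` scores could then be passed to any clustering algorithm that accepts a precomputed similarity matrix. In practice the toy taxonomy would be replaced by a real lexical resource and the co-occurrence counts by statistics from a much larger corpus.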