International Conference on Data and Software Engineering

Using dictionary in a knowledge based algorithm for clustering short texts in Bahasa Indonesia

Abstract

Text clustering is important in many applications of information retrieval. This paper presents a study of clustering short texts in Bahasa Indonesia using a semantic similarity approach, in which a dictionary of synonyms and hyponyms is used to obtain information on word relatedness. We compare sentence similarity calculations based on lexical matching and on word similarity. More than 250 sentences are involved. Our experiment shows that clustering using sentence similarity based on lexical matching performs better, in terms of precision and F-measure, than clustering using sentence similarity based on the semantic approach.
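
To make the contrast between the two similarity calculations concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the formulas, so the Jaccard overlap used for lexical matching, the symmetric matched-word ratio used for the dictionary-based score, and the toy `RELATED` dictionary entries are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical toy dictionary (an assumption, not from the paper):
# each word maps to words listed as its synonyms or related terms.
RELATED: Dict[str, Set[str]] = {
    "mobil": {"kendaraan", "oto"},
    "besar": {"raya", "agung"},
}


def tokens(sentence: str) -> Set[str]:
    """Lowercase the sentence and split on whitespace into a set of words."""
    return set(sentence.lower().split())


def lexical_similarity(a: str, b: str) -> float:
    """Jaccard overlap of exact word matches (assumed lexical measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_related(w1: str, w2: str, dictionary: Dict[str, Set[str]]) -> bool:
    """Two words are related if identical or linked in either direction."""
    return (
        w1 == w2
        or w2 in dictionary.get(w1, set())
        or w1 in dictionary.get(w2, set())
    )


def semantic_similarity(
    a: str, b: str, dictionary: Dict[str, Set[str]] = RELATED
) -> float:
    """Share of words in each sentence that have a related word in the other."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    matched_a = sum(1 for w in ta if any(is_related(w, v, dictionary) for v in tb))
    matched_b = sum(1 for v in tb if any(is_related(v, w, dictionary) for w in ta))
    return (matched_a + matched_b) / (len(ta) + len(tb))


if __name__ == "__main__":
    s1, s2 = "mobil besar", "kendaraan besar"
    print(lexical_similarity(s1, s2))   # exact matches only
    print(semantic_similarity(s1, s2))  # dictionary links also count
```

Either score could then serve as the pairwise similarity in a clustering step. The dictionary-based score is always at least as permissive as exact matching, which may be one reason a semantic approach can trade precision for recall.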
