IEEE/WIC/ACM International Conference on Web Intelligence

Finding Semantic Relationships in Folksonomies



Abstract

In this paper, we study the problem of finding semantic relationships between folksonomy tags. We investigate different methods for embedding tags in a vector space and find similarities between them using word embedding vectors. We also present two new methods for embedding tags in a vector space, utilizing labeled Latent Dirichlet Allocation (LDA) and Wikipedia category links. Related tags are grouped into communities using an overlapping community detection technique. To evaluate the tag embedding methods, we use three evaluation metrics: two do not require a ground truth dataset, and the third is based on a manually created dataset of ground truth communities. Our results show that representing folksonomy tags as bags of words and embedding this representation in the vector space yields better results than embedding co-occurring tags only or embedding tags along with the textual content of tagged documents. We also compare word embedding, Latent Semantic Indexing (LSI), and LDA for finding similarities between the bag-of-words representations of tags. We show that word embedding outperforms LSI in one representation, while LDA is hard to beat.
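To make the idea of comparing bag-of-words tag representations in a vector space more concrete, here is a minimal sketch. It is not the paper's method: the abstract does not say how word vectors are combined, so averaging them is an assumption, and the names `WORD_VECTORS`, `TAG_BAGS`, `embed_tag`, and `tag_similarity`, together with all vector values, are hypothetical toy data chosen for illustration.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, not from the paper).
WORD_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "code": [0.8, 0.2, 0.1],
    "programming": [0.85, 0.15, 0.05],
    "recipe": [0.0, 0.9, 0.3],
    "cooking": [0.1, 0.8, 0.4],
}

# Hypothetical bag-of-words representation for each folksonomy tag.
TAG_BAGS = {
    "dev": ["python", "code", "programming"],
    "software": ["code", "programming"],
    "food": ["recipe", "cooking"],
}


def embed_tag(words):
    """Average the vectors of the known words in a tag's bag of words.

    Averaging is an assumption made for this sketch.
    """
    vectors = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in the bag")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tag_similarity(tag_a, tag_b):
    """Similarity of two tags via their averaged bag-of-words embeddings."""
    return cosine(embed_tag(TAG_BAGS[tag_a]), embed_tag(TAG_BAGS[tag_b]))


if __name__ == "__main__":
    print(tag_similarity("dev", "software"))
    print(tag_similarity("dev", "food"))
```

With this toy data, "dev" and "software" score as more similar than "dev" and "food". A pairwise similarity matrix built this way could then serve as input to a community detection step such as the one the abstract mentions.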
