International Conference on Knowledge Science, Engineering and Management

Frequent Patterns Based Word Network: What Can We Obtain from the Tourism Blogs?

Abstract

In this work, we present a method for extracting information of interest to a specific reader from massive tourism blog data. To this end, we first use a web crawler to obtain blog contents from the web and divide them into semantic word segments. Second, after the necessary data cleaning, we use frequent pattern mining to discover useful frequent 1- and 2-itemsets of words. Third, we visualize all the word correlations as a word network. Finally, we propose a local information search method based on the max-confidence measure that lets blog readers specify a topic word of interest and find the relevant contents. We illustrate the benefits of this approach by applying it to a Chinese online tourism blog dataset.
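
The abstract does not give implementation details, so the following is only a minimal sketch of the mining, network, and search steps under stated assumptions: blog posts are already segmented into words, support is counted as the number of posts containing an itemset, and max-confidence for a word pair (A, B) is taken as max(P(A|B), P(B|A)), which equals sup(A, B) / min(sup(A), sup(B)). All function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support):
    """Return frequent 1-itemsets and 2-itemsets with their supports.

    Assumption: support = number of documents that contain the itemset.
    """
    word_counts = Counter()
    for tokens in docs:
        word_counts.update(set(tokens))
    frequent_words = {w: c for w, c in word_counts.items() if c >= min_support}

    pair_counts = Counter()
    for tokens in docs:
        # Only frequent words can form frequent pairs (Apriori property).
        kept = sorted(w for w in set(tokens) if w in frequent_words)
        pair_counts.update(combinations(kept, 2))
    frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
    return frequent_words, frequent_pairs


def max_confidence(pair, word_support, pair_support):
    """max(P(a|b), P(b|a)), computed as sup(a, b) / min(sup(a), sup(b))."""
    a, b = pair
    return pair_support[pair] / min(word_support[a], word_support[b])


def build_word_network(frequent_words, frequent_pairs):
    """Undirected word network as an adjacency dict weighted by max-confidence."""
    network = {w: {} for w in frequent_words}
    for a, b in frequent_pairs:
        score = max_confidence((a, b), frequent_words, frequent_pairs)
        network[a][b] = score
        network[b][a] = score
    return network


def local_search(network, topic, min_score=0.0):
    """Neighbours of a topic word, strongest max-confidence first."""
    neighbours = network.get(topic, {})
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(w, s) for w, s in ranked if s >= min_score]


# Toy, made-up segmented posts (not data from the paper).
docs = [
    ["beach", "sunset", "seafood"],
    ["beach", "seafood", "hotel"],
    ["temple", "hotel", "sunset"],
    ["beach", "sunset"],
]
words, pairs = frequent_itemsets(docs, min_support=2)
network = build_word_network(words, pairs)
print(local_search(network, "beach"))
```

In this sketch a reader who picks the topic word "beach" gets back its neighbouring words ranked by max-confidence; raising `min_score` narrows the result to the most strongly associated words.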