Journal: International journal of pervasive computing and communications

What is your tweet worldview? Mapping the topic structure of tweets on the Wikipedia

Abstract

Purpose - This paper proposes a method for summarizing the topics of tweets, using the Wikipedia category structure as common knowledge to support the understanding of Twitter users' interests. Tweets cover many topics, and these topics can be treated as a tree structure. However, when the topic hierarchy is built with an existing hierarchical clustering approach, the granularity of the tweet groups differs from user to user. Summarizing topics requires identifying which topics are heterogeneous and which are not, because the result is easier to understand when several groups are collected under parent groups. If the group units differ for each user, the interests of multiple users cannot be summarized. For example, if some tweets are grouped under the presidential election and others under Donald Trump, it is not possible to count how many users are interested in Donald Trump.

Design/methodology/approach - One solution is to construct topic structures by mapping them onto one common tree structure. This paper proposes a method that constructs the topic structure using the Wikipedia category tree as the common tree structure. Tweets are categorized, mapped to the titles of articles in the Wikipedia category tree and then presented to users as a hierarchical structure.

Findings - The effectiveness of the proposed hierarchical topic structure is confirmed. For the theme "politics", the proposed method works well, mainly because Wikipedia categories and articles contain many technical terms about politics, and many of these terms do not have multiple meanings. For the theme "sports", however, the method does not perform well, mainly because many of the topic names are names of people.

Originality/value - One important feature of the proposed method is that it makes it easy to grasp which topics are heterogeneous or homogeneous with each other, and that it also considers the missing time when extracting topics. Another feature is that the topic structures of multiple users are easy to compare with each other.
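
The counting problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PARENT` table is a tiny hand-written tree standing in for the Wikipedia category tree, and the function names, user names and topic assignments are all hypothetical. It only shows why mapping every user's topics onto one shared tree makes per-topic user counts possible.

```python
from collections import defaultdict

# Hypothetical, hand-written fragment of a shared tree (child -> parent).
# A real system would derive this from the Wikipedia category structure.
PARENT = {
    "Donald Trump": "Presidential election",
    "Presidential election": "Politics",
    "Politics": None,
    "Football": "Sports",
    "Sports": None,
}


def path_to_root(node):
    """Return the node followed by all of its ancestors in the shared tree."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def users_per_topic(user_topics):
    """Count, for every tree node, how many users have a topic at or below it."""
    users = defaultdict(set)
    for user, topics in user_topics.items():
        for topic in topics:
            for node in path_to_root(topic):
                users[node].add(user)
    return {node: len(members) for node, members in users.items()}


# Assumed output of an earlier tweet-to-topic mapping step (made-up data).
user_topics = {
    "alice": ["Donald Trump"],
    "bob": ["Presidential election"],
    "carol": ["Football"],
}
counts = users_per_topic(user_topics)
print(counts["Presidential election"])  # 2
print(counts["Donald Trump"])  # 1
```

Because both users' topics sit on the same tree, "alice" and "bob" are counted together under "Presidential election" even though their tweets were grouped at different granularities.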
