A topic analysis approach to revealing discussions on the Australian twittersphere

Abstract

This paper investigates techniques for identifying the topics discussed in one week of tweets from the Australian Twittersphere. Tweets were extracted from a comprehensive dataset that captures all tweets by 2.8 million Australians: the Tracking Infrastructure for Social Media Analysis (TrISMA) (Bruns, Burgess & Banks et al., 2016). Bruns & Moe (2014) suggest that most Twitter research to date has focussed on “the macro layer of Twitter communication” (pp. 23-24), partly because it is methodologically difficult to move beyond this. The TrISMA dataset enables the selection of a dataset based on a date range, rather than being limited to keywords or hashtags. As a result, the extracted one-week dataset of 5.5 million tweets is not focussed on a particular topic, and it contains tweets from all three layers of Twitter communication defined by Bruns & Moe (2014), not just predominantly from the macro level of hashtag conversations. This study seeks to identify the themes present in this dataset using Latent Dirichlet Allocation (LDA) (Blei, Ng, and Jordan, 2003).

The results of the topic analysis are triangulated with the themes found by other types of analysis, carried out as part of a wider methodological study that determines other metrics for the same week. The ability to identify the themes present in a dataset has many applications, including identifying changes in themes over time, extracting subsets of the corpus for further study, and understanding the diversity of themes present.
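To make the LDA step more concrete, the following is a minimal sketch of topic inference with a collapsed Gibbs sampler, written in plain Python. It is not the paper's pipeline: the abstract does not say which LDA implementation, preprocessing, or hyperparameters were used. The toy tweets, the stopword list, the function names `tokenize` and `lda_gibbs`, and the values of `num_topics`, `alpha`, and `beta` are all assumptions chosen for illustration.

```python
import random
import re

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "and", "for", "with", "this", "that", "was", "are"}


def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, top_words), where
    doc_topic[d][k] counts the tokens of document d assigned to topic k and
    top_words[k] lists up to five of the most frequent words in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional weight of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    top_words = [
        [vocab[i] for i in sorted(range(vocab_size), key=lambda i: -topic_word[k][i])[:5]]
        for k in range(num_topics)
    ]
    return doc_topic, top_words


# Invented example tweets, standing in for a date-range extract.
tweets = [
    "Great win for the team in the footy final tonight",
    "That footy final was a thriller, what a win",
    "Election debate tonight, leaders clash over the budget",
    "Budget and election promises dominate the debate",
]
docs = [tokenize(t) for t in tweets]
doc_topic, top_words = lda_gibbs(docs, num_topics=2)

for k, words in enumerate(top_words):
    print(f"topic {k}: {', '.join(words)}")
```

On a corpus this small the topics are unstable and depend on the random seed. A dataset on the scale of 5.5 million tweets would normally call for an optimised library implementation rather than a pure-Python sampler like this one.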

Bibliographic record

  • Author

    Moon, Brenda

  • Year: 2016
  • Original format: PDF
