
The Story of Goldilocks and Three Twitter's APIs: A Pilot Study on Twitter Data Sources and Disclosure



Abstract

Public health and social science researchers increasingly use Twitter for behavioral and marketing surveillance. However, few studies describe their Twitter data collection in enough detail either to allow direct comparisons between studies or to support replication. Twitter data are available through three primary application programming interfaces (APIs): Streaming, Search, and Firehose. To date, no clear guidance exists about the advantages and limitations of each API, or about how comparable the tweets retrieved from each API are in amount, content, and user accounts. Such information is crucial to the validity, interpretation, and replicability of research findings. This study examines whether tweets collected using the same search filters over the same time period, but through different APIs, yield comparable datasets. We collected tweets about anti-smoking, e-cigarettes, and tobacco using all three APIs. The retrieved tweets largely overlapped across the three APIs, but each API also retrieved unique tweets, and the extent of overlap varied over time and by topic, resulting in different trends and potentially supporting diverging inferences. Researchers need to understand how different data sources can influence the amount, content, and user accounts of the data they retrieve from social media, in order to assess the implications of their choice of data source.
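The kind of comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes each API's collection has already been reduced to a set of tweet IDs, and the function name `overlap_report`, the source labels, and the toy IDs are all hypothetical. Jaccard similarity is used here only as one convenient overlap measure; the abstract does not say which measure the study used.

```python
from itertools import combinations


def overlap_report(samples):
    """Summarize overlap between tweet-ID collections.

    `samples` maps a source label (e.g. "streaming") to an iterable of
    tweet IDs. Assumes at least one source is given.
    """
    sets = {name: set(ids) for name, ids in samples.items()}

    # Tweets retrieved by every source.
    shared = set.intersection(*sets.values())

    report = {"shared_by_all": len(shared), "unique": {}, "jaccard": {}}

    # Tweets retrieved by exactly one source.
    for name, ids in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        report["unique"][name] = len(ids - others)

    # Pairwise Jaccard similarity: |A & B| / |A | B|.
    for a, b in combinations(sorted(sets), 2):
        union = sets[a] | sets[b]
        report["jaccard"][(a, b)] = len(sets[a] & sets[b]) / len(union) if union else 0.0

    return report


# Toy IDs for illustration only; not data from the study.
toy = {
    "streaming": [1, 2, 3, 4],
    "search": [2, 3, 4, 5],
    "firehose": [1, 2, 3, 4, 5, 6],
}
report = overlap_report(toy)
```

Running the same report separately for each topic and each time window would show whether overlap varies over time and by topic, which is the pattern the abstract reports.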
