Journal: International Journal of Environmental Research and Public Health

The Story of Goldilocks and Three Twitter’s APIs: A Pilot Study on Twitter Data Sources and Disclosure


Abstract

Public health and social science research increasingly uses Twitter for behavioral and marketing surveillance. However, few studies describe their Twitter data collection in enough detail to allow direct comparison between studies or to support replication. Twitter data are available through three primary application programming interfaces (APIs): Streaming, Search, and Firehose. To date, no clear guidance exists on the advantages and limitations of each API, or on how comparable the tweets retrieved from each API are in amount, content, and user accounts. Such information is crucial to the validity, interpretation, and replicability of research findings. This study examines whether tweets collected with the same search filters over the same time period, but through different APIs, yield comparable datasets. We collected tweets about anti-smoking, e-cigarettes, and tobacco using all three APIs. The retrieved tweets largely overlapped across the three APIs, but each API also retrieved unique tweets, and the extent of overlap varied over time and by topic, resulting in different trends and potentially supporting diverging inferences. Researchers need to understand how the choice of data source can influence the amount, content, and user accounts of the data they retrieve from social media in order to assess the implications of that choice.
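
The kind of comparison the abstract describes (how much the tweets retrieved by each API overlap, and how many are unique to one API) can be illustrated with a minimal sketch. This is not the authors' code: the function name `overlap_report`, the use of Jaccard similarity as the overlap measure, and the tweet IDs are all assumptions made for illustration.

```python
from itertools import combinations


def overlap_report(sources):
    """Summarise overlap between collections of tweet IDs.

    `sources` maps a label (here an API name) to a set of tweet IDs.
    Hypothetical helper for illustration; the study's actual measures
    are not stated in the abstract.
    """
    all_ids = set().union(*sources.values())
    shared_by_all = set.intersection(*sources.values())
    report = {
        "total_distinct": len(all_ids),
        "shared_by_all": len(shared_by_all),
        "unique": {},
        "pairwise_jaccard": {},
    }
    # Tweets retrieved by exactly one source.
    for name, ids in sources.items():
        others = set().union(*(s for n, s in sources.items() if n != name))
        report["unique"][name] = len(ids - others)
    # Jaccard similarity |A ∩ B| / |A ∪ B| for each pair of sources.
    for a, b in combinations(sources, 2):
        union = sources[a] | sources[b]
        shared = sources[a] & sources[b]
        report["pairwise_jaccard"][(a, b)] = len(shared) / len(union) if union else 0.0
    return report


# Made-up tweet IDs, not data from the study.
sample = {
    "streaming": {1, 2, 3, 4},
    "search": {2, 3, 4, 5},
    "firehose": {1, 2, 3, 4, 5, 6},
}
report = overlap_report(sample)
print(report)
```

Running the same report separately per topic and per time window would show whether overlap varies over time and by topic, which is the pattern the abstract reports.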
