Using the Wayback Machine to Mine Websites in the Social Sciences: A Methodological Resource

Sanjay K. Arora; Yin Li; Jan Youtie; Philip Shapira

Abstract

Websites offer an unobtrusive data source for developing and analyzing information about various types of social science phenomena. In this paper, we provide a methodological resource for social scientists looking to expand their toolkit using unstructured web-based text, and in particular, with the Wayback Machine, to access historical website data. After providing a literature review of existing research that uses the Wayback Machine, we put forward a step-by-step description of how the analyst can design a research project using archived websites. We draw on the example of a project that analyzes indicators of innovation activities and strategies in 300 U.S. small- and medium-sized enterprises in green goods industries. We present six steps to access historical Wayback website data: (a) sampling, (b) organizing and defining the boundaries of the web crawl, (c) crawling, (d) website variable operationalization, (e) integration with other data sources, and (f) analysis. Although our examples draw on specific types of firms in green goods industries, the method can be generalized to other areas of research. In discussing the limitations and benefits of using the Wayback Machine, we note that both machine and human effort are essential to developing a high-quality data set from archived web information.
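The abstract does not say which tools the authors used for the crawling step, so the following is only a minimal sketch of how step (c) might begin, assuming the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) is used to find the archived snapshot closest to a target date. The function names and the `example.com` domain are hypothetical and are not taken from the paper.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Internet Archive's public availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(site, timestamp):
    """Build a query for the snapshot of `site` closest to `timestamp` (YYYYMMDD)."""
    query = urlencode({"url": site, "timestamp": timestamp})
    return AVAILABILITY_ENDPOINT + "?" + query


def parse_closest_snapshot(payload):
    """Return (snapshot_url, timestamp) from a decoded JSON response.

    Returns None when the response reports no archived snapshot.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def fetch_closest_snapshot(site, timestamp):
    """Query the endpoint over the network and parse the reply (hypothetical helper)."""
    with urlopen(build_availability_url(site, timestamp), timeout=30) as response:
        return parse_closest_snapshot(json.load(response))
```

In a project like the one described, a function such as `fetch_closest_snapshot` would be called once per sampled firm website and per year of interest, and sites that return `None` would be flagged for manual review, in line with the abstract's point that both machine and human effort are needed.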

Bibliographic details

Source
Journal of the American Society for Information Science, 2016, Issue 8, pp. 1904-1915 (12 pages)
Authors
Sanjay K. Arora; Yin Li; Jan Youtie; Philip Shapira
Author affiliations

School of Public Policy, Georgia Institute of Technology, Atlanta, GA 30332-0345;

School of Public Policy, Georgia Institute of Technology, Atlanta, GA 30332-0345;

Enterprise Innovation Institute, Georgia Institute of Technology, Atlanta, GA 30308;

Manchester Institute of Innovation Research, Manchester Business School, University of Manchester, Manchester, M13 9PL, UK;

Original format: PDF
Language: English
