Venue: 19th International World Wide Web Conference 2010

Access: News and Blog Analysis for the Social Sciences


Abstract

The social sciences strive to understand the political, social, and cultural world around us, but have been impaired by limited access to the quantitative data sources enjoyed by the hard sciences. Careful analysis of Web document streams holds enormous potential to solve longstanding problems in a variety of social science disciplines through massive data analysis.

This paper introduces the TextMap Access system, which provides ready access to a wealth of interesting statistics on millions of people, places, and things across a number of interesting web corpora. Powered by a flexible and scalable distributed statistics computation framework using Hadoop, continually updated corpora include newspapers, blogs, patent records, legal documents, and scientific abstracts; well over a terabyte of raw text and growing daily. The Lydia TextMap Access system, available through http://www.textmap.com/access, provides instant access for students and scholars through a convenient web user-interface. We describe the architecture of the TextMap Access system, and its impact on current research in political science, sociology, and business/marketing.
