Journal: Sustainability

Using Web Crawler Technology for Geo-Events Analysis: A Case Study of the Huangyan Island Incident


Abstract

Social networking and network socialization bring abundant text information and social relationships into our daily lives. Making full use of these data in the big data era is of great significance for better understanding the changing world and the information-based society. Although politics has been closely bound up with the hyperlinked world since the 1990s, text analysis and data visualization of geo-events have been bottlenecked by traditional manual analysis. Automatic assembly of different geospatial web and distributed geospatial information systems using service chaining has been explored and built recently, but data mining and information collection are still not comprehensive enough because of the sensitivity, complexity, relativity, timeliness, and unpredictability of political events. Using web crawler technology based on the Heritrix framework, together with analysis of web-based text, this study examined the word frequency, sentiment tendency, and dissemination path of the Huangyan Island incident. The results indicate that the tag cloud, frequency map, attitude pie chart, individual mention ratios, and dissemination flow graph derived from the crawled information and data processing not only highlight the characteristics of the geo-event itself, but also reveal many interesting phenomena and deep-seated problems behind it, such as related topics, theme vocabularies, subject contents, hot countries, event bodies, opinion leaders, high-frequency vocabularies, information sources, semantic structure, propagation paths, the distribution of different attitudes, and regional differences in netizens' responses to the Huangyan Island incident. Furthermore, text analysis of network information with the help of a focused web crawler is able to express the time-space relationship of the crawled information and the semantic-network characteristics of the information about geo-events. Therefore, it is a useful tool for collecting information to understand the formation and diffusion of web-based public opinion in political events.
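To make the word-frequency step mentioned in the abstract more concrete, the following is a minimal sketch of how term counts for a tag cloud might be computed from crawled page text. It is not the authors' pipeline: the study used Heritrix for crawling, whereas this illustration uses only the Python standard library, and the function name `word_frequencies`, the stop-word list, and the sample snippets are all hypothetical placeholders. It also assumes English text split on letters; Chinese-language pages would need a word-segmentation step first.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are"}


def word_frequencies(documents, top_n=5):
    """Count how often each non-stop-word appears across crawled texts."""
    counts = Counter()
    for text in documents:
        # Lowercase, then keep runs of ASCII letters as tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    # Return the top_n (term, count) pairs, highest count first.
    return counts.most_common(top_n)


# Placeholder snippets standing in for crawled page text (not real data).
sample_pages = [
    "The island dispute is in the news",
    "News on the island and the dispute",
    "Island news",
]

top_terms = word_frequencies(sample_pages, top_n=3)
print(top_terms)
```

The resulting (term, count) pairs are the kind of input a tag cloud or frequency map would be drawn from; sentiment tendency and dissemination paths would need additional data (labelled attitudes, source links, and timestamps) that this sketch does not model.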
