Journal: Natural Hazards

Real-time identification of urban rainstorm waterlogging disasters based on Weibo big data



Abstract

With the acceleration of urbanisation in China, preventing and reducing the economic losses and casualties caused by urban rainstorm waterlogging disasters has become a critical and difficult issue for the government. Because urban rainstorms are sudden, clustered, and continuous and cause huge economic losses, emergency management is difficult. Developing a more scientific method for real-time disaster identification will help reduce losses over time. Mining social media big data is a feasible way to obtain on-site disaster information and carry out disaster risk assessments. This paper presents a real-time identification method for urban rainstorm disasters based on Weibo data. Taking the heavy rainstorm of June 2016 in Nanjing as an example, the collected Weibo data are split into a training set and a testing set at a ratio of 8:2. The texts are pre-processed with the Jieba module for Chinese word segmentation. The term frequency-inverse document frequency (TF-IDF) method is then used to calculate feature weights and extract features, and a hashing algorithm is introduced to handle the resulting high-dimensional sparse feature matrix. Finally, naive Bayes, support vector machine, and random forest text classification algorithms are used to train models, and the test set is used to evaluate them and select the optimal classifier. The experiments show that the naive Bayes algorithm achieves the highest macro-averaged accuracy.
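
The pipeline summarised in the abstract (Jieba segmentation, TF-IDF weighting with feature hashing, and a comparison of naive Bayes, SVM, and random forest classifiers on an 8:2 split judged by macro-averaged metrics) can be sketched roughly as follows. This is a minimal illustration only, not the authors' code: it assumes a Python/scikit-learn setting, uses HashingVectorizer plus TfidfTransformer to stand in for the hashing and TF-IDF steps, and the sample texts, labels, and hyper-parameters are invented placeholders.

# Rough sketch of the classification pipeline described in the abstract.
# All data and settings below are illustrative assumptions.
import jieba
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy Weibo-style posts (placeholders): 1 = waterlogging-related, 0 = unrelated.
texts = [
    "秦淮区积水严重，车辆无法通行",
    "今天南京暴雨，小区门口全是水",
    "地铁站进水了，大家注意绕行",
    "周末打算去看电影",
    "中午吃了鸭血粉丝汤",
    "新手机终于到货了",
]
labels = [1, 1, 1, 0, 0, 0]

# 80% training / 20% testing, matching the 8:2 split in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels)

classifiers = {
    "naive Bayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

for name, clf in classifiers.items():
    model = make_pipeline(
        # Feature hashing keeps the high-dimensional sparse term matrix tractable;
        # alternate_sign=False keeps counts non-negative for multinomial naive Bayes.
        HashingVectorizer(tokenizer=jieba.lcut, alternate_sign=False,
                          n_features=2 ** 18),
        TfidfTransformer(),  # TF-IDF weighting of the hashed term counts
        clf,
    )
    model.fit(X_train, y_train)
    # Macro-averaged precision/recall/F1 as the model-selection criterion.
    print(name)
    print(classification_report(y_test, model.predict(X_test), zero_division=0))

The macro-averaged scores reported by classification_report weight both classes equally, mirroring the macro-average criterion mentioned in the abstract for choosing the best classifier.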


