Journal: Sensors (Basel, Switzerland)

Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning

Abstract

Digital societies can be characterized by their increasing desire to express themselves and interact with others. This desire is realized through digital platforms such as social media, which have become convenient and inexpensive sensors compared with physical sensors in many sectors of smart societies. One such major sector is road transportation, which is the backbone of modern economies and costs 1.25 million deaths and 50 million injuries globally each year. The state of the art in big data-enabled social media analytics for transportation-related studies is limited. This paper brings a range of technologies together to detect road traffic-related events using big data and distributed machine learning. The most specific contribution of this research is an automatic labelling method for machine learning-based traffic-related event detection from Twitter data in the Arabic language. The proposed method has been implemented in a software tool called Iktishaf+ (an Arabic word meaning discovery) that detects traffic events automatically from Arabic-language tweets using distributed machine learning over Apache Spark. The tool is built from nine components and a range of technologies including Apache Spark, Parquet, and MongoDB. Iktishaf+ uses a light stemmer for the Arabic language and a location extractor, both developed by us; the location extractor allows us to extract and visualize spatio-temporal information about the detected events. The data used in this work comprise 33.5 million tweets collected from Saudi Arabia using the Twitter API. Using support vector machine, naïve Bayes, and logistic regression classifiers, we detect and validate several real events in Saudi Arabia without prior knowledge, including a fire in Jeddah, rains in Makkah, and an accident in Riyadh. The findings show the effectiveness of Twitter media in detecting important events with no prior knowledge about them.
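As a rough illustration of the idea of automatic labelling followed by classification, the sketch below labels tweets with a keyword dictionary and then trains a small naïve Bayes classifier on those labels. It is not the authors' method: the abstract does not describe the labelling rules, so the dictionary `EVENT_KEYWORDS`, the function names, and the sample tweets are all hypothetical, and the code runs on a single machine with the Python standard library rather than on Apache Spark.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword dictionary (assumption): the paper's actual
# labelling rules are not given in the abstract.
EVENT_KEYWORDS = {
    "accident": {"حادث", "تصادم"},
    "fire": {"حريق"},
    "rain": {"مطر", "أمطار"},
}


def auto_label(tweet):
    """Return the first event type whose keywords occur in the tweet, else None."""
    tokens = set(tweet.split())
    for label, keywords in EVENT_KEYWORDS.items():
        if tokens & keywords:
            return label
    return None


def train_naive_bayes(labelled):
    """Collect per-class document and word counts for multinomial naive Bayes."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled:
        class_counts[label] += 1
        for tok in text.split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.split():
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up sample tweets (assumption), whitespace-tokenized for simplicity.
tweets = [
    "حادث على طريق الرياض",
    "حريق كبير في جدة",
    "أمطار غزيرة في مكة",
    "صباح الخير",
]

# Keep only tweets that received an automatic label, then train on them.
labelled = [(t, auto_label(t)) for t in tweets if auto_label(t) is not None]
model = train_naive_bayes(labelled)

# This tweet contains no dictionary keyword, so only the classifier can label it.
print(predict(model, "زحام على طريق الرياض"))
```

The point of the two stages is that the dictionary only needs to cover obvious cases; the classifier trained on those automatically labelled tweets can then assign a label to tweets that contain none of the keywords. A real pipeline would also need stemming and normalization of the Arabic text, which the abstract mentions but this sketch omits.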
