IEEE International Conference on Big Data

All in a twitter: Self-tuning strategies for a deeper understanding of a crisis tweet collection

Abstract

Natural disasters have become more frequent over the past 20 years as a result of significant climate change. These events are hotly debated on social networks such as Twitter, where huge numbers of short text messages are exchanged continuously and promptly, carrying personal opinions and descriptions of the events and their consequences. Analyzing these large and complex data could help policy-makers better understand an event and set priorities. However, correctly configuring the tweet mining process remains challenging because data distributions vary and many algorithms are available, each with its own parameters. Analysts need to run a large number of experiments to identify the best configuration for the overall knowledge discovery process. Innovative, scalable, and parameter-free solutions need to be explored to streamline the analytics process. This paper presents an enhanced version of PASTA (a distributed self-tuning engine) applied to a crisis tweet collection, grouping a corpus of tweets into cohesive and well-separated clusters with minimal analyst intervention. Experiments on real data collected during natural disasters show that PASTA effectively discovers interesting groups of correlated tweets without requiring the analyst to select either the algorithms or their parameters.
