Conference: HCT Information Technology Trends

Chronological Word Frequency Analysis of the 2017 Iran-Iraq Earthquake


Abstract

This paper aims to understand the chronological communication behavior of Twitter users immediately after the 2017 Iran-Iraq earthquake by examining word frequencies and the overall sentiment of their tweets. A total of 20,000 tweets were collected over three time periods, and all retweets were removed before analysis. Inspecting word frequencies is a common task in text mining: the higher the frequency of a word in a time period, the more important that word is for understanding the tweets in that period. To quantify the similarity of the sets of word frequencies across time periods, Pearson's product-moment correlation was calculated. The distribution of the word frequencies was plotted for each time period. To measure how important a word is in a tweet in a particular time period, the term frequency-inverse document frequency (tf-idf) was calculated. The findings show that the plotted distributions are typical of a corpus following Zipf's law.
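
The steps named in the abstract (per-period word counts with retweets removed, Pearson's product-moment correlation between frequency sets, tf-idf, and a rank-frequency view for checking Zipf's law) can be sketched as below. This is a minimal illustration and not the authors' pipeline: the abstract does not specify tokenization, how retweets were detected, or what counts as a "document" for tf-idf. The sketch assumes a simple regex tokenizer, treats tweets starting with "RT @" as retweets, and treats each time period as one document. The function names and the toy tweets are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tweets):
    """Count words over one time period, skipping retweets.

    Assumption: a retweet is any tweet whose text starts with "RT @".
    """
    counts = Counter()
    for tweet in tweets:
        if tweet.startswith("RT @"):
            continue
        counts.update(tokenize(tweet))
    return counts


def pearson(freq_a, freq_b):
    """Pearson's product-moment correlation over the union vocabulary.

    Words missing from one period count as 0 there (Counter default).
    """
    vocab = sorted(set(freq_a) | set(freq_b))
    xs = [freq_a[w] for w in vocab]
    ys = [freq_b[w] for w in vocab]
    n = len(vocab)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant counts
    return cov / math.sqrt(var_x * var_y)


def tf_idf(period_freqs):
    """tf-idf per word, treating each time period as one document (assumption)."""
    n_docs = len(period_freqs)
    doc_freq = Counter()
    for freq in period_freqs.values():
        doc_freq.update(freq.keys())  # number of periods containing each word
    scores = {}
    for period, freq in period_freqs.items():
        total = sum(freq.values())
        scores[period] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in freq.items()
        }
    return scores


def rank_frequency(freq):
    """Frequencies sorted from most to least common, for a log-log Zipf plot."""
    return [count for _, count in freq.most_common()]


# Toy data, made up for illustration only (not from the paper's dataset).
periods = {
    "period_1": [
        "Earthquake hits the border region",
        "RT @user: Earthquake hits the border region",
        "strong earthquake felt",
    ],
    "period_2": ["rescue teams reach the border", "earthquake rescue continues"],
    "period_3": ["donations for earthquake survivors"],
}

freqs = {name: word_frequencies(tweets) for name, tweets in periods.items()}
scores = tf_idf(freqs)
similarity = pearson(freqs["period_1"], freqs["period_2"])
print(freqs["period_1"].most_common(3))
print(similarity)
```

A word that appears in every period, such as "earthquake" in the toy data, gets a tf-idf score of zero under this definition, which is why tf-idf highlights the words that distinguish one period from the others. Plotting `rank_frequency` output against rank on log-log axes gives the roughly straight line expected of a corpus following Zipf's law.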
