Journal: Arabian Journal for Science and Engineering. Section A, Sciences

Intelligent Analysis of Arabic Tweets for Detection of Suspicious Messages



Abstract

With the widespread use of messaging via social networks such as Twitter, Instagram, and Facebook, it is becoming imperative for researchers to devise intelligent systems for data analytics across domains such as business, health, communication, and security. The complex morphological and syntactic structure of Arabic sentences makes them difficult to analyze. This paper presents an intelligent system that analyzes Arabic tweets to detect suspicious messages. We acquired Arabic tweet data from the micro-blogging social network Twitter via the Twitter Streaming Application Programming Interface (API) and saved it in the required file format. The system tokenizes and preprocesses the tweet dataset. The tweets are manually labeled as suspicious (label 1) or not suspicious (label 0). The labeled dataset is used to train a classifier with supervised machine learning algorithms for the detection of suspicious activities. During the testing phase, the system processes unlabeled tweet data and determines whether each tweet belongs to the suspicious or the not-suspicious class. We tested the system using six supervised machine learning algorithms: (1) decision tree, (2) k-nearest neighbors, (3) linear discriminant algorithm, (4) support vector machine (SVM), (5) artificial neural networks (ANN), and (6) long short-term memory networks. A comparative analysis of the six classifiers in terms of accuracy, execution time, and confusion matrices is presented. Of the six classifiers, the ANN has the lowest execution speed. In terms of predicting correct results, the SVM performs best among all the classifiers, with a mean accuracy of 86.72%.
The major outcomes of this work are the development of a labeled dataset of Arabic tweets, an intelligent behavior analysis of tweets using six machine learning algorithms to detect suspicious messages, a comparative analysis of those six algorithms, and the development of a statistical benchmark that can be used in future studies on the detection of crimes on social media.
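
The tokenizing step and the accuracy and confusion-matrix comparison mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the actual preprocessing rules, so the cleanup steps below (dropping URLs, @mentions, Arabic diacritics, the tatweel character, and punctuation) are assumptions, and the function names `tokenize`, `confusion_matrix`, and `accuracy` are hypothetical. Only the label convention (1 = suspicious, 0 = not suspicious) comes from the abstract.

```python
import re
from collections import Counter

# Assumed cleanup rules; the abstract does not specify the real preprocessing.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
# Arabic short-vowel marks (U+064B to U+0652) and the tatweel elongation mark (U+0640).
DIACRITICS_AND_TATWEEL = re.compile(r"[\u064B-\u0652\u0640]")
NON_WORD = re.compile(r"[^\w\s]")


def tokenize(tweet):
    """Remove URLs, mentions, diacritics, and punctuation, then split on whitespace."""
    text = URL_OR_MENTION.sub(" ", tweet)
    text = DIACRITICS_AND_TATWEEL.sub("", text)
    text = NON_WORD.sub(" ", text)
    return text.split()


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp), where label 1 = suspicious and label 0 = not suspicious."""
    counts = Counter(zip(y_true, y_pred))
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted label matches the manual label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    total = tn + fp + fn + tp
    return (tn + tp) / total if total else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration only; these are not results from the paper.
    manual_labels = [1, 0, 1, 1, 0, 0]
    predicted_labels = [1, 0, 0, 1, 1, 0]
    print(confusion_matrix(manual_labels, predicted_labels))
    print(accuracy(manual_labels, predicted_labels))
```

Each of the six classifiers would be scored against the same manually labeled test set in this way, so that their confusion matrices and accuracies are directly comparable.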
