Conference: International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies

Detection of Arabic Spam Tweets Using Word Embedding and Machine Learning

Abstract

Twitter has become one of the most popular social networking platforms for sharing activities and opinions. In this study, we explore applying word-embedding-based features with machine-learning techniques to detect Arabic spam tweets. In addition, we analyze how the text domain of the corpus used to learn the word embeddings affects the results. The approach is evaluated on a publicly available dataset of 3503 tweets with three popular binary classifiers: Naïve Bayes, decision trees, and SVM. The experimental results reveal that the proposed method outperforms the baseline approach in distinguishing between machine-generated and human-generated tweets. An accuracy of 87.33% is achieved using the skip-gram word2vec technique with SVM.
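The abstract does not say how word embeddings are turned into tweet-level features or which SVM implementation was used, so the following is only a minimal sketch under stated assumptions. It averages the vectors of a tweet's known tokens (an assumed pooling choice) and trains a small linear SVM by sub-gradient descent on the hinge loss. The embedding table `TOY_EMB`, the English tokens, and all function names are hypothetical stand-ins; a real pipeline would load skip-gram vectors trained on an Arabic corpus.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Hypothetical hand-made 2-d embeddings; NOT trained skip-gram vectors.
TOY_EMB: Dict[str, Vector] = {
    "free": [1.0, 0.1], "win": [0.9, 0.2], "click": [0.8, 0.0],
    "lunch": [0.1, 0.9], "friends": [0.0, 1.0], "today": [0.2, 0.8],
}


def tweet_vector(tokens: Sequence[str], emb: Dict[str, Vector], dim: int) -> Vector:
    """Average the embeddings of in-vocabulary tokens (assumed pooling)."""
    known = [emb[t] for t in tokens if t in emb]
    if not known:
        return [0.0] * dim  # no known token: fall back to the zero vector
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(X: List[Vector], y: List[int], epochs: int = 200,
                     lr: float = 0.1, lam: float = 0.01) -> Tuple[Vector, float]:
    """Sub-gradient descent on the L2-regularised hinge loss; labels are +1/-1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1.0:  # point violates the margin: move towards it
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(x: Vector, w: Vector, b: float) -> int:
    """Return +1 (spam) or -1 (not spam)."""
    return 1 if dot(w, x) + b >= 0 else -1


def demo() -> List[int]:
    """Train on four toy tweets and classify two unseen ones."""
    train = [(["win", "free", "click"], 1), (["click", "free"], 1),
             (["lunch", "friends", "today"], -1), (["friends", "today"], -1)]
    X = [tweet_vector(tokens, TOY_EMB, 2) for tokens, _ in train]
    y = [label for _, label in train]
    w, b = train_linear_svm(X, y)
    unseen = [["free", "win"], ["lunch", "today"]]
    return [predict(tweet_vector(t, TOY_EMB, 2), w, b) for t in unseen]
```

Calling `demo()` returns one label per unseen tweet, with +1 meaning spam. Averaging discards word order, which keeps the sketch short. The paper may pool embeddings differently.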
