Detection of Arabic Spam Tweets Using Word Embedding and Machine Learning

Abstract

Twitter has become one of the most popular social networking platforms for sharing activities and opinions. In this study, we explore applying word-embedding-based features with machine-learning techniques to detect Arabic spam tweets. In addition, we analyze how the text domain of the corpus used to learn the word embeddings affects the results. The approach is evaluated on a publicly available dataset of 3503 tweets with three popular binary classifiers: Naïve Bayes, decision trees, and SVM. The experimental results reveal that the proposed method outperforms the baseline approach in distinguishing between machine-generated and human-generated tweets. An accuracy of 87.33% is achieved using the skip-gram word2vec technique with SVM.
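The general idea in the abstract (turn each tweet into a fixed-length vector from word embeddings, then train a binary classifier on those vectors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-dimensional `TOY_EMBEDDINGS` table stands in for vectors that a skip-gram word2vec model would learn, English tokens stand in for Arabic ones, averaging is one common way to pool word vectors, and the small linear SVM trained by sub-gradient descent on the hinge loss stands in for whatever SVM implementation the authors used.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Hypothetical 2-d embeddings; a real system would load vectors learned by
# a skip-gram word2vec model from a large tweet corpus.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "win": [0.9, 0.1], "free": [0.8, 0.2], "prize": [0.9, 0.0],
    "click": [0.7, 0.1], "lunch": [0.1, 0.9], "friends": [0.0, 0.8],
    "weather": [0.2, 0.9], "today": [0.1, 0.7],
}
DIM = 2


def tweet_vector(tokens: Sequence[str], embeddings: Dict[str, Vector],
                 dim: int) -> Vector:
    """Average the embeddings of known tokens; all zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def score(w: Vector, b: float, x: Vector) -> float:
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(X: Sequence[Vector], y: Sequence[int], epochs: int = 200,
                     lr: float = 0.1, reg: float = 0.01) -> Tuple[Vector, float]:
    """Sub-gradient descent on the L2-regularised hinge loss (labels +1/-1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * score(w, b, x)
            # Regularisation step: shrink the weights slightly.
            w = [wi - lr * reg * wi for wi in w]
            # Hinge step: only examples inside the margin move the model.
            if margin < 1:
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b


def predict(w: Vector, b: float, x: Vector) -> int:
    """Return +1 (spam) or -1 (not spam)."""
    return 1 if score(w, b, x) >= 0 else -1


# Toy training set: +1 = spam, -1 = not spam.
train_tweets = [
    (["win", "free", "prize"], 1),
    (["click", "free", "prize"], 1),
    (["lunch", "friends", "today"], -1),
    (["weather", "today"], -1),
]
X_train = [tweet_vector(toks, TOY_EMBEDDINGS, DIM) for toks, _ in train_tweets]
y_train = [label for _, label in train_tweets]
w, b = train_linear_svm(X_train, y_train)

for tokens in (["free", "prize", "click"], ["lunch", "weather"]):
    label = predict(w, b, tweet_vector(tokens, TOY_EMBEDDINGS, DIM))
    print(tokens, "->", "spam" if label == 1 else "not spam")
```

In practice one would more likely train embeddings with a library such as gensim (its `Word2Vec` class selects skip-gram with `sg=1`) and classify with scikit-learn's `SVC`; the pure-Python version above only keeps the sketch dependency-free.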

Bibliographic details

Source
International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies, 2018, pp. 1-5 (5 pages)
Conference location: Sakhier (BH)
Authors
Sadam Al-Azani; El-Sayed M. El-Alfy;
Author affiliations

Department of Information and Computer Science King Fahd University of Petroleum and Minerals Dhahran 31261 Kingdom of Saudi Arabia;

Department of Information and Computer Science King Fahd University of Petroleum and Minerals Dhahran 31261 Kingdom o;

Original format: PDF
Keywords
Support vector machines; Feature extraction; Twitter; Microsoft Windows; Internet; Technological innovation; Informatics;

Date indexed: 2022-08-26 14:35:52

Similar literature


1. Enhancing Contextualised Language Models with Static Character and Word Embeddings for Emotional Intensity and Sentiment Strength Detection in Arabic Tweets [J] . Abdullah I. Alharbi, Phillip Smith, Mark Lee Procedia Computer Science . 2021, issue a

2. A Deep Learning Framework for Automatic Detection of Hate Speech Embedded in Arabic Tweets [J] . Rehab Duwairi, Amena Hayajneh, Muhannad Quwaider Arabian Journal for Science and Engineering . 2021, issue 4

3. A Performance Evaluation of Machine Learning-Based Streaming Spam Tweets Detection [J] . Chao Chen, Jun Zhang, Yi Xie, Computational Social Systems, IEEE Transactions on . 2015, issue 3

4. Detection of Arabic Spam Tweets Using Word Embedding and Machine Learning [C] . Sadam Al-Azani, El-Sayed M. El-Alfy International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies . 2018

5. SMS Spam Detection Framework Using Machine Learning Algorithms [D] . Sudanagunta, Saipriya. 2020

6. Developing Electron Microscopy Tools for Profiling Plasma Lipoproteins Using Methyl Cellulose Embedment Machine Learning and Immunodetection of Apolipoprotein B and Apolipoprotein(a) [O] . Yvonne Giesecke, Samuel Soete, Katarzyna MacKinnon, 2020

7. Corrections to "A performance evaluation of machine learning based streaming spam tweets detection" [O] . Chen Chao, Zhang Jun, Xie Yi, 2016

