
Deep-learnt features for Twitter spam detection



Abstract

Twitter spam has become one of the most critical problems in recent years. Despite the efforts of researchers and security companies, the volume of spam continues to grow. Machine learning is widely used in network security, including for spam detection. An important step in applying machine learning to Twitter spam detection is feature engineering. Existing works mainly use URL-based, metadata-based, and social-relation-based features to detect spam tweets. All of these approaches require human effort to extract features. More recently, deep learning has shown promise for automated feature engineering, extracting features directly from text. In this paper, we propose a new feature engineering mechanism based on a deep neural network trained using Bi-LSTM. We name the extracted features “deep-learnt features”. We compare our feature set with word2vec features and statistical features in the experimental evaluation. The results show that machine learning models trained using deep-learnt features detect Twitter spam more accurately than models trained using word2vec features or statistical features.
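The abstract does not give the network's architecture or training details, so the following is only a minimal sketch of the general idea: run a tweet's word embeddings through a forward and a backward LSTM and concatenate the two final hidden states into a fixed-length feature vector. Everything here is an assumption made for illustration, including the names `LSTMCell`, `run`, and `deep_features`, the toy vocabulary, and the dimensions. The weights are random and untrained; in the paper the network is trained, and the resulting features are then passed to machine learning classifiers.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A single LSTM cell with random, UNTRAINED weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state

        def gate(g, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[g], self.b[g])]

        i = gate("i", sigmoid)
        f = gate("f", sigmoid)
        o = gate("o", sigmoid)
        cand = gate("c", math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run(cell, seq):
    """Feed a sequence of vectors through the cell; return the final hidden state."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in seq:
        h, c = cell.step(x, h, c)
    return h


def deep_features(tokens, embed, fwd, bwd):
    """Concatenate the final forward and backward hidden states."""
    seq = [embed[t] for t in tokens if t in embed]  # skip unknown words
    return run(fwd, seq) + run(bwd, seq[::-1])


# Hypothetical toy setup: 4-dimensional embeddings, 3 hidden units per direction.
rng = random.Random(0)
EMBED_DIM, HIDDEN = 4, 3
vocab = ["win", "free", "prize", "click", "lunch", "today"]
embed = {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in vocab}
fwd = LSTMCell(EMBED_DIM, HIDDEN, rng)
bwd = LSTMCell(EMBED_DIM, HIDDEN, rng)

features = deep_features("win free prize click".split(), embed, fwd, bwd)
print(len(features))  # one vector of length 2 * HIDDEN per tweet
```

Each tweet maps to a vector of length `2 * HIDDEN` regardless of how many words it contains, which is what lets a conventional classifier consume the output. A practical version would more likely use a deep learning framework and learn the weights from labelled tweets.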
