Journal: International Journal of Knowledge Discovery in Bioinformatics

Spam Detection on Social Media Using Semantic Convolutional Neural Network


Abstract

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. It is challenging, especially for text limited to a small number of characters. Effective spam detection requires a larger number of efficient features to be learned. In the current article, the use of a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with an added semantic layer on top of it. The resulting model is known as a semantic convolutional neural network (SCNN). The semantic layer is formed by training random word vectors with the help of Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word in case it is missing from the Word2vec vocabulary. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms state-of-the-art results, with 98.65% accuracy on the SMS spam dataset and 94.40% accuracy on the Twitter dataset.
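The fallback lookup mentioned in the abstract (use a related word when the original is missing from the Word2vec vocabulary) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny `EMBEDDINGS` table stands in for a trained Word2vec model, `RELATED_WORDS` stands in for WordNet/ConceptNet queries, and the final random-vector fallback, the function names, and all values are assumptions made for illustration only.

```python
import random

# Hypothetical stand-in for a trained Word2vec vocabulary (toy 3-d vectors).
EMBEDDINGS = {
    "free": [0.9, 0.1, 0.0],
    "prize": [0.8, 0.2, 0.1],
    "meeting": [0.0, 0.7, 0.6],
}

# Hypothetical stand-in for WordNet/ConceptNet "similar word" lookups.
RELATED_WORDS = {
    "award": ["trophy", "prize"],
    "complimentary": ["free"],
}

DIM = 3  # dimensionality of the toy vectors above


def embed_word(word, embeddings=EMBEDDINGS, related=RELATED_WORDS, rng=None):
    """Return (vector, source) for a word, falling back to a related word."""
    # 1. Direct hit in the embedding vocabulary.
    if word in embeddings:
        return embeddings[word], "word2vec"
    # 2. Otherwise try related words until one is in the vocabulary.
    for candidate in related.get(word, []):
        if candidate in embeddings:
            return embeddings[candidate], "related:" + candidate
    # 3. Assumed last resort: a small random vector (seeded for repeatability).
    rng = rng or random.Random(0)
    return [rng.uniform(-0.25, 0.25) for _ in range(DIM)], "random"


def embed_message(text):
    """Map a short message to a list of word vectors (one row per token)."""
    return [embed_word(token)[0] for token in text.lower().split()]
```

Calling `embed_message("Free award meeting")` yields one vector per token; the middle token is out of vocabulary and borrows the vector of "prize". A matrix of this kind is the usual input to a text CNN, though the exact layer layout of the SCNN is not given in the abstract.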
