Journal of Information Science

SMS spam filtering and thread identification using bi-level text classification and clustering techniques

Abstract

SMS spam detection is an important task in which spam SMS messages are identified and filtered. As the number of SMS messages exchanged every day grows, it becomes very difficult for a user to remember newer messages and correlate them, in context, with previously received ones. SMS threads provide a solution to this problem. This work discusses the problem of SMS spam detection and thread identification and presents a state-of-the-art clustering-based algorithm. The work proceeds in two stages. In the first stage, a binary classification technique is applied to categorize SMS messages as spam or non-spam; in the second stage, SMS clusters are created for the non-spam messages using non-negative matrix factorization and K-means clustering techniques. A threading-based similarity feature, namely the time between consecutive communications, is described for the identification of SMS threads, and the impact of the time threshold on thread identification is analysed experimentally. Performance measures such as accuracy, precision, recall and F-measure are also evaluated. The SMS threads identified in this work can be used in applications such as SMS thread summarization, SMS folder classification and other SMS management-related tasks.
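The time-threshold idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the time feature is combined with the clusters, so this assumes the simplest reading: within one cluster of non-spam messages, a new thread starts whenever the gap between consecutive messages exceeds a threshold. The function name `split_into_threads`, the `(timestamp, text)` input format and the sample messages are all hypothetical and are not taken from the paper.

```python
from datetime import datetime, timedelta


def split_into_threads(messages, max_gap):
    """Split one cluster's messages into threads by time gap.

    messages: list of (timestamp, text) tuples (assumed input format).
    max_gap:  timedelta; a gap longer than this starts a new thread.
    """
    threads = []
    for timestamp, text in sorted(messages, key=lambda m: m[0]):
        # Start a new thread for the first message, or when the gap since
        # the previous message is larger than the threshold.
        if not threads or timestamp - threads[-1][-1][0] > max_gap:
            threads.append([])
        threads[-1].append((timestamp, text))
    return threads


# Hypothetical messages that a clustering stage placed in the same cluster.
cluster = [
    (datetime(2024, 1, 1, 9, 0), "Lunch today?"),
    (datetime(2024, 1, 1, 9, 5), "Sure, 12:30?"),
    (datetime(2024, 1, 1, 9, 7), "See you then"),
    (datetime(2024, 1, 1, 18, 0), "Lunch was great"),
]

threads = split_into_threads(cluster, timedelta(minutes=30))
print([len(t) for t in threads])  # [3, 1]
```

Raising the threshold to 12 hours merges all four messages into a single thread, which is the kind of sensitivity to the time threshold that the abstract says is analysed experimentally.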
