Classifying E-Mails Via Support Vector Machine

Abstract

To address the growing problem of junk E-mail on the Internet, this paper proposes an effective E-mail classification technique. Our work treats E-mail messages as semi-structured documents, each consisting of a set of fields with predefined semantics and a number of variable-length free-text contents. The main contributions of this paper are as follows. First, we present a Support Vector Machine (SVM) based model that incorporates Principal Component Analysis (PCA) to reduce both the size of the data and the dimensionality of the input feature space. As a result, the input data become classifiable with fewer features, and the training process converges faster. Second, we build the classification model using both the C-support vector machine and ν-support vector machine algorithms. We study various control parameters for performance tuning in an extensive set of experiments. The results of our performance evaluation indicate that the proposed technique is effective in E-mail classification.
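The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes scikit-learn is available, and the toy messages, the `subj_` field prefix, the TF-IDF weighting, the linear kernel, and the values of `n_components`, `C`, and `nu` are all assumptions chosen for the example; the abstract does not specify any of them. Each message is parsed into a structured field (Subject) and a free-text body, the features are reduced with PCA, and both a C-SVM and a ν-SVM are trained on the reduced data.

```python
from email import message_from_string

from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC, NuSVC

# Toy corpus invented for illustration: (raw message, label), 1 = junk.
RAW = [
    ("Subject: Win a free prize\n\nClick now to claim your free cash prize", 1),
    ("Subject: Cheap loans\n\nFree money offer, claim cheap loans now", 1),
    ("Subject: Free offer\n\nClaim your prize money, click this offer now", 1),
    ("Subject: Project meeting\n\nAgenda for the project meeting on Monday", 0),
    ("Subject: Quarterly report\n\nPlease review the attached project report", 0),
    ("Subject: Lunch Monday\n\nShall we review the agenda over lunch on Monday", 0),
]


def to_text(raw):
    """Flatten one message: a structured field (Subject) plus the free-text body."""
    msg = message_from_string(raw)
    subject = msg.get("Subject", "")
    body = msg.get_payload()
    # Prefix subject tokens so field terms stay distinct from body terms
    # (a hypothetical encoding of the semi-structured view).
    tagged = " ".join("subj_" + tok for tok in subject.lower().split())
    return tagged + " " + body.lower()


docs = [to_text(raw) for raw, _ in RAW]
labels = [label for _, label in RAW]

# TF-IDF features, converted to a dense array for PCA on this tiny corpus.
X = TfidfVectorizer().fit_transform(docs).toarray()

# Reduce the dimensionality of the input feature space.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X)

# Train both SVM formulations on the reduced features.
c_svm = SVC(kernel="linear", C=1.0).fit(X_reduced, labels)
nu_svm = NuSVC(kernel="linear", nu=0.5).fit(X_reduced, labels)

print("features:", X.shape[1], "->", X_reduced.shape[1])
print("C-SVM predictions: ", c_svm.predict(X_reduced))
print("nu-SVM predictions:", nu_svm.predict(X_reduced))
```

In this sketch `C` and `nu` stand in for the control parameters the abstract says were tuned. A real evaluation would fit PCA on a training split only and score on held-out messages.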
