Journal: Neural Computing & Applications

The hybrid ant colony optimization and ensemble method for solving the data stream e-mail foldering problem

Abstract

The e-mail foldering problem is a special classification problem: e-mail users create new folders over time while ceasing to use some of the folders created in the past, and messages arrive in the system at different time stamps. This article revises the ant colony optimization algorithm used for the e-mail foldering problem and proposes a novel variant adapted to data stream analysis. The goal is to classify messages that arrive at the system as data packages; however, because of the large number of decision classes (folders in the inbox), successive packages lead to a large concept drift. To keep the algorithm stable, the approach introduces a memory represented as a pheromone trail, a concept known from ant colony optimization methods. At the same time, multiple classifiers are included, similar to an ensemble method. The proposed approach was tested on real-world data from the Enron e-mail dataset. The two proposed data stream methods were analyzed and compared with methods used in the literature. According to a statistical analysis, the results confirm that, in terms of both accuracy and stability, the proposed solutions are better able to classify e-mail messages that arrive from the system as data packages.
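The abstract gives no algorithmic detail, so the following is only a toy illustration of the general idea of a pheromone-style memory carried across data packages. It is not the authors' method. The class `PheromoneMemory`, the function `run_stream`, the bag-of-tokens message representation, and the `evaporation` and `deposit` parameters are all assumptions made for this sketch. Evaporation lets folders that are no longer used fade out, while deposits reinforce the folders seen in recent packages; the ensemble component mentioned in the abstract is not modeled.

```python
from collections import defaultdict


class PheromoneMemory:
    """Toy (token, folder) pheromone table; a sketch, not the paper's algorithm."""

    def __init__(self, evaporation=0.1, deposit=1.0):
        self.evaporation = evaporation  # assumed rate, chosen for illustration
        self.deposit = deposit          # assumed reinforcement amount
        self.trail = defaultdict(float)  # (token, folder) -> pheromone level

    def predict(self, tokens):
        """Return the folder with the largest summed pheromone, or None."""
        scores = defaultdict(float)
        for (token, folder), level in self.trail.items():
            if token in tokens:
                scores[folder] += level
        return max(scores, key=scores.get) if scores else None

    def update(self, batch):
        """Evaporate old trails, then reinforce pairs seen in this package."""
        for key in list(self.trail):
            self.trail[key] *= 1.0 - self.evaporation
        for tokens, folder in batch:
            for token in tokens:
                self.trail[(token, folder)] += self.deposit


def run_stream(packages, memory):
    """Test-then-train over successive packages; returns per-package accuracy."""
    accuracies = []
    for batch in packages:
        hits = sum(memory.predict(set(tokens)) == folder
                   for tokens, folder in batch)
        accuracies.append(hits / max(len(batch), 1))
        memory.update(batch)
    return accuracies


if __name__ == "__main__":
    # Hypothetical messages: (set of tokens, true folder).
    packages = [
        [({"meeting", "budget"}, "finance"), ({"lunch"}, "personal")],
        [({"budget"}, "finance"), ({"lunch", "friday"}, "personal")],
    ]
    print(run_stream(packages, PheromoneMemory()))
```

Each package is first classified with the memory built from earlier packages and only then used for the update, so the reported accuracy reflects how well the accumulated trail carries over to new data.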
