Journal: Knowledge-Based Systems
Enhancing learning algorithms to support data with short sequence features by automated feature discovery


Abstract

In this paper, we propose a VECtor DIScovery approach, called VECDIS, which enhances the learning performance of existing classifiers when they learn directly from various data types, and which can discover features composed of multiple feature types for explanatory purposes. The data types may be combinations of multivariate, short time-series, or short sequential data. Each feature in the dataset may hold a single item and/or a list of ordered items, and the lists may differ in size. The approach handles raw vector data without prior manipulation (i.e., preprocessing). The discovered features are built from vector and non-vector mathematical relations. The algorithm generates new vector features and mathematical-expression features, which are passed to the next iteration or exchanged with previously generated features. The approach can search for and automatically discover thousands of different features, each obtained by sequence manipulations performed on the sequence features. We performed a large number of experiments with various synthetic and simulated datasets and with a wide range of classifiers. The results show that VECDIS significantly enhanced the classification performance of existing classifiers on datasets that have multiple feature types with short sequence features. Nevertheless, there is no guarantee that the mathematical library presented in this paper is suitable for all sequence datasets or that it will lead to the discovery of a valuable feature set. Therefore, VECDIS allows the mathematical library to be expanded or exchanged as desired.
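To make the idea of deriving features from short sequences concrete, the following is a minimal Python sketch, not the VECDIS algorithm itself. It assumes a small, hypothetical library of sequence-to-scalar operations (`SEQUENCE_OPS`) and a hypothetical helper `expand_record`; neither name comes from the paper, and the paper's actual mathematical library and iterative search are not reproduced here. The sketch only shows how a record that mixes scalar and variable-length sequence features could be flattened into scalar features that an ordinary classifier accepts.

```python
from statistics import mean

# Hypothetical library of sequence-to-scalar operations. This is an assumption
# for illustration; the paper's actual mathematical library is not listed in
# the abstract. Swapping entries mirrors the idea of an exchangeable library.
SEQUENCE_OPS = {
    "mean": mean,
    "max": max,
    "min": min,
    "length": len,
    "range": lambda seq: max(seq) - min(seq),
    "last_minus_first": lambda seq: seq[-1] - seq[0],
}


def expand_record(record):
    """Return a flat dict of scalar features for one record.

    Scalar values are copied unchanged. Each list- or tuple-valued (sequence)
    feature is replaced by one derived scalar per operation in SEQUENCE_OPS.
    Sequences are assumed to be non-empty.
    """
    flat = {}
    for name, value in record.items():
        if isinstance(value, (list, tuple)):
            for op_name, op in SEQUENCE_OPS.items():
                flat[f"{op_name}({name})"] = op(value)
        else:
            flat[name] = value
    return flat


# One record with a scalar feature and a short sequence feature.
record = {"age": 41, "readings": [3.0, 5.0, 4.0]}
features = expand_record(record)
```

Because every operation maps a sequence of any length to a single number, records whose sequences differ in size still produce feature vectors of equal width. A discovery procedure in the spirit of the abstract would additionally compose such operations and keep only the derived features that improve a classifier, which this sketch does not attempt.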