INSIGHT: Efficient and Effective Instance Selection for Time-Series Classification

Abstract

Time-series classification is a widely examined data mining task with various scientific and industrial applications. Recent research in this domain has shown that the simple nearest-neighbor classifier using Dynamic Time Warping (DTW) as the distance measure performs exceptionally well, in most cases outperforming more advanced classification algorithms. Instance selection is a commonly applied approach for improving the efficiency of nearest-neighbor classifiers with respect to classification time. This approach reduces the size of the training set by selecting the most representative instances and using only those instances when classifying new instances. In this paper, we introduce a novel instance selection method that exploits the hubness phenomenon in time-series data: a small number of instances tend to be nearest neighbors much more frequently than the remaining instances. Based on hubness, we propose a framework for score-based instance selection, which is combined with a principled approach to selecting instances that optimize the coverage of the training data. We discuss the theoretical considerations of casting the instance selection problem as a graph-coverage problem and analyze the resulting complexity. We experimentally compare the proposed method, denoted as INSIGHT, against FastAWARD, a state-of-the-art instance selection method for time series. Our results indicate substantial improvements in classification accuracy and a drastic reduction (orders of magnitude) in execution time.
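
To make the idea of hubness-based, score-driven instance selection concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the INSIGHT implementation: the abstract does not specify the scoring function, so the sketch assumes one plausible score (how often an instance appears among the k nearest neighbors of other training instances that share its label), and it omits the graph-coverage step entirely. The function names `dtw_distance`, `good_hubness_scores`, `select_instances`, and `classify_1nn` are hypothetical, and the DTW variant shown is the plain unconstrained one with absolute-difference cost.

```python
from typing import Hashable, List, Sequence

Series = Sequence[float]


def dtw_distance(a: Series, b: Series) -> float:
    """Unconstrained DTW with absolute-difference point cost (an assumed variant)."""
    inf = float("inf")
    m = len(b)
    # prev[j] holds the best cumulative cost for aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def good_hubness_scores(
    series: Sequence[Series], labels: Sequence[Hashable], k: int = 1
) -> List[int]:
    """Assumed score: count how often each instance is one of the k nearest
    neighbors of another training instance with the same label."""
    n = len(series)
    scores = [0] * n
    for i in range(n):
        neighbors = sorted(
            (dtw_distance(series[i], series[j]), j) for j in range(n) if j != i
        )
        for _, j in neighbors[:k]:
            if labels[j] == labels[i]:
                scores[j] += 1
    return scores


def select_instances(
    series: Sequence[Series], labels: Sequence[Hashable], n_select: int, k: int = 1
) -> List[int]:
    """Return the indices of the n_select highest-scoring instances."""
    scores = good_hubness_scores(series, labels, k)
    ranked = sorted(range(len(series)), key=lambda i: -scores[i])
    return ranked[:n_select]


def classify_1nn(
    query: Series,
    series: Sequence[Series],
    labels: Sequence[Hashable],
    selected: Sequence[int],
) -> Hashable:
    """1-NN classification under DTW, restricted to the selected instances."""
    best = min(selected, key=lambda i: dtw_distance(query, series[i]))
    return labels[best]
```

With this sketch, classification cost scales with the number of selected instances rather than with the full training set, which is the efficiency gain that instance selection targets. The pairwise scoring pass is quadratic in the training-set size but is paid only once, before any queries are classified.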