Journal: Data Mining and Knowledge Discovery

Interpretable time series classification using linear models and multi-resolution multi-domain symbolic representations

Abstract

The time series classification literature has expanded rapidly over the last decade, with many new classification approaches published each year. Prior research has mostly focused on improving the accuracy and efficiency of classifiers, with interpretability being somewhat neglected. This aspect of classifiers has become critical for many application domains, and the introduction of the EU GDPR legislation in 2018 is likely to further emphasize the importance of interpretable learning algorithms. Currently, state-of-the-art classification accuracy is achieved with very complex models based on large ensembles (COTE) or deep neural networks (FCN). These approaches are not efficient with regard to either time or space, are difficult to interpret, and cannot be applied to variable-length time series without first pre-processing the original series to a fixed length. In this paper we propose new time series classification algorithms to address these gaps. Our approach is based on symbolic representations of time series, efficient sequence mining algorithms and linear classification models. Our linear models are as accurate as deep learning models but are more efficient regarding running time and memory, can work with variable-length time series and can be interpreted by highlighting the discriminative symbolic features on the original time series. We advance the state-of-the-art in time series classification by proposing new algorithms built using the following three key ideas: (1) Multiple resolutions of symbolic representations: we combine symbolic representations obtained using different parameters, rather than one fixed representation (e.g., multiple SAX representations); (2) Multiple domain representations: we combine symbolic representations in time (e.g., SAX) and frequency (e.g., SFA) domains, to be more robust across problem types; (3) Efficient navigation in a huge symbolic-words space: we extend a symbolic sequence classifier (SEQL) to work with multiple sym
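Key idea (1) in the abstract, combining SAX representations obtained with different parameters, can be illustrated with a minimal sketch. This is not the authors' implementation: it converts a whole series to a single SAX word per parameter setting (z-normalisation, piecewise aggregate approximation, then discretisation with equiprobable Gaussian breakpoints). The function names `sax_word` and `multi_resolution_sax` and the parameter values are hypothetical, chosen only for illustration; the abstract does not specify them.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet_size):
    """Return one SAX word for `series` (illustrative helper, not from the paper)."""
    if not 1 <= n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")

    # Step 1: z-normalise; a constant series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: piecewise aggregate approximation (mean of each segment).
    n = len(z)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # Step 3: breakpoints that split the standard normal into
    # `alphabet_size` equiprobable regions, then map each mean to a letter.
    normal = NormalDist()
    cuts = [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    return "".join(chr(ord("a") + bisect_right(cuts, v)) for v in paa)


def multi_resolution_sax(series, configs):
    """Map each (n_segments, alphabet_size) pair to its SAX word."""
    return {cfg: sax_word(series, *cfg) for cfg in configs}


if __name__ == "__main__":
    ramp = [0, 1, 2, 3, 4, 5, 6, 7]
    # Two assumed resolutions: a coarse and a finer word over the same series.
    print(multi_resolution_sax(ramp, [(4, 2), (2, 4)]))
```

Because the number of segments is a parameter rather than a function of the input length, the same call works on series of different lengths, which is in the spirit of the variable-length claim in the abstract. A frequency-domain counterpart such as SFA would need a Fourier transform step and is not sketched here.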