Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data

Abstract

Given the pervasiveness of time series data in all human endeavors, and the ubiquity of clustering as a data mining application, it is somewhat surprising that the problem of time series clustering from a single stream remains largely unsolved. Most work on time series clustering considers the clustering of individual time series, e.g., gene expression profiles, individual heartbeats or individual gait cycles. The few attempts at clustering time series streams have been shown to be objectively incorrect in some cases, and in other cases shown to work only on the most contrived datasets by carefully adjusting a large set of parameters. In this work, we make two fundamental contributions. First, we show that the problem definition for time series clustering from streams currently used is inherently flawed, and a new definition is necessary. Second, we show that the Minimum Description Length (MDL) framework offers an efficient, effective and essentially parameter-free method for time series clustering. We show that our method produces objectively correct results on a wide variety of datasets from medicine, zoology and industrial process analyses.

Bibliographic details

Source
11th IEEE International Conference on Data Mining | 2011 | pp. 547-556 | 10 pages
Conference location: Vancouver (CA)
Authors
Rakthanmanon Thanawin; Keogh Eamonn J.; Lonardi Stefano; Evans Scott
Original format: PDF
Text language: eng
Chinese Library Classification: TP274.2
Keywords
MDL; clustering; time series

Similar literature

1. TimeSeriesStreaming.vi: LabVIEW program for reliable data streaming of large analog time series [J]. Czerwinski F., Oddershede L.B. Computer Physics Communications. 2011, No. 2
2. TS-stream: clustering time series on data streams [J]. Cassio M. M. Pereira, Rodrigo F. de Mello. Journal of Intelligent Information Systems. 2014, No. 3
3. Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data [C]. Rakthanmanon Thanawin, Keogh Eamonn J., Lonardi Stefano. IEEE International Conference on Data Mining. 2011
4. Statistical Modeling of Carbon Dioxide and Cluster Analysis of Time Dependent Information: Lag Target Time Series Clustering, Multi-Factor Time Series Clustering, and Multi-Level Time Series Clustering [D]. Kim, Doo Young. 2016
5. Whole Time Series Data Streams Clustering: Dynamic Profiling of the Electricity Consumption [O]. Krzysztof Gajowniczek, Marcin Bator, Tomasz Ząbkowski. 2020
6. Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data [O]. Thanawin Rakthanmanon, Eamonn J. Keogh, Stefano Lonardi. 2012
