Journal: Journal of Intelligent Information Systems

TS-stream: clustering time series on data streams

Abstract

The current ability to produce massive amounts of data, and the impossibility of storing all of it, motivated the development of data stream mining strategies. Although many techniques have been proposed, this research area still lacks approaches for mining data streams composed of multiple time series, a problem with applications in finance, medicine and science. Most current techniques for clustering streaming time series have a serious limitation in their similarity measure, which is based on the Pearson correlation. In this paper, we show that the Pearson correlation is not capable of detecting similarities even for classic time series models, such as those by Box and Jenkins. This limitation motivated our proposal to cluster streaming time series based on their generating functions, which is achieved by considering features obtained using descriptive measures, such as Auto Mutual Information, the Hurst Exponent and several others. We present a new tree-based clustering algorithm, entitled TS-Stream, which uses the extracted features to produce partitions in better accordance with the time series generating functions. Experiments with synthetic data sets confirm that TS-Stream outperforms ODAC, currently the most popular technique, in terms of clustering quality. Using real financial time series from the NYSE and NASDAQ, we conducted stock trading simulations employing TS-Stream to support the creation of diversified investment portfolios. Results confirmed that TS-Stream increased the monetary returns by several orders of magnitude when compared to trading strategies based simply on the Moving Average Convergence Divergence financial indicator.
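The abstract's point about the Pearson correlation can be illustrated with a small sketch. This is not the TS-Stream algorithm. It assumes an AR(1) process as a simple Box and Jenkins style model, and it uses the lag-1 autocorrelation as a stand-in for the descriptive measures the paper names (Auto Mutual Information, the Hurst Exponent). The function names `ar1` and `lag1_autocorr` and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np


def ar1(phi, n, rng):
    """Simulate x[t] = phi * x[t-1] + e[t] with standard normal noise."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation (illustrative stand-in feature)."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


rng = np.random.default_rng(0)

# a and b share the same generating function but have independent noise;
# c comes from a different generating function.
a = ar1(0.8, 5000, rng)
b = ar1(0.8, 5000, rng)
c = ar1(-0.5, 5000, rng)

# Pearson correlation compares the series point by point.
pearson_ab = float(np.corrcoef(a, b)[0, 1])

# A feature computed per series describes the process that generated it.
feat_a = lag1_autocorr(a)
feat_b = lag1_autocorr(b)
feat_c = lag1_autocorr(c)

print(f"Pearson(a, b)        = {pearson_ab:.3f}")
print(f"lag-1 autocorr a / b = {feat_a:.3f} / {feat_b:.3f}")
print(f"lag-1 autocorr c     = {feat_c:.3f}")
```

Under these assumptions, the Pearson correlation between `a` and `b` should be close to zero even though both come from the same model, while their per-series features nearly coincide and differ clearly from that of `c`. Clustering on such features groups series by generating function rather than by point-wise co-movement.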