Journal: Quality Control, Transactions

Fast Piecewise Polynomial Fitting of Time-Series Data for Streaming Computing


Abstract

Streaming computing attracts intense attention because of the demand for real-time analysis of massive data. Because the input is unbounded and continuous, the volume of streaming data is too high for all of it to be stored permanently. Piecewise polynomial fitting is a popular data compression method that approximates the raw data stream with multiple polynomials. The polynomial coefficients of the best-fitting curve can be calculated by the method of least squares, which minimizes the sum of the squared residuals between observed and fitted values. However, because it is built on several matrix calculations, the method of least squares has high time complexity and is difficult to apply to streaming computing. This paper puts forward a fast piecewise polynomial fitting method for time-series data in streaming computing. The input data stream is dynamically segmented according to a given residual bound. The data points in each segment are fitted using an improved polynomial fitting method, which has less time overhead than general polynomial fitting because it reuses intermediate calculation results. Experimental results on four time-series datasets show that our algorithm achieves a speedup over general piecewise polynomial fitting of up to 2.82x for periodically sampled time-series data and up to 1.85x for aperiodically sampled time-series data, without affecting the compression ratio or fitting accuracy. Moreover, an event-time latency comparison in a streaming environment indicates that the improved method can sustain higher throughput than general piecewise polynomial fitting at the same latency.
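
The abstract does not spell out the improved fitting method, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It segments a stream by a residual bound (assumed here to be the maximum absolute residual) and reuses intermediate results by keeping running power sums for the least-squares normal equations, so adding a point updates the sums instead of rebuilding the matrices. The names `compress`, `fit`, `poly`, and `solve` are hypothetical, and timestamps are assumed to be strictly increasing.

```python
def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(st, sy, n, degree):
    """Least-squares coefficients from running sums; the degree is capped at n - 1."""
    d = min(degree, n - 1)
    a = [[st[i + j] for j in range(d + 1)] for i in range(d + 1)]
    return solve(a, sy[:d + 1])


def poly(coeffs, u):
    """Evaluate a polynomial with coefficients in ascending order at local time u."""
    return sum(c * u ** k for k, c in enumerate(coeffs))


def compress(stream, degree=1, bound=0.5):
    """Return (t_start, t_end, coeffs) segments; coeffs are in local time t - t_start."""
    segments, pts, coeffs = [], [], []
    st = [0.0] * (2 * degree + 1)   # running sums of u**k
    sy = [0.0] * (degree + 1)       # running sums of y * u**k
    for t, y in stream:
        t0 = pts[0][0] if pts else t
        u = t - t0
        # Reuse the previous sums: one update per power, no rebuild.
        new_st = [s + u ** k for k, s in enumerate(st)]
        new_sy = [s + y * u ** k for k, s in enumerate(sy)]
        cand = fit(new_st, new_sy, len(pts) + 1, degree)
        # Assumption: the bound is checked by rescanning the buffered points.
        worst = max(abs(poly(cand, pt - t0) - py) for pt, py in pts + [(t, y)])
        if pts and worst > bound:
            # Close the current segment and start a new one at this point.
            segments.append((t0, pts[-1][0], coeffs))
            pts = [(t, y)]
            st = [1.0] + [0.0] * (2 * degree)
            sy = [y] + [0.0] * degree
            coeffs = [y]
        else:
            pts.append((t, y))
            st, sy, coeffs = new_st, new_sy, cand
    if pts:
        segments.append((pts[0][0], pts[-1][0], coeffs))
    return segments


# Illustrative data: two straight-line pieces with a break between t = 4 and t = 5.
stream = [(t, 2.0 * t) for t in range(5)] + [(t, 50.0 - 5.0 * t) for t in range(5, 10)]
segments = compress(stream, degree=1, bound=0.1)
```

In this sketch only the sum updates are incremental; the residual check still rescans the buffered points of the current segment, so it illustrates the reuse idea rather than reproducing the speedups reported in the abstract.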
