International Conference on Web Information Systems Engineering

Time Series Data Cleaning Based on Dynamic Speed Constraints

Abstract

Errors are ubiquitous in time series data because sensors are often unstable. Existing constraint-based approaches can repair abnormal values effectively. The constraint typically refers to the range of speeds at which data values may change. If the speed of change falls outside this range, the data point is identified as violating the constraint and needs repair. For example, if the oil consumption per hour of a sedan is negative or greater than 15 gallons, the reading is probably abnormal. However, existing methods are limited to data whose speed of change is stable. They perform poorly on data streams with sharp fluctuations, because constraints based on an a priori, fixed speed range might miss most abnormal data. To fill this gap, an online cleaning approach based on dynamic speed constraints is proposed for time series data whose speed of change fluctuates. The proposed dynamic constraints are not determined in advance but adapt as the data changes over time. A dual window mechanism is devised to transform the global optimization problem of data repair into a local optimization problem. The classic minimum change principle and median principle are introduced for data repair. Because the minimum change principle fails to repair consecutive data points that violate the constraints, we propose using the boundary of the corresponding candidate repair set as the repair strategy. Extensive experiments on real datasets demonstrate that the proposed approach achieves higher repair accuracy than traditional approaches.
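The difference between a fixed and a dynamic speed constraint can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how the dynamic bounds are computed, so the rule used here (median of recent speeds plus or minus a multiple of their median absolute deviation) is an assumption made for illustration, as are all function and parameter names (`clamp_repair`, `fixed_bounds`, `dynamic_bounds`, `window`, `k`, `min_spread`, `fallback`). The repair step simply clamps each point to the nearest boundary of its allowed range, a greedy stand-in for the minimum change principle. The dual window mechanism and the median principle are not modeled.

```python
from statistics import median


def clamp_repair(times, values, bounds_fn):
    """Greedy online repair: clamp each point into the range that the
    speed bounds allow relative to the previous repaired point."""
    repaired = [values[0]]
    history = []  # speeds observed in the repaired series so far
    for i in range(1, len(values)):
        s_min, s_max = bounds_fn(history)
        dt = times[i] - times[i - 1]
        lo = repaired[-1] + s_min * dt  # lower boundary of candidate set
        hi = repaired[-1] + s_max * dt  # upper boundary of candidate set
        fixed = min(max(values[i], lo), hi)  # nearest value inside [lo, hi]
        history.append((fixed - repaired[-1]) / dt)
        repaired.append(fixed)
    return repaired


def fixed_bounds(s_min, s_max):
    """Speed range chosen in advance; ignores the data seen so far."""
    return lambda history: (s_min, s_max)


def dynamic_bounds(window, k, min_spread, fallback):
    """Assumed rule: bounds follow the last `window` repaired speeds as
    median +/- max(k * MAD, min_spread); `fallback` is used until enough
    speeds have been seen."""
    def bounds(history):
        recent = history[-window:]
        if len(recent) < window:
            return fallback
        m = median(recent)
        mad = median([abs(s - m) for s in recent])
        spread = max(k * mad, min_spread)
        return (m - spread, m + spread)
    return bounds


# Hypothetical readings: a steady rise of 1 per step with one spike (30).
times = [0, 1, 2, 3, 4, 5]
values = [0, 1, 2, 3, 30, 5]

wide_fixed = clamp_repair(times, values, fixed_bounds(-50, 50))
adaptive = clamp_repair(
    times, values,
    dynamic_bounds(window=3, k=3.0, min_spread=1.0, fallback=(-5, 5)),
)
```

On these made-up readings, the wide fixed range leaves the spike at 30 untouched, while the windowed bounds tighten to the recent speed of about 1 per step and pull the spike down to 5. The point where the bounds are recomputed is the only place the two variants differ, which is why the bound rule is passed in as a function.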
