Published in: IEEE International Conference on Data Mining Workshops

Detecting Outliers in Streaming Time Series Data from ARM Distributed Sensors

Abstract

The Atmospheric Radiation Measurement (ARM) Data Center at ORNL collects data from a number of permanent and mobile facilities around the globe. The data are then ingested to create high-level scientific products. High-frequency streaming measurements from sensors and radar instruments at ARM sites require a high degree of accuracy to enable rigorous study of atmospheric processes. Outliers in the collected data are common due to instrument failure or extreme weather events, so it is critical to identify and flag them. We employed multiple univariate, multivariate, and time series techniques for outlier detection and studied their effectiveness. First, we examined the Pearson correlation coefficient, which measures the pairwise correlations between variables. Singular Spectrum Analysis (SSA) was applied to detect outliers by removing the anticipated annual and seasonal cycles from the signal to accentuate anomalies. K-means was applied for multivariate examination of data from a collection of sensors, to identify deviations from expected and known patterns and to flag abnormal observations. The Pearson correlation coefficient, SSA, and K-means methods were then combined in a framework that detects outliers through a range of checks. We applied the developed method to data from meteorological sensors at the ARM Southern Great Plains site and validated it against an existing database of known data quality issues.
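
The abstract names three checks but gives no implementation details. As a rough illustration only, the Python sketch below applies a rolling Pearson correlation, a basic SSA residual, and a plain k-means distance score to synthetic data. The function names, window lengths, component count, cluster count, and the synthetic two-sensor series are all assumptions made for this example; they are not the authors' framework and not ARM data.

```python
import numpy as np


def rolling_pearson(x, y, window):
    """Pearson r between x and y over each trailing window of samples."""
    r = np.full(len(x), np.nan)
    for end in range(window, len(x) + 1):
        r[end - 1] = np.corrcoef(x[end - window:end], y[end - window:end])[0, 1]
    return r


def ssa_residual(x, window, n_components):
    """Remove the leading SSA components (trend, cycles) and return the rest."""
    n = len(x)
    k = n - window + 1
    # Trajectory matrix: column j holds the lagged segment x[j:j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    low_rank = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging turns the low-rank matrix back into a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += low_rank[:, j]
        counts[j:j + window] += 1
    return x - recon / counts


def centroid_distances(data, centers):
    """Euclidean distance from every row of data to every centroid."""
    return np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)


def kmeans_fit(data, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means with random initial centroids; returns centroids."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        labels = centroid_distances(data, centers).argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = data[labels == c].mean(axis=0)
    return centers


# Synthetic stand-in for two co-located sensors (assumed, not ARM data).
rng = np.random.default_rng(42)
t = np.arange(400)
cycle = 10.0 * np.sin(2 * np.pi * t / 50)
sensor_a = cycle + rng.normal(0, 0.5, t.size)
sensor_b = cycle + rng.normal(0, 0.5, t.size)
sensor_a[200] += 30.0  # injected spike, standing in for an instrument glitch

# Check 1: correlation between the sensors drops in windows with the spike.
r = rolling_pearson(sensor_a, sensor_b, window=25)
# Check 2: removing the dominant cycle leaves the spike in the residual.
resid = ssa_residual(sensor_a, window=50, n_components=2)
# Check 3: k-means on a clean reference period, then score later samples.
reference = np.column_stack([sensor_a[:150], sensor_b[:150]])
centers = kmeans_fit(reference, n_clusters=3)
new_obs = np.column_stack([sensor_a[150:], sensor_b[150:]])
dist = centroid_distances(new_obs, centers).min(axis=1)

print("lowest rolling correlation ends at sample", int(np.nanargmin(r)))
print("largest SSA residual at sample", int(np.argmax(np.abs(resid))))
print("farthest from any centroid: sample", 150 + int(np.argmax(dist)))
```

In this sketch each check yields a score per sample, and a sample would be flagged when any score crosses a threshold. How the actual framework sets thresholds or combines the checks is not stated in the abstract.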
