A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery

Abstract

Point time series are a key data type for describing real or modelled environmental phenomena. Delivering this data in useful ways can be challenging when the data volume is large, when computational work (such as aggregation, subsetting, or re-sampling) needs to be performed, or when complex metadata is needed to place data in context. Some aspects of these problems are especially relevant to the environmental domain: large sensor networks that sample continuous environmental phenomena frequently over long periods generate very large datasets, and rich metadata is often required to understand the context of observations. Nevertheless, time series data, and most of these challenges, are prevalent beyond the environmental domain, for example in the financial and industrial domains. A review of recent technologies illustrates an emerging trend toward high-performance, lightweight databases specialized for time series data. These databases tend to have minimal or no formal metadata capabilities. In contrast, the environmental domain boasts standards such as the Sensor Observation Service (SOS) that have mature and comprehensive metadata models, but existing implementations have suffered from slow performance. In this paper we describe our hybrid approach to efficient delivery of large time series datasets with complex metadata. We use three subsystems within a single system-of-systems: a proxy (Python), an efficient time series database (InfluxDB), and a SOS implementation (52 North SOS). Together these present a regular SOS interface. The proxy processes standard SOS queries and issues them to either 52 North SOS or InfluxDB for processing. Responses are returned directly from 52 North SOS, or indirectly from InfluxDB via the Python proxy, where they are processed into WaterML. This marries the scalability and performance advantages of the time series database with the sophisticated metadata handling of SOS. Testing indicates that a recent version of 52 North SOS configured with a Postgres/PostGIS database performs well, but an implementation incorporating InfluxDB and 52 North SOS in a hybrid architecture performs approximately 12 times faster.
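
The proxy's routing role described in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which requests go to which backend, so the rule used here (GetObservation to the time series database, everything else to the SOS server) is an assumption. The function names `choose_backend` and `points_to_xml` are hypothetical, and the XML produced is a simplified stand-in rather than schema-valid WaterML. No network calls are made.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET

# Assumption: only data-heavy GetObservation requests are sent to the
# time series database; metadata requests stay with the SOS server.
TIME_SERIES_REQUESTS = {"GetObservation"}


def choose_backend(query_string: str) -> str:
    """Return 'influxdb' or 'sos' for a KVP-encoded SOS query string."""
    params = parse_qs(query_string)
    request_name = params.get("request", [""])[0]
    return "influxdb" if request_name in TIME_SERIES_REQUESTS else "sos"


def points_to_xml(points) -> str:
    """Render (timestamp, value) pairs as simplified XML.

    Hypothetical placeholder for the WaterML encoding step; the element
    names below are illustrative only.
    """
    root = ET.Element("timeseries")
    for timestamp, value in points:
        point = ET.SubElement(root, "point")
        ET.SubElement(point, "time").text = timestamp
        ET.SubElement(point, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(choose_backend("service=SOS&version=2.0.0&request=GetObservation"))
    print(choose_backend("service=SOS&request=GetCapabilities"))
    print(points_to_xml([("2015-01-01T00:00:00Z", 1.5)]))
```

Keeping the routing decision in one small function means clients see a single SOS endpoint, while the choice of backend stays an internal detail of the proxy.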


Bibliographic details

Source
IFIP WG 5.11 International Symposium on Environmental Software Systems (ISESS) | 2015 | 9 pages
Authors
Benjamin Leighton; Simon J.D. Cox; Nicholas J. Car; Matthew P. Stenson; Jamie Vleeshouwer; Jonathan Hodge
Original format: PDF
Chinese Library Classification: Information processing
Keywords
time; series; timeseries; SOS; OGC; sensor; database;


