Conference: IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids

Generative Adversarial Network for Synthetic Time Series Data Generation in Smart Grids


Abstract

The availability of fine-grained time series data is a prerequisite for research in smart grids. While data for transmission systems is relatively easy to obtain, issues related to data collection, security, and privacy hinder the public availability of such datasets at the distribution system level. This has prevented the larger research community from effectively applying sophisticated machine learning algorithms to improve the accuracy of distribution-level predictions and increase the efficiency of grid operations. Synthetic dataset generation has proven to be a promising solution to data availability issues in domains such as computer vision, natural language processing, and medicine. However, it remains underexplored in the smart grid context. Previous works have tried to generate synthetic datasets by modeling the underlying system dynamics, an approach that is difficult, time-consuming, error-prone, and often infeasible. In this work, we propose a novel data-driven approach to synthetic dataset generation that uses deep generative adversarial networks (GANs) to learn the conditional probability distribution of essential features in the real dataset and to generate samples from the learned distribution. To evaluate our synthetically generated dataset, we measure the maximum mean discrepancy (MMD) between the real and synthetic datasets, treated as probability distributions, and show that their sampling distance converges. To further validate our synthetic dataset, we perform common smart grid tasks such as k-means clustering and short-term prediction on both datasets. Experimental results show the efficacy of our synthetic dataset approach: the real and synthetic datasets are indistinguishable when examining only the output of these tasks.
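To make the MMD evaluation mentioned in the abstract concrete, the following is a minimal sketch of a squared-MMD estimate between two sample sets. The abstract does not state which kernel or estimator the authors used, so the Gaussian RBF kernel, the biased (V-statistic) estimator, the bandwidth value, the 24-point daily profile shape, and the function names `rbf_kernel` and `mmd_squared` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b.

    The kernel choice is an assumption; the abstract does not specify one.
    """
    # Squared Euclidean distance between every row of a and every row of b.
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Guard against tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(real, synthetic, bandwidth):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets.

    Each input is an array of shape (n_samples, n_features), for example one
    row per daily load profile (hypothetical layout).
    """
    k_rr = rbf_kernel(real, real, bandwidth)
    k_ss = rbf_kernel(synthetic, synthetic, bandwidth)
    k_rs = rbf_kernel(real, synthetic, bandwidth)
    return k_rr.mean() + k_ss.mean() - 2.0 * k_rs.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data: 200 "profiles" of 24 points each (illustrative only).
    real = rng.normal(size=(200, 24))
    close = rng.normal(size=(200, 24))             # same distribution
    shifted = rng.normal(loc=3.0, size=(200, 24))  # different distribution
    print("same distribution:   ", mmd_squared(real, close, bandwidth=5.0))
    print("shifted distribution:", mmd_squared(real, shifted, bandwidth=5.0))
```

A value near zero suggests the two sample sets are hard to tell apart under the chosen kernel, while a larger value indicates a mismatch. In practice the bandwidth is often set from the data (for example with a median-distance heuristic), which is a design choice not covered by the abstract.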
