Conference: IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids

Generative Adversarial Network for Synthetic Time Series Data Generation in Smart Grids


Abstract

The availability of fine-grained time series data is a prerequisite for research in smart grids. While data for transmission systems is relatively easy to obtain, issues related to data collection, security, and privacy hinder the public availability of such datasets at the distribution system level. This has prevented the larger research community from effectively applying sophisticated machine learning algorithms to improve the accuracy of distribution-level predictions and increase the efficiency of grid operations. Synthetic dataset generation has proven to be a promising solution to data availability issues in domains such as computer vision, natural language processing, and medicine. However, it remains insufficiently explored in the smart grid context. Previous works have tried to generate synthetic datasets by modeling the underlying system dynamics, an approach that is difficult, time-consuming, error-prone, and often infeasible. In this work, we propose a novel data-driven approach to synthetic dataset generation that uses deep generative adversarial networks (GANs) to learn the conditional probability distribution of essential features in the real dataset and to generate samples from the learned distribution. To evaluate our synthetically generated dataset, we measure the maximum mean discrepancy (MMD) between the real and synthetic datasets, treated as probability distributions, and show that their sampling distance converges. To further validate our synthetic dataset, we perform common smart grid tasks such as k-means clustering and short-term prediction on both datasets. Experimental results show the efficacy of our approach: the real and synthetic datasets are indistinguishable when examining only the output of these tasks.
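The MMD evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which kernel or estimator the authors used, so the Gaussian (RBF) kernel, the biased estimator, the fixed bandwidth, and the function names `rbf_kernel` and `mmd_squared` below are all assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at zero to absorb
    # small negative values caused by floating-point round-off.
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * (a @ b.T)
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=5.0):
    """Biased estimate of the squared MMD between sample sets x and y.

    x and y are arrays of shape (n_samples, n_features); each row could be,
    for example, one daily load profile (an assumption, not from the paper).
    """
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())


if __name__ == "__main__":
    # Toy stand-ins for "real" and "synthetic" data (random, not grid data).
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(200, 24))
    synthetic = rng.normal(0.0, 1.0, size=(200, 24))
    print(mmd_squared(real, synthetic))
```

A value near zero suggests the two sample sets are hard to tell apart under the chosen kernel. In practice the bandwidth matters a great deal and is often set from the data (for example, from the median pairwise distance) rather than fixed as it is here.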
