Journal: Neural computing & applications

Design issues in Time Series dataset balancing algorithms


Abstract

The Internet of Things and e-Health now produce huge collections of Time Series (TS) that are analyzed to classify the current status or to detect certain events, among other tasks. In two-class problems, when the positive events to detect are infrequent, the gathered data lack balance. Even in unsupervised learning, this imbalance reduces the generalization capability of the models. To address this problem, Time Series balancing algorithms have been proposed, although they have barely been studied. Existing approaches either draw some series from a single bag of Time Series in order to generate a new synthetic one, or create ghost points in the distance space. These solutions are suitable when there is only one data source and the datasets are univariate. However, in the context of the Internet of Things, where multiple data sources are available, these approaches may not perform coherently. Moreover, to the best of our knowledge, the literature contains no TS balancing algorithm that handles multiple data sources and multivariate data. In this research, we study two main concerns that should be considered when designing Time Series balancing algorithms: on the one hand, the algorithms should deal with multiple multivariate data sources; on the other hand, they should be shape preserving. As part of this work, a new algorithm is proposed for balancing multivariate Time Series datasets. The algorithm is evaluated on two real-world multivariate Time Series datasets from the e-Health domain: one on epileptic seizure identification and the other on fall detection. A thorough analysis of the performance is discussed, showing the advantages of considering the Time Series issues within the balancing algorithm.
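To make the "single bag" idea mentioned in the abstract more concrete, the following is a minimal sketch of one generic way to oversample the minority class: pick two minority series from the bag and blend them point by point. It is not the algorithm proposed in the paper, whose details are not given here. The function names `interpolate_series` and `balance_by_interpolation`, the nested-list data layout (series, then time step, then channel), the equal-length requirement, and the two-class setting are all assumptions made for illustration. A single blending weight is shared by every time step and channel so that the channels of a synthetic series stay mutually consistent.

```python
import random


def interpolate_series(a, b, weight):
    """Blend two equal-length multivariate series with one shared weight.

    Each series is a list of time steps; each time step is a list of
    channel values. Using one weight for all channels keeps them aligned.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return [
        [(1.0 - weight) * x + weight * y for x, y in zip(step_a, step_b)]
        for step_a, step_b in zip(a, b)
    ]


def balance_by_interpolation(series, labels, minority_label, seed=0):
    """Add synthetic minority series until both classes have equal size.

    Hypothetical helper for a two-class problem: every label other than
    `minority_label` is counted as the majority class.
    """
    rng = random.Random(seed)
    minority = [s for s, y in zip(series, labels) if y == minority_label]
    majority_count = len(labels) - len(minority)
    if len(minority) < 2:
        raise ValueError("need at least two minority series to interpolate")

    new_series, new_labels = list(series), list(labels)
    for _ in range(max(0, majority_count - len(minority))):
        a, b = rng.sample(minority, 2)  # two distinct series from the bag
        new_series.append(interpolate_series(a, b, rng.random()))
        new_labels.append(minority_label)
    return new_series, new_labels


# Toy data: four majority series and two minority series,
# each with two time steps and two channels.
toy_series = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)] + [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]
toy_labels = [0, 0, 0, 0, 1, 1]

balanced_series, balanced_labels = balance_by_interpolation(
    toy_series, toy_labels, minority_label=1, seed=1
)
```

A point-wise blend like this assumes a single source and aligned series; it does nothing to preserve shape when the minority series are shifted in time relative to each other, which is one of the design concerns the abstract raises.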
