
A parameter-level parallel optimization algorithm for large-scale spatio-temporal data mining



Abstract

The goal of spatio-temporal data mining is to discover previously unknown but useful patterns from spatial and temporal data. However, the explosive growth of spatio-temporal data emphasizes the need for novel, computationally efficient methods for large-scale data mining applications. Since many spatio-temporal data mining problems can be cast as optimization problems, in this paper we propose an efficient parameter-level parallel optimization algorithm for large-scale spatio-temporal data mining. In detail, most previous optimization methods are based on gradient descent, which iteratively updates the model and provides model-level convergence control for all parameters. That is, they treat all parameters equally and keep updating all of them until every parameter has converged. However, we find that during the iterative process, the convergence rates of the model parameters differ from one another. This can cause redundant computation and reduce performance. To solve this problem, we propose parameter-level stochastic gradient descent (plpSGD), in which the convergence of each parameter is considered independently and only unconverged parameters are updated in each iteration. Moreover, the updating of model parameters is parallelized in plpSGD to further improve the performance of SGD. We have conducted extensive experiments to evaluate the performance of plpSGD. The experimental results show that, compared to previous SGD methods, plpSGD can significantly accelerate convergence and achieve excellent scalability with little sacrifice of solution accuracy.
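The abstract does not give plpSGD's exact update rule, so the following is only a minimal NumPy sketch of the core idea it describes: keep a per-parameter convergence flag and, on each stochastic step, update only the parameters that have not yet converged. The function name, the `tol` and `patience` criteria, and the least-squares objective are all illustrative assumptions, not the authors' implementation (and the sketch is sequential, whereas plpSGD additionally parallelizes the updates).

```python
import numpy as np

def plp_sgd_sketch(X, y, lr=0.05, tol=1e-4, patience=5, max_iter=5000, seed=0):
    """Sketch of parameter-level SGD for least squares (assumed objective).

    Each coordinate is frozen once its update stays below `tol` for
    `patience` consecutive iterations; subsequent iterations only touch
    the still-active (unconverged) parameters.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    active = np.ones(d, dtype=bool)   # per-parameter convergence flags
    small = np.zeros(d, dtype=int)    # consecutive small-update counts
    for _ in range(max_iter):
        if not active.any():
            break                     # every parameter has converged
        i = rng.integers(n)           # one random sample (stochastic step)
        grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x.w - y)^2
        step = lr * grad
        w[active] -= step[active]     # update only unconverged parameters
        # track coordinates whose updates have become negligible
        small = np.where(np.abs(step) < tol, small + 1, 0)
        active &= small < patience    # freeze after `patience` small steps
    return w
```

The `patience` counter is a common practical guard so that a single accidentally tiny stochastic gradient does not freeze a parameter prematurely; how plpSGD actually detects per-parameter convergence is not specified in this abstract.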

Bibliographic Information

  • Source
    Distributed and Parallel Databases | 2020, Issue 3 | pp. 739-765 | 27 pages
  • Author Affiliations

    Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China;

    Univ Warwick, Dept Comp Sci, Coventry, W Midlands, England;

  • Format: PDF
  • Language: English
  • Keywords

    Spatio-temporal data mining; Stochastic gradient descent; Block; Convergence rate; Redundant update;

