Faster learning by reduction of data access time

Abstract

Nowadays, the major challenge in machine learning is the 'Big Data' challenge. In big data problems, a large number of data points, a large number of features per data point, or both make the training of models very slow. Training time has two major components: the time to access the data and the time to process (learn from) the data. So far, research has focused only on the second component, i.e., learning from the data. In this paper, we propose one possible solution for handling big data problems in machine learning. The idea is to reduce training time by reducing data access time, using systematic sampling and cyclic/sequential sampling to select mini-batches from the dataset. To demonstrate the effectiveness of the proposed sampling techniques, we use empirical risk minimization, a commonly used machine learning problem, in the strongly convex and smooth case. The problem is solved using SAG, SAGA, SVRG, SAAG-II and MBSGD (mini-batch SGD), each with two step-size determination techniques, namely constant step size and backtracking line search. Theoretical results prove that systematic and cyclic sampling achieve convergence similar, in expectation, to that of the widely used random sampling technique. Experimental results on benchmark datasets demonstrate the efficacy of the proposed sampling techniques and show up to six times faster training.
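The three mini-batch selection schemes named in the abstract can be illustrated with a minimal sketch. The abstract does not define the schemes, so the code below assumes their textbook forms, which may differ in detail from the paper's: random sampling draws distinct indices uniformly, cyclic/sequential sampling walks through contiguous blocks in order, and systematic sampling picks a random start and then takes every k-th index. The function names and parameters (`random_batch`, `cyclic_batch`, `systematic_batch`, `n`, `batch_size`, `step`) are hypothetical and chosen only for illustration.

```python
import random


def random_batch(n, batch_size, rng):
    """Random sampling: batch_size distinct indices drawn uniformly from range(n)."""
    return rng.sample(range(n), batch_size)


def cyclic_batch(n, batch_size, step):
    """Cyclic/sequential sampling (assumed form): the step-th contiguous block
    of indices, wrapping around to 0 at the end of the dataset."""
    start = (step * batch_size) % n
    return [(start + j) % n for j in range(batch_size)]


def systematic_batch(n, batch_size, rng):
    """Systematic sampling (assumed textbook form): a random start in [0, k),
    then every k-th index, with k = n // batch_size. Requires batch_size <= n."""
    k = n // batch_size
    start = rng.randrange(k)
    return [start + j * k for j in range(batch_size)]
```

Under these assumptions, a cyclic batch touches one contiguous range of indices and a systematic batch touches indices at a fixed stride, whereas a random batch touches scattered positions. That is one plausible reading of why the proposed schemes could reduce data access time, though the abstract itself does not spell out the mechanism.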