Source: JMLR: Workshop and Conference Proceedings

Adaptive Sampled Softmax with Kernel Based Sampling



Abstract

Softmax is the most commonly used output function for multiclass problems and is widely used in areas such as vision, natural language processing, and recommendation. A softmax model has linear costs in the number of classes which makes it too expensive for many real-world problems. A common approach to speed up training involves sampling only some of the classes at each training step. It is known that this method is biased and that the bias increases the more the sampling distribution deviates from the output distribution. Nevertheless, almost all recent work uses simple sampling distributions that require a large sample size to mitigate the bias. In this work, we propose a new class of kernel based sampling methods and develop an efficient sampling algorithm. Kernel based sampling adapts to the model as it is trained, thus resulting in low bias. It can also be easily applied to many models because it relies only on the model’s last hidden layer. We empirically study the trade-off of bias, sampling distribution and sample size and show that kernel based sampling results in low bias with few samples.
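To make the idea concrete, here is a minimal sketch of a sampled-softmax loss for a single example, using a quadratic-kernel proposal q(y) ∝ (h·w_y)² over the last hidden layer h, together with the standard logit correction logit − log q(y) that keeps the gradient close to unbiased. This is an illustrative sketch only: the function name and the brute-force computation of q are my own, not the paper's efficient divide-and-conquer sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_softmax_loss(h, W, target, m, rng):
    """Sampled-softmax loss for one example (illustrative sketch).

    h      : last hidden layer, shape (d,)
    W      : output embeddings, shape (num_classes, d)
    target : index of the true class
    m      : number of sampled negative classes
    """
    logits = W @ h                     # o_y = h . w_y for every class
    q = logits ** 2                    # quadratic kernel (h . w_y)^2
    q = q / q.sum()                    # normalize into a proposal distribution
    # NOTE: computing q exactly is O(num_classes); the paper's kernel
    # sampler avoids this full pass -- this sketch does not.
    neg = rng.choice(len(q), size=m, replace=False, p=q)
    idx = np.concatenate(([target], neg[neg != target]))
    # logit correction: subtract log q(y) so the sampled softmax
    # approximates the full softmax gradient with low bias
    adj = logits[idx] - np.log(q[idx])
    adj -= adj.max()                   # shift for numerical stability
    p = np.exp(adj) / np.exp(adj).sum()
    return -np.log(p[0])               # target class sits at position 0
```

Because q is recomputed from the current W and h, the proposal adapts as the model trains, which is the property the abstract credits for the low bias at small sample sizes.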
