Conference on Neural Information Processing Systems

Is Deeper Better only when Shallow is Good?


Abstract

Understanding the power of depth in feed-forward neural networks is an ongoing challenge in the field of deep learning theory. While current works account for the importance of depth for the expressive power of neural networks, it remains an open question whether these benefits are exploited during a gradient-based optimization process. In this work, we explore the relation between the expressivity properties of deep networks and the ability to train them efficiently using gradient-based algorithms. We give a depth-separation argument for distributions with fractal structure, showing that they can be expressed efficiently by deep networks, but not by shallow ones. These distributions have a natural coarse-to-fine structure, and we show that the balance between the coarse and fine details has a crucial effect on whether the optimization process is likely to succeed. We prove that when the distribution is concentrated on the fine details, gradient-based algorithms are likely to fail. Using this result, we prove that, at least for some distributions, the success of learning deep networks depends on whether the distribution can be approximated by shallower networks, and we conjecture that this property holds in general.
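To make the coarse-to-fine picture concrete, the following sketch is a hypothetical stand-in for the setting the abstract describes, not the paper's actual construction: it builds a one-dimensional target by iterating a tent map (so each iteration adds finer detail) and fits it with a shallow and a deep ReLU network using a gradient-based optimizer. The function names, network widths, depths, and hyperparameters are all illustrative assumptions.

# Illustrative sketch only (not the paper's construction): a coarse-to-fine
# 1D labeling built from an iterated tent map, fit by a shallow vs. a deep
# ReLU network with a gradient-based optimizer. All choices are arbitrary.
import torch
import torch.nn as nn

torch.manual_seed(0)

def tent(x):
    # One refinement step: the classic tent map on [0, 1].
    return 1.0 - 2.0 * (x - 0.5).abs()

def labels(x, depth):
    # Compose the tent map `depth` times; thresholding the result yields a
    # target whose oscillations (the "fine details") multiply with depth.
    y = x
    for _ in range(depth):
        y = tent(y)
    return (y > 0.5).float()

def mlp(hidden_layers, width):
    # Plain fully connected ReLU network with scalar input and output.
    layers, d_in = [], 1
    for _ in range(hidden_layers):
        layers += [nn.Linear(d_in, width), nn.ReLU()]
        d_in = width
    layers += [nn.Linear(d_in, 1)]
    return nn.Sequential(*layers)

x = torch.rand(4096, 1)
y = labels(x, depth=4)  # larger depth -> target dominated by fine details

for name, net in [("shallow", mlp(1, 256)), ("deep", mlp(6, 32))]:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for step in range(2000):
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
    acc = ((net(x) > 0).float() == y).float().mean().item()
    print(f"{name}: train accuracy {acc:.3f}")

Raising the `depth` argument of `labels` shifts the target's mass toward finer details; under the abstract's thesis, one would expect gradient-based training to degrade in that regime even for the deep network, which is the kind of behavior this toy setup lets one probe.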
