Information and Inference

Robust and resource efficient identification of shallow neural networks by fewest samples



Abstract

We address the structure identification and the uniform approximation of sums of ridge functions f(x) = Σ_{i=1}^m g_i(⟨a_i, x⟩) on R^d, representing a general form of a shallow feed-forward neural network, from a small number of query samples. Higher-order differentiation, as used in our constructive approximations, of sums of ridge functions or of their compositions, as in deeper neural networks, yields a natural connection between neural network weight identification and tensor product decomposition identification. In the case of the shallowest feed-forward neural network, second-order differentiation and tensors of order two (i.e., matrices) suffice, as we prove in this paper. We use two sampling schemes to perform approximate differentiation: active sampling, where the sampling points are universal and actively and randomly designed, and passive sampling, where the sampling points are preselected at random from a distribution with known density. Based on multiple gathered approximate first- and second-order differentials, our general approximation strategy is developed as a sequence of algorithms performing individual sub-tasks. We first perform an active subspace search by approximating the span of the weight vectors a_1, …, a_m. Then we use a straightforward substitution, which reduces the dimensionality of the problem from d to m. The core of the construction is then the stable and efficient approximation of the weights, expressed in terms of the rank-1 matrices a_i ⊗ a_i, realized by formulating their individual identification as a suitable nonlinear program. We prove the successful identification by this program of weight vectors that are close to orthonormal, and we also show how to constructively reduce to this case by a whitening procedure, without loss of generality. We finally discuss the implementation and the performance of the proposed algorithmic pipeline with extensive numerical experiments, which illustrate and confirm the theoretical results.
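The link between second-order differentiation and rank-1 matrices that the abstract describes can be illustrated with a minimal NumPy sketch (the weights, the tanh ridge profiles and all parameter values below are illustrative assumptions, not the paper's construction): for f(x) = Σ_i g_i(⟨a_i, x⟩), the Hessian ∇²f(x) = Σ_i g_i″(⟨a_i, x⟩) a_i ⊗ a_i is a combination of m rank-1 matrices, so a finite-difference Hessian has numerical rank at most m, far below the ambient dimension d.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 10, 3

# Hypothetical network: weights a_i and ridge profiles g_i = tanh are
# illustrative choices, not taken from the paper.
A = rng.standard_normal((m, d)) / np.sqrt(d)   # rows are the weight vectors a_i

def f(x):
    # f(x) = sum_i g_i(<a_i, x>) with g_i = tanh
    return np.sum(np.tanh(A @ x))

def hessian_fd(func, x, h=1e-4):
    """Approximate the Hessian of func at x by central finite differences."""
    n = x.size
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (func(x + h*I[i] + h*I[j]) - func(x + h*I[i] - h*I[j])
                       - func(x - h*I[i] + h*I[j]) + func(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

x0 = rng.standard_normal(d)
H = hessian_fd(f, x0)

# Analytically, H = sum_i g_i''(<a_i, x0>) a_i a_i^T, a combination of the m
# rank-1 matrices a_i (x) a_i, so its numerical rank is at most m = 3.
s = np.linalg.svd(H, compute_uv=False)
print("numerical rank:", int(np.sum(s > 1e-5)))
```

Sampling such Hessians at several points x0 and collecting their spans is what lets the active subspace search recover span{a_1, …, a_m} from few queries.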
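The whitening step mentioned above can also be sketched in a few lines (again a hypothetical illustration with made-up data, not the paper's procedure in full generality): in the exact, noiseless case of m weight vectors in the reduced m-dimensional space, transforming by G^{-1/2}, where G = Σ_i a_i a_i^T, yields exactly orthonormal weights.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4

# Hypothetical weight vectors in the reduced m-dimensional space (after the
# dimension reduction from d to m); illustrative data only.
B = rng.standard_normal((m, m))        # columns are the weight vectors a_i

# Whitening: with G = sum_i a_i a_i^T known, transform each a_i by G^{-1/2}.
G = B @ B.T
w, U = np.linalg.eigh(G)
W = U @ np.diag(w ** -0.5) @ U.T       # symmetric inverse square root of G
Bw = W @ B                             # whitened weights as columns

# In this noiseless square case the whitened weights are exactly orthonormal:
# (W B)(W B)^T = G^{-1/2} G G^{-1/2} = I, and a square matrix with
# orthonormal rows has orthonormal columns as well.
print(np.allclose(Bw @ Bw.T, np.eye(m)))   # True
```

This is why it suffices to prove identification for near-orthonormal weights: the general case reduces to it without loss of generality.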
