AI Paper: A priori generalization error for two-layer ReLU neural network through minimum norm solution

Posted by shanhawk on 2019-12-09 13:26:29
Abstract:

We focus on estimating the \emph{a priori} generalization error of two-layer ReLU neural networks (NNs) trained by mean squared error, which depends only on the initial parameters and the target function, through the following research line. We first estimate the \emph{a priori} generalization error of a finite-width two-layer ReLU NN under the constraint of the minimum norm solution, which is proved by \cite{zhang2019type} to be an equivalent solution of the linearized (with respect to parameters) finite-width two-layer NN. As the width goes to infinity, the linearized NN converges to the NN in the Neural Tangent Kernel (NTK) regime \citep{jacot2018neural}. Thus, we can derive the \emph{a priori} generalization error of the two-layer ReLU NN in the NTK regime. The distance between the NN in the NTK regime and a finite-width NN under gradient training is estimated by \cite{arora2019exact}. Based on the results in \cite{arora2019exact}, our work proves an \emph{a priori} generalization error bound for two-layer ReLU NNs. This estimate uses the intrinsic implicit bias of the minimum norm solution without requiring extra regularity of the loss function. This \emph{a priori} estimate also implies that the NN does not suffer from the curse of dimensionality, and that a small generalization error can be achieved without requiring an exponentially large number of neurons. In addition, the research line proposed in this paper can also be used to study other properties of the finite-width network, such as the posterior generalization error.
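As a concrete illustration of the objects mentioned in the abstract (this is not the paper's code; the toy data and every name below are illustrative assumptions), the sketch below builds a finite-width two-layer ReLU network with NTK scaling, computes its empirical Neural Tangent Kernel at initialization, and forms the minimum-norm kernel interpolant, which is the kernel analogue of the minimum norm solution of the linearized network discussed above.

import numpy as np

# Toy sketch (not from the paper): a finite-width two-layer ReLU network with
# NTK parameterization, its empirical NTK at initialization, and the
# minimum-norm interpolant of the induced kernel.

rng = np.random.default_rng(0)

d, m, n = 5, 2000, 20              # input dimension, width, number of training points

# f(x) = (1/sqrt(m)) * sum_k a_k * relu(w_k . x), with w_k ~ N(0, I), a_k ~ N(0, 1)
W = rng.standard_normal((m, d))
a = rng.standard_normal(m)

def features(X):
    """Gradient of f w.r.t. all parameters for each input (the NTK feature map)."""
    pre = X @ W.T                               # (n, m) pre-activations w_k . x
    act = np.maximum(pre, 0.0)                  # relu(w_k . x)
    ind = (pre > 0).astype(X.dtype)             # relu'(w_k . x)
    grad_a = act / np.sqrt(m)                                        # df/da_k
    grad_W = (a * ind)[:, :, None] * X[:, None, :] / np.sqrt(m)      # df/dw_k
    return np.concatenate([grad_a, grad_W.reshape(len(X), -1)], axis=1)

def ntk(X1, X2):
    """Empirical NTK at initialization: Theta(x, x') = grad f(x) . grad f(x')."""
    return features(X1) @ features(X2).T

# Synthetic regression data with an illustrative target function.
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Minimum-RKHS-norm interpolant of the empirical NTK: prediction
# f(x) = Theta(x, X) Theta(X, X)^{-1} y, the kernel counterpart of the
# minimum norm solution of the linearized network.
K = ntk(X, X)
alpha = np.linalg.solve(K + 1e-10 * np.eye(n), y)   # tiny jitter for numerical stability

X_test = rng.standard_normal((5, d))
pred = ntk(X_test, X) @ alpha
print(pred)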
URL: https://arxiv.org/abs/1912.03011    PDF: https://arxiv.org/pdf/1912.03011