
AI Paper: On the Intrinsic Privacy of Stochastic Gradient Descent

Posted by sa_group on 2019-12-9 14:39:38
Abstract: Private learning algorithms have been proposed that ensure strong differential privacy (DP) guarantees; however, they often come at a cost to utility. Meanwhile, stochastic gradient descent (SGD) contains intrinsic randomness which has not been leveraged for privacy. In this work, we take the first step towards analysing the intrinsic privacy properties of SGD. Our primary contribution is a large-scale empirical analysis of SGD on convex and non-convex objectives. We evaluate the inherent variability due to the stochasticity in SGD on 3 datasets and calculate the $\epsilon$ values due to the intrinsic noise. First, we show that the variability in model parameters due to random sampling almost always exceeds that due to changes in the data. We observe that SGD provides intrinsic $\epsilon$ values of 7.8, 6.9, and 2.8 on the MNIST, Adult, and Forest Covertype datasets respectively. Next, we propose a method to augment the intrinsic noise of SGD to achieve the desired $\epsilon$. Our augmented SGD outputs models that outperform existing approaches with the same privacy guarantee, closing the gap to noiseless utility by between 0.19% and 10.07%. Finally, we show that the existing theoretical bound on the sensitivity of SGD is not tight. By estimating the tightest bound empirically, we achieve near-noiseless performance at $\epsilon = 1$, closing the utility gap to the noiseless model by between 3.13% and 100%. Our experiments provide concrete evidence that changing the seed in SGD is likely to have a far greater impact on the model than excluding any given training example. By accounting for this intrinsic randomness, higher utility can be achieved without sacrificing further privacy. With these results, we hope to inspire the research community to further explore and characterise the randomness in SGD, its impact on privacy, and the parallels with generalisation in machine learning.
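To make the paper's first claim concrete, here is a minimal sketch of the comparison the abstract describes: retraining SGD with a different seed versus retraining with one example removed (the neighbouring-dataset notion from differential privacy). This is not the authors' code; the synthetic dataset, the SGDClassifier settings (scikit-learn >= 1.1 for loss="log_loss"), and the L2 distance between weight vectors are all illustrative assumptions, whereas the paper itself uses MNIST, Adult, and Forest Covertype.

```python
# Minimal sketch: does re-seeding SGD move the learned weights more than
# removing a single training example? Requires numpy and scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def train(X, y, seed):
    # A convex (logistic-loss) objective trained with plain SGD; the seed
    # controls the shuffling, i.e. SGD's intrinsic randomness.
    clf = SGDClassifier(loss="log_loss", max_iter=50, tol=None, random_state=seed)
    clf.fit(X, y)
    return np.concatenate([clf.coef_.ravel(), clf.intercept_])

# Variability from the seed alone: identical data, different shuffling.
seed_dist = np.linalg.norm(train(X, y, 1) - train(X, y, 2))

# Variability from the data: identical seed, one example removed.
data_dist = np.linalg.norm(train(X, y, 1) - train(X[1:], y[1:], 1))

print(f"||dw|| across seeds:             {seed_dist:.4f}")
print(f"||dw|| across neighbouring data: {data_dist:.4f}")
```

If the abstract's finding holds, the first distance should typically dominate the second across repeated runs.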
Abstract page: https://arxiv.org/abs/1912.02919
PDF download: https://arxiv.org/pdf/1912.02919
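The abstract's second contribution, augmenting SGD's intrinsic noise up to a target privacy level, is not reproduced here; for readers who want to experiment, below is a hedged sketch of the standard output-perturbation baseline it builds on: the classical Gaussian mechanism, which calibrates additive noise to an L2-sensitivity bound. The sensitivity value is a placeholder, not a figure from the paper, and the paper's own procedure would account for the noise SGD already provides rather than adding the full amount.

```python
# Hedged sketch of the classical Gaussian mechanism for output
# perturbation (Dwork & Roth calibration, valid for epsilon <= 1).
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    # Noise scale sufficient for (epsilon, delta)-DP given an
    # L2-sensitivity bound on the released weights.
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def privatize(weights, sensitivity, epsilon=1.0, delta=1e-5, rng=None):
    # Add isotropic Gaussian noise to trained weights before release.
    rng = rng or np.random.default_rng()
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return weights + rng.normal(0.0, sigma, size=weights.shape)

# Placeholder usage: sensitivity=0.1 is illustrative, not from the paper.
w_private = privatize(np.array([0.3, -1.2, 0.8]), sensitivity=0.1)
```

A tighter empirical sensitivity estimate, as the paper argues, directly shrinks sigma and hence the utility cost at a fixed epsilon.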