Despite the impressive performance of deep neural networks (DNNs) on numerous vision tasks, they still exhibit yet-to-understand uncouth behaviours. One puzzling behaviour is the reaction of DNNs to various noise attacks, where it has been shown that there exist small adversarial noise that can result in a severe degradation in the performance of DNNs. To rigorously treat this, we derive exact analytic expressions for the first and second moments (mean and variance) of a small piecewise linear (PL) network with a single rectified linear unit (ReLU) layer subject to general Gaussian input. We experimentally show that these expressions are tight under simple linearizations of deeper PL-DNNs, especially popular architectures in the literature (e.g. LeNet and AlexNet). Extensive experiments on image classification show that these expressions can be used to study the behaviour of the output mean of the logits for each class, the inter-class confusion and the pixel-level spatial noise sensitivity of the network. Moreover, we show how these expressions can be used to systematically construct targeted and non-targeted adversarial attacks. Then, we proposed a special estimator DNN, named mixture of linearizations (MoL), and derived the analytic expressions for its output mean and variance, as well. We employed these expressions to train the model to be particularly robust against Gaussian attacks without the need for data augmentation. Upon training this network on a loss that is consolidated with the derived output probabilistic moments, the network is not only robust under very high variance Gaussian attacks but is also as robust as networks that are trained with 20 fold data augmentation.
|Date made available
|KAUST Research Repository