August 2022
Precise statistical analysis of classification accuracies for adversarial training
Adel Javanmard, Mahdi Soltanolkotabi
Ann. Statist. 50(4): 2127-2156 (August 2022). DOI: 10.1214/22-AOS2180


Despite the wide empirical success of modern machine learning algorithms and models in a multitude of applications, they are known to be highly susceptible to seemingly small, indiscernible perturbations to the input data, known as adversarial attacks. A variety of adversarial training procedures have recently been proposed to remedy this issue. Despite the success of such procedures at increasing accuracy on adversarially perturbed inputs (robust accuracy), these techniques often reduce accuracy on natural, unperturbed inputs (standard accuracy). Complicating matters further, the effect and trend of adversarial training procedures on standard and robust accuracy is rather counterintuitive and radically dependent on a variety of factors, including the perceived form of the perturbation during training, the size and quality of the data, model overparameterization, etc. In this paper, we focus on binary classification problems where the data is generated according to a mixture of two Gaussians with general anisotropic covariance matrices, and we derive a precise characterization of the standard and robust accuracy for a class of minimax adversarially trained models. We consider a general norm-based adversarial model, where the adversary can add perturbations of bounded $\ell_p$ norm to each input data point, for an arbitrary $p \ge 1$. Our comprehensive analysis allows us to theoretically explain several intriguing empirical phenomena and provides a precise understanding of the role of different problem parameters on standard and robust accuracies.
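To make the two accuracy notions concrete, the following is a minimal sketch and not the paper's analysis. It assumes a linear classifier sign(<x, theta>), class means +mu and -mu, and a shared covariance Sigma; the abstract does not state the classifier family, so that choice and all function names (`normal_cdf`, `dual_norm`, `accuracies`) are illustrative assumptions. Under these assumptions both accuracies have a closed form for a fixed theta: standard accuracy is Phi(<mu, theta> / sqrt(theta' Sigma theta)), and a worst-case perturbation of l_p norm at most eps lowers the margin by eps times the dual l_q norm of theta, where 1/p + 1/q = 1.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dual_norm(theta, p):
    """l_q norm of theta, where q is the dual exponent of p (1/p + 1/q = 1)."""
    if p < 1:
        raise ValueError("p must be at least 1")
    if p == 1:                      # the dual of l_1 is l_infinity
        return max(abs(t) for t in theta)
    if math.isinf(p):               # the dual of l_infinity is l_1
        return sum(abs(t) for t in theta)
    q = p / (p - 1.0)
    return sum(abs(t) ** q for t in theta) ** (1.0 / q)


def accuracies(theta, mu, sigma, eps, p):
    """Closed-form standard and robust accuracy of sign(<x, theta>).

    Assumed data model (illustrative): y is +1 or -1, x | y ~ N(y * mu, sigma),
    with sigma given as a list of rows. The adversary may add any perturbation
    whose l_p norm is at most eps.
    """
    d = len(theta)
    # Mean of the margin y * <x, theta>.
    signal = sum(t * m for t, m in zip(theta, mu))
    # Variance of the margin: theta' * sigma * theta.
    variance = sum(theta[i] * sigma[i][j] * theta[j]
                   for i in range(d) for j in range(d))
    noise = math.sqrt(variance)
    standard = normal_cdf(signal / noise)
    # The worst-case perturbation shrinks the margin by eps * ||theta||_q.
    robust = normal_cdf((signal - eps * dual_norm(theta, p)) / noise)
    return standard, robust


# Example with an anisotropic (diagonal) covariance and an l_infinity adversary.
sigma = [[2.0, 0.0], [0.0, 0.5]]
sa, ra = accuracies([1.0, 1.0], [1.0, 1.0], sigma, eps=0.3, p=math.inf)
print(sa, ra)
```

Because the robust margin subtracts a nonnegative term, the robust accuracy returned here can never exceed the standard accuracy for the same theta; how adversarial training moves theta, and hence both quantities, is what the paper characterizes.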

Funding Statement

The first author was supported in part by a Sloan Research Fellowship in Mathematics, a Google Faculty Research Award, an Adobe Data Science Research Award and the NSF CAREER Award DMS-1844481.
The second author was supported by the Packard Fellowship in Science and Engineering, Sloan Research Fellowship in Mathematics, an NSF-CAREER under award #1846369, the Air Force Office of Scientific Research Young Investigator Program (AFOSR-YIP) under award #FA9550-18-1-0078, DARPA Learning with Less Labels (LwLL) and FastNICS programs, and NSF-CIF awards #1813877 and #2008443.


Citation

Adel Javanmard, Mahdi Soltanolkotabi. "Precise statistical analysis of classification accuracies for adversarial training." Ann. Statist. 50 (4) 2127-2156, August 2022.


Received: 1 August 2021; Published: August 2022
First available in Project Euclid: 25 August 2022

MathSciNet: MR4474485
zbMATH: 07610765
Digital Object Identifier: 10.1214/22-AOS2180

Primary: 62E20, 62F12
Secondary: 62J12

Keywords: adversarial training, binary classification, precise high-dimensional asymptotics

Rights: Copyright © 2022 Institute of Mathematical Statistics


