The role of regularization in classification of high-dimensional noisy Gaussian mixture

Citation:

F. Mignacco, F. Krzakala, Y. M. Lu, and L. Zdeboravá, “The role of regularization in classification of high-dimensional noisy Gaussian mixture,” in International Conference on Machine Learning (ICML), 2020.

Abstract:

We consider a high-dimensional mixture of two Gaussians in the noisy regime where even an oracle knowing the centers of the clusters misclassifies a small but finite fraction of the points. We provide a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge and logistic regression, in the high-dimensional limit where the number n of samples and their dimension d go to infinity while their ratio is fixed to α=n/d. We discuss surprising effects of the regularization that in some cases allows to reach the Bayes-optimal performances. We also illustrate the interpolation peak at low regularization, and analyze the role of the respective sizes of the two clusters.

arXiv:2002.11544 [stat.ML]

Last updated on 06/17/2020