TY - GEN
T1 - Imbalance Trouble: Revisiting Neural-Collapse Geometry
AU - Thrampoulidis, Christos
AU - Kini, Ganesh R.
AU - Vakilian, Vala
AU - Behnia, Tina
N1 - KAUST Repository Item: Exported on 2023-07-10
Acknowledged KAUST grant number(s): CRG8
Acknowledgements: Supported by an NSERC Undergraduate Student Research Grant, an NSERC Discovery Grant, NSF CCF-2009030, a CRG8-KAUST award, and by UBC Advanced Research Computing services.
This publication acknowledges KAUST support but has no KAUST-affiliated authors.
PY - 2022/1/1
Y1 - 2022/1/1
N2 - Neural Collapse refers to the remarkable structural properties characterizing the geometry of class embeddings and classifier weights, found by deep nets when trained beyond zero training error. However, this characterization only holds for balanced data. Here we thus ask whether it can be made invariant to class imbalances. Towards this end, we adopt the unconstrained-features model (UFM), a recent theoretical model for studying neural collapse, and introduce Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of the neural collapse phenomenon. We prove for the UFM with cross-entropy loss and vanishing regularization that, irrespective of class imbalances, the embeddings and classifiers always interpolate a simplex-encoded label matrix and that their individual geometries are determined by the SVD factors of this same label matrix. We then present extensive experiments on synthetic and real datasets that confirm convergence to the SELI geometry. However, we caution that convergence worsens with increasing imbalances. We theoretically support this finding by showing that unlike the balanced case, when minorities are present, ridge-regularization plays a critical role in tweaking the geometry. This defines new questions and motivates further investigations into the impact of class imbalances on the rates at which first-order methods converge to their preferred solutions.
UR - http://hdl.handle.net/10754/692841
UR - https://proceedings.neurips.cc/paper_files/paper/2022/hash/ae54ce310476218f26dd48c1626d5187-Abstract-Conference.html
UR - http://www.scopus.com/inward/record.url?scp=85148457255&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781713871088
BT - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
PB - Neural Information Processing Systems Foundation
ER -