TY - GEN
T1 - On how to avoid exacerbating spurious correlations when models are overparameterized
AU - Behnia, Tina
AU - Wang, Ke
AU - Thrampoulidis, Christos
N1 - Acknowledged KAUST grant number(s): CRG8
Acknowledgements: This work is supported by an NSERC Discovery Grant, by NSF Grant CCF-2009030, and by a CRG8 award from KAUST.
PY - 2022/6/26
Y1 - 2022/6/26
AB - Overparameterized learning architectures fail to generalize well in the presence of data imbalance, even when combined with traditional techniques for mitigating imbalances. This paper focuses on imbalanced classification datasets in which a small subset of the population, a minority, may contain features that correlate spuriously with the class label. For a parametric family of cross-entropy loss modifications and a representative Gaussian mixture model, we derive non-asymptotic generalization bounds on the worst-group error that shed light on the role of different hyperparameters. Specifically, we prove that, when appropriately tuned, the recently proposed VS-loss learns a model that is fair towards minorities even when spurious features are strong. In contrast, alternative heuristics, such as weighted CE and the LA-loss, can fail dramatically. Compared to previous work, our bounds hold for more general models, are non-asymptotic, and apply even in scenarios of extreme imbalance.
UR - http://hdl.handle.net/10754/679606
UR - https://ieeexplore.ieee.org/document/9834839/
UR - http://www.scopus.com/inward/record.url?scp=85136247113&partnerID=8YFLogxK
U2 - 10.1109/isit50566.2022.9834839
DO - 10.1109/isit50566.2022.9834839
M3 - Conference contribution
SN - 9781665421591
SP - 121
EP - 126
BT - 2022 IEEE International Symposium on Information Theory (ISIT)
PB - IEEE
ER -