Data augmentation is widely applied and has shown its benefits in different machine learning tasks. However, as recently observed, it may have an unfair effect in multi-class classification. While data augmentation generally improves the overall performance (and therefore is beneficial for many classes), it can actually be detrimental for other classes, which can be problematic in some application domains. In this paper, to counteract this phenomenon, we propose CLAM, a CLAss-dependent Multiplicative-weights method. To derive it, we first formulate the training of a classifier as a non-linear optimization problem that aims at simultaneously maximizing the individual class performances and balancing them. By rewriting this optimization problem as an adversarial two-player game, we propose a novel multiplicative weights algorithm, for which we prove the convergence. Interestingly, our formulation also reveals that the class-dependent effects of data augmentation is not due to data augmentation only, but is in fact a general phenomenon. Our empirical results over six datasets demonstrate that the performance of learned classifiers is indeed more fairly distributed over classes, with only limited impact on the average accuracy.
@article{
jiang2025understanding,
title={Understanding and Reducing the Class-Dependent Effects of Data Augmentation with A Two-Player Game Approach},
author={Yunpeng Jiang and Yutong Ban and Paul Weng},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=zNsfgCns7x},
note={}
}