Data Compression for Fast Online Stochastic Optimization
Irina Wang, Marta Fochesato, Bartolomeo Stellato
https://arxiv.org/abs/2504.08097 https://arxiv.org/pdf/2504.08097 https://arxiv.org/html/2504.08097
arXiv:2504.08097v1 Announce Type: new
Abstract: We propose an online data compression approach for efficiently solving distributionally robust optimization (DRO) problems with streaming data while maintaining out-of-sample performance guarantees. Our method dynamically constructs ambiguity sets using online clustering, allowing the clustered configuration to evolve over time for an accurate representation of the underlying distribution. We establish theoretical conditions for clustering algorithms to ensure robustness, and show that the performance gap between our online solution and the nominal DRO solution is controlled by the Wasserstein distance between the true and compressed distributions, which is approximated using empirical measures. We provide a regret analysis, proving that the upper bound on this performance gap converges sublinearly to a fixed clustering-dependent distance, even when nominal DRO has access, in hindsight, to the subsequent realization of the uncertainty. Numerical experiments in mixed-integer portfolio optimization demonstrate significant computational savings, with minimal loss in solution quality.