Randomized Greedy Learning for Non-monotone Stochastic Submodular Maximization Under Full-bandit Feedback

Fares Fourati, Vaneet Aggarwal, Christopher John Quinn, Mohamed-Slim Alouini

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

We investigate the problem of unconstrained combinatorial multi-armed bandits with full-bandit feedback and stochastic rewards for submodular maximization. Previous works investigate the same problem assuming a submodular and monotone reward function. In this work, we study a more general setting in which the reward function is not necessarily monotone and submodularity is assumed only in expectation. We propose the Randomized Greedy Learning (RGL) algorithm and theoretically prove that it achieves a 1/2-regret upper bound of Õ(n T^{2/3}) for horizon T and number of arms n. We also show experimentally that RGL outperforms other full-bandit variants in both submodular and non-submodular settings.
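The abstract does not spell out RGL's mechanics, but the classical template for unconstrained non-monotone submodular maximization is the randomized double-greedy scheme of Buchbinder et al.: scan the ground set once, and for each element randomize between adding it to a growing set X and deleting it from a shrinking set Y, with probability proportional to the (non-negative) marginal gains. The sketch below is a hypothetical illustration of that template under full-bandit feedback, where every set value must be estimated from repeated noisy evaluations; the estimator `noisy_eval`, the noise model, and the sample budget are assumptions for illustration, not the paper's actual RGL procedure.

```python
import random


def noisy_eval(f, S, sigma=0.1, samples=50):
    """Estimate E[f(S)] by averaging repeated noisy full-bandit
    observations (Gaussian noise is an illustrative assumption)."""
    return sum(f(S) + random.gauss(0.0, sigma) for _ in range(samples)) / samples


def randomized_double_greedy(f, n, sigma=0.1, samples=50):
    """One pass of the randomized double-greedy template with sample-mean
    value estimates -- a stand-in sketch, not the paper's exact algorithm."""
    X, Y = set(), set(range(n))  # X grows from empty, Y shrinks from full
    for i in range(n):
        # Estimated marginal gain of adding i to X / removing i from Y.
        a = noisy_eval(f, X | {i}, sigma, samples) - noisy_eval(f, X, sigma, samples)
        b = noisy_eval(f, Y - {i}, sigma, samples) - noisy_eval(f, Y, sigma, samples)
        a, b = max(a, 0.0), max(b, 0.0)
        p = a / (a + b) if a + b > 0 else 0.5  # randomize between the two moves
        if random.random() < p:
            X.add(i)
        else:
            Y.discard(i)
    return X  # after the pass, X == Y
```

For instance, with `f` taken as the cut function of a small graph (submodular and non-monotone), the routine returns a subset of the vertices; in the noiseless, exact-evaluation limit this template recovers the 1/2-approximation guarantee of deterministic/randomized double greedy.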
Original language: English (US)
Title of host publication: 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023
Publisher: ML Research Press
Pages: 7455-7471
Number of pages: 17
State: Published - Jun 4 2023
