Rotting Infinitely Many-armed Bandits beyond the Worst-case Rotting: An Adaptive Approach

Kim, Jung-hun; Vojnovic, Milan; Yun, Se-Young

(Submitted on 22 Apr 2024)

Abstract: We consider the infinitely many-armed bandit problem in rotting environments, where the mean reward of an arm may decrease with each pull and otherwise remains unchanged. We study two scenarios that capture problem-dependent characteristics of the reward decay: the slow-rotting scenario, in which the cumulative amount of rotting is bounded by $V_T$, and the abrupt-rotting scenario, in which the number of rotting instances is bounded by $S_T$. To address the challenge posed by rotting rewards, we introduce an algorithm that uses UCB with an adaptive sliding window, designed to manage the bias-variance trade-off that arises from rotting rewards. The proposed algorithm achieves tight regret bounds in both the slow-rotting and abrupt-rotting scenarios. Finally, we demonstrate the performance of the algorithm on synthetic datasets.
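
The bias-variance trade-off mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given on this page. It is a minimal illustration that assumes rewards with noise scale `sigma`, a known horizon of at least 2, window sizes that double, and a hypothetical discard threshold. All function names and parameters below are illustrative assumptions. The idea: a short window tracks the current (rotted) mean with little bias but high variance, a long window has low variance but is biased upward by older, higher rewards. Because means only decrease, every window's index is (with high probability) an upper bound on the current mean, so taking the smallest index over windows is one plausible way to adapt.

```python
import math


def windowed_ucb(rewards, window, horizon, sigma=1.0):
    """UCB index of one arm computed from its most recent `window` rewards."""
    recent = rewards[-window:]
    mean = sum(recent) / len(recent)
    # Confidence width shrinks as the window grows (variance term).
    bonus = sigma * math.sqrt(2.0 * math.log(horizon) / len(recent))
    return mean + bonus


def adaptive_ucb(rewards, horizon, sigma=1.0):
    """Smallest windowed UCB index over doubling window sizes (assumed scheme)."""
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one observed reward")
    windows = []
    w = 1
    while w < n:
        windows.append(w)
        w *= 2
    windows.append(n)  # always include the full history
    return min(windowed_ucb(rewards, w, horizon, sigma) for w in windows)


def should_discard(rewards, horizon, threshold, sigma=1.0):
    """Hypothetical rule: drop the arm once its adaptive index falls below threshold."""
    return adaptive_ucb(rewards, horizon, sigma) < threshold
```

For example, for an arm whose rewards drop abruptly from 1 to 0 halfway through 16 pulls, `adaptive_ucb` selects a window covering only the post-drop pulls, whereas the full-history index still averages in the stale high rewards. In an infinitely many-armed setting, a rule like `should_discard` would trigger sampling a fresh arm.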

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2404.14202 [cs.LG]
	(or arXiv:2404.14202v1 [cs.LG] for this version)

Submission history

From: Jung-Hun Kim
[v1] Mon, 22 Apr 2024 14:11:54 GMT (1700kb,D)
