multi-armed bandit problem

GPTKB entity

Statements (25)
Predicate Object
gptkbp:instanceOf gptkb:mathematical_concept
gptkbp:instanceOf reinforcement learning problem
gptkbp:application advertising
gptkbp:application clinical trials
gptkbp:application recommendation systems
gptkbp:describes trade-off between exploration and exploitation
gptkbp:field gptkb:machine_learning
gptkbp:field decision theory
gptkbp:field statistics
gptkbp:firstDescribed 1933
gptkbp:hasVariant adversarial bandit
gptkbp:hasVariant contextual bandit
gptkbp:hasVariant stochastic bandit
https://www.w3.org/2000/01/rdf-schema#label multi-armed bandit problem
gptkbp:namedAfter gptkb:casino
gptkbp:notableContributor gptkb:Richard_Bellman
gptkbp:notableContributor gptkb:Herbert_Robbins
gptkbp:relatedConcept Markov chain
gptkbp:relatedConcept regret minimization
gptkbp:relatedConcept exploration-exploitation dilemma
gptkbp:supportsAlgorithm gptkb:UCB
gptkbp:supportsAlgorithm Thompson sampling
gptkbp:supportsAlgorithm epsilon-greedy
gptkbp:bfsParent gptkb:Contextual_Bandits_with_Linear_Payoff_Functions
gptkbp:bfsLayer 7
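
Illustrative example (not part of the GPTKB record)
The statements above list epsilon-greedy as a supported algorithm and describe the problem as a trade-off between exploration and exploitation. A minimal sketch of how that trade-off might look for a stochastic bandit is given below. It assumes Bernoulli rewards, a fixed epsilon, and incremental mean estimates; the function name epsilon_greedy, its parameters, and the example arm probabilities are hypothetical choices made for illustration only.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: assumed success probability of each arm (hypothetical values).
    Returns per-arm pull counts, per-arm mean estimates, and total reward.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_arms = len(true_means)
    counts = [0] * n_arms              # number of pulls per arm
    estimates = [0.0] * n_arms         # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the highest current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = epsilon_greedy([0.1, 0.5, 0.9])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("total reward:", total)
```

With a small epsilon, most pulls should end up on the arm with the highest estimated mean, while the occasional random pull keeps the other estimates from going stale. UCB and Thompson sampling, also listed above, replace the random exploration step with confidence bounds and posterior sampling respectively.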