Statements (21)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:algorithm |
gptkbp:approach | Bayesian |
gptkbp:assumes | linear reward structure |
gptkbp:category | contextual bandit algorithm |
gptkbp:citation | Agrawal, S., & Goyal, N. (2013). Thompson Sampling for Contextual Bandits with Linear Payoffs. ICML 2013. |
gptkbp:estimatedCost | reward distribution |
gptkbp:fullName | Linear Thompson Sampling |
https://www.w3.org/2000/01/rdf-schema#label | LinTS |
gptkbp:input | context vectors |
gptkbp:output | action selection |
gptkbp:proposedBy | Navin Goyal, Shipra Agrawal |
gptkbp:publicationYear | 2013 |
gptkbp:relatedTo | gptkb:LinUCB, Thompson Sampling |
gptkbp:solvedBy | exploration-exploitation tradeoff |
gptkbp:usedIn | gptkb:reinforcement_learning, multi-armed bandit problems |
gptkbp:uses | linear models |
gptkbp:bfsParent | gptkb:Contextual_Bandits_with_Linear_Payoff_Functions |
gptkbp:bfsLayer | 7 |
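
The statements above describe LinTS as a Bayesian contextual bandit algorithm that takes context vectors as input, assumes a linear reward structure, and outputs an action selection. A minimal Python sketch of that loop may help make this concrete. It assumes a single shared parameter vector, a Gaussian posterior, and a fixed exploration scale `v`; the class name `LinTS`, the helper `run_demo`, and the default values are hypothetical and chosen only for illustration, not taken from the cited paper or from any library.

```python
import numpy as np


class LinTS:
    """Illustrative Linear Thompson Sampling sketch (not a reference implementation)."""

    def __init__(self, dim, v=1.0, seed=0):
        self.B = np.eye(dim)      # posterior precision matrix, starts at identity
        self.f = np.zeros(dim)    # running sum of reward-weighted context vectors
        self.v = v                # exploration scale (assumed fixed here)
        self.rng = np.random.default_rng(seed)

    def select(self, contexts):
        """Pick an arm given contexts of shape (n_arms, dim)."""
        cov = np.linalg.inv(self.B)
        mu_hat = cov @ self.f     # posterior mean of the reward parameter
        # Sample a plausible parameter vector from the Gaussian posterior.
        theta = self.rng.multivariate_normal(mu_hat, self.v ** 2 * cov)
        # Act greedily with respect to the sampled parameter.
        return int(np.argmax(contexts @ theta))

    def update(self, context, reward):
        """Fold one observed (context, reward) pair into the posterior."""
        self.B += np.outer(context, context)
        self.f += reward * context


def run_demo(rounds=500, seed=0):
    """Simulate a noise-free linear reward with three fixed arms (toy setup)."""
    arms = np.eye(3)                          # one context vector per arm
    true_theta = np.array([0.1, 0.9, 0.3])    # hypothetical ground truth
    agent = LinTS(dim=3, v=0.5, seed=seed)
    picks = []
    for _ in range(rounds):
        arm = agent.select(arms)
        agent.update(arms[arm], float(arms[arm] @ true_theta))
        picks.append(arm)
    return picks
```

Sampling the parameter vector rather than using its posterior mean is what drives exploration here: arms with few observations keep a wide posterior, so they are still occasionally chosen. In the toy run, the agent would be expected to settle on the arm with the largest true reward as the posterior narrows.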