Contextual Bandits with Linear Payoff Functions

GPTKB entity

Predicate	Object
gptkbp:instanceOf	gptkb:model
gptkbp:application	gptkb:Advertising; personalized recommendation
gptkbp:challenge	partial feedback
gptkbp:contextualInformation	used
gptkbp:example	gptkb:LinTS; gptkb:LinUCB
gptkbp:explorationExploitationTradeoff	present
gptkbp:field	gptkb:reinforcement_learning; online learning
gptkbp:goal	maximize cumulative reward
gptkbp:input	context vector
gptkbp:notablePublication	gptkb:A_Contextual-Bandit_Approach_to_Personalized_News_Article_Recommendation_(Li_et_al.,_2010); gptkb:Thompson_Sampling_for_Contextual_Bandits_with_Linear_Payoff_Functions_(Agrawal_&_Goyal,_2013)
gptkbp:output	action selection
gptkbp:parameter	weight vector estimated by online regression
gptkbp:payoffFunction	linear
gptkbp:proposedBy	2000s
gptkbp:regretBound	sublinear
gptkbp:relatedTo	gptkb:multi-armed_bandit_problem
gptkbp:rewardModel	linear function of context
gptkbp:bfsParent	gptkb:Lihong_Li
gptkbp:bfsLayer	6
http://www.w3.org/2000/01/rdf-schema#label	Contextual Bandits with Linear Payoff Functions
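
Illustrative sketch (not part of the GPTKB record): the properties above describe a learner that receives a context vector, models the reward as a linear function of that context, and selects an action under partial feedback. The Python below is a minimal LinUCB-style sketch under stated assumptions, not the exact algorithm of either listed publication. It keeps one ridge regression per arm (the "disjoint" variant) and uses a Sherman-Morrison update in place of a matrix inversion. The names LinUCBArm, select_arm, and simulate, the ridge and alpha parameters, and the simulated reward weights are all hypothetical choices made for this example.

```python
import math
import random


class LinUCBArm:
    """Per-arm ridge regression state for a LinUCB-style learner (sketch)."""

    def __init__(self, dim, ridge=1.0):
        # A_inv tracks (ridge * I + sum of x x^T)^-1; b tracks sum of reward * x.
        self.A_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _A_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.A_inv]

    def ucb(self, x, alpha):
        # Linear reward estimate plus an exploration bonus scaled by alpha.
        theta = self._A_inv_times(self.b)
        A_inv_x = self._A_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(0.0, sum(a * xi for a, xi in zip(A_inv_x, x))))
        return mean + alpha * width

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A_inv (A_inv is symmetric).
        A_inv_x = self._A_inv_times(x)
        denom = 1.0 + sum(a * xi for a, xi in zip(A_inv_x, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= A_inv_x[i] * A_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def select_arm(arms, x, alpha=1.0):
    """Return the index of the arm with the highest upper confidence bound."""
    scores = [arm.ucb(x, alpha) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)


def simulate(rounds=2000, seed=0):
    """Run the learner on a made-up two-arm problem with linear rewards."""
    rng = random.Random(seed)
    true_theta = [[1.0, 0.0], [0.0, 1.0]]  # assumed weights, one row per arm
    arms = [LinUCBArm(dim=2) for _ in true_theta]
    optimal_picks = 0
    for _ in range(rounds):
        x = [rng.random(), rng.random()]  # context vector for this round
        chosen = select_arm(arms, x)
        expected = [sum(t * xi for t, xi in zip(theta, x)) for theta in true_theta]
        reward = expected[chosen] + rng.gauss(0.0, 0.1)
        # Partial feedback: only the chosen arm's reward is observed.
        arms[chosen].update(x, reward)
        optimal_picks += expected[chosen] == max(expected)
    return arms, optimal_picks / rounds


if __name__ == "__main__":
    _, fraction_optimal = simulate()
    print(f"fraction of rounds with the optimal arm: {fraction_optimal:.2f}")
```

Larger alpha values explore more and smaller values exploit more, which is one way to see the exploration-exploitation tradeoff listed above. A Thompson sampling variant such as LinTS would instead sample a weight vector from a posterior around the ridge estimate rather than adding a confidence bonus.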