Reinforcement Learning from Human Feedback

GPTKB entity

Statements (31)
Predicate                     Object
gptkbp:instanceOf             machine learning technique
gptkbp:abbreviation           gptkb:RLHF
gptkbp:category               gptkb:artificial_intelligence
gptkbp:category               gptkb:machine_learning
gptkbp:category               natural language processing
gptkbp:category               AI safety
gptkbp:firstDescribed         2017
gptkbp:goal                   align AI behavior with human values
https://www.w3.org/2000/01/rdf-schema#label  Reinforcement Learning from Human Feedback
gptkbp:notableContributor     gptkb:Paul_Christiano
gptkbp:notableContributor     gptkb:Jan_Leike
gptkbp:notableContributor     gptkb:Tom_B._Brown
gptkbp:notableFor             gptkb:ChatGPT
gptkbp:notableFor             gptkb:Bard
gptkbp:notableFor             gptkb:Claude
gptkbp:notablePublication     Deep Reinforcement Learning from Human Preferences
gptkbp:processor              collect human feedback
gptkbp:processor              train reward model
gptkbp:processor              fine-tune policy
gptkbp:relatedTo              gptkb:reinforcement_learning
gptkbp:relatedTo              preference modeling
gptkbp:relatedTo              reward modeling
gptkbp:relatedTo              human-in-the-loop
gptkbp:usedBy                 gptkb:DeepMind
gptkbp:usedBy                 gptkb:OpenAI
gptkbp:usedBy                 gptkb:Anthropic
gptkbp:usedIn                 chatbots
gptkbp:usedIn                 AI alignment
gptkbp:usedIn                 large language models
gptkbp:bfsParent              gptkb:Large_Language_Models
gptkbp:bfsLayer               5
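
The three gptkbp:processor statements above describe the standard RLHF pipeline: collect human feedback, train a reward model on it, then fine-tune the policy against that reward. Below is a minimal, self-contained Python sketch of that pipeline under toy assumptions: preferences are simulated from a hidden utility rather than gathered from people, the reward model is linear and fit with a Bradley-Terry logistic loss, and the policy is a softmax over a fixed candidate set updated with REINFORCE. Every name in the sketch is illustrative and not drawn from any real RLHF codebase.

    # Toy RLHF pipeline: collect feedback -> train reward model -> fine-tune policy.
    # All names here are illustrative, not part of any real RLHF library.
    import numpy as np

    rng = np.random.default_rng(0)

    # --- Step 1: collect human feedback (simulated) ------------------------
    # Each candidate "response" is a feature vector; a hidden utility plays
    # the role of the human annotator choosing between two responses.
    n_candidates, dim = 8, 4
    features = rng.normal(size=(n_candidates, dim))
    true_utility = features @ np.array([1.0, -0.5, 0.25, 2.0])

    def collect_preferences(n_pairs=500):
        """Return (winner, loser) index pairs ranked by the hidden utility."""
        pairs = []
        for _ in range(n_pairs):
            i, j = rng.choice(n_candidates, size=2, replace=False)
            pairs.append((i, j) if true_utility[i] > true_utility[j] else (j, i))
        return pairs

    prefs = collect_preferences()

    # --- Step 2: train a reward model on the preferences -------------------
    # Linear reward r(x) = w @ x, fit by gradient descent on the
    # Bradley-Terry loss: -log sigmoid(r(winner) - r(loser)).
    w = np.zeros(dim)
    lr = 0.05
    for _ in range(200):
        grad = np.zeros(dim)
        for win, lose in prefs:
            diff = features[win] - features[lose]
            p = 1.0 / (1.0 + np.exp(-(w @ diff)))  # P(winner preferred)
            grad += (p - 1.0) * diff               # gradient of -log p
        w -= lr * grad / len(prefs)

    # --- Step 3: fine-tune the policy against the learned reward -----------
    # Softmax policy over the candidate set, updated with REINFORCE using
    # the learned reward. (A real RLHF setup would add a KL penalty toward
    # a reference policy; omitted here for brevity.)
    theta = np.zeros(n_candidates)
    for _ in range(500):
        probs = np.exp(theta - theta.max())
        probs /= probs.sum()
        a = rng.choice(n_candidates, p=probs)
        reward = features[a] @ w
        grad_logp = -probs                 # gradient of log softmax ...
        grad_logp[a] += 1.0                # ... at the sampled action
        theta += 0.1 * reward * grad_logp

    print("policy's favourite candidate:", int(np.argmax(theta)))
    print("true-utility best candidate: ", int(np.argmax(true_utility)))

In practice, as in the 2017 paper listed under gptkbp:notablePublication and the systems listed under gptkbp:notableFor, the reward model is a neural network and the policy update is typically PPO with a KL penalty toward a reference model, which keeps the fine-tuned policy from drifting too far from its pretrained behavior.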