Reinforcement Learning from Human Feedback

GPTKB entity

Statements (31)
Predicate                     Object
gptkbp:instanceOf             machine learning technique
gptkbp:abbreviation           gptkb:RLHF
gptkbp:category               gptkb:artificial_intelligence
gptkbp:category               gptkb:machine_learning
gptkbp:category               natural language processing
gptkbp:category               AI safety
gptkbp:firstDescribed         2017
gptkbp:goal                   align AI behavior with human values
https://www.w3.org/2000/01/rdf-schema#label  Reinforcement Learning from Human Feedback
gptkbp:notableContributor     gptkb:Paul_Christiano
gptkbp:notableContributor     gptkb:Jan_Leike
gptkbp:notableContributor     gptkb:Tom_B._Brown
gptkbp:notableFor             gptkb:ChatGPT
gptkbp:notableFor             gptkb:Bard
gptkbp:notableFor             gptkb:Claude
gptkbp:notablePublication     Deep Reinforcement Learning from Human Preferences
gptkbp:processor              collect human feedback
gptkbp:processor              train reward model
gptkbp:processor              fine-tune policy
gptkbp:relatedTo              gptkb:reinforcement_learning
gptkbp:relatedTo              preference modeling
gptkbp:relatedTo              reward modeling
gptkbp:relatedTo              human-in-the-loop
gptkbp:usedBy                 gptkb:DeepMind
gptkbp:usedBy                 gptkb:OpenAI
gptkbp:usedBy                 gptkb:Anthropic
gptkbp:usedIn                 chatbots
gptkbp:usedIn                 AI alignment
gptkbp:usedIn                 large language models
gptkbp:bfsParent              gptkb:Large_Language_Models
gptkbp:bfsLayer               5
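
The three gptkbp:processor statements above describe the standard RLHF pipeline: collect human feedback, train a reward model on it, then fine-tune the policy against that reward. Below is a minimal, self-contained Python sketch of that pipeline under toy assumptions: preferences are simulated from a hidden utility rather than gathered from people, the reward model is linear and fit with a Bradley-Terry logistic loss, and the policy is a softmax over a fixed candidate set updated with REINFORCE. Every name in the sketch is illustrative and not drawn from any real RLHF codebase.

    # Toy RLHF pipeline: collect feedback -> train reward model -> fine-tune policy.
    # All names here are illustrative, not part of any real RLHF library.
    import numpy as np

    rng = np.random.default_rng(0)

    # --- Step 1: collect human feedback (simulated) ------------------------
    # Each candidate "response" is a feature vector; a hidden utility plays
    # the role of the human annotator choosing between two responses.
    n_candidates, dim = 8, 4
    features = rng.normal(size=(n_candidates, dim))
    true_utility = features @ np.array([1.0, -0.5, 0.25, 2.0])

    def collect_preferences(n_pairs=500):
        """Return (winner, loser) index pairs ranked by the hidden utility."""
        pairs = []
        for _ in range(n_pairs):
            i, j = rng.choice(n_candidates, size=2, replace=False)
            pairs.append((i, j) if true_utility[i] > true_utility[j] else (j, i))
        return pairs

    prefs = collect_preferences()

    # --- Step 2: train a reward model on the preferences -------------------
    # Linear reward r(x) = w @ x, fit by gradient descent on the
    # Bradley-Terry loss: -log sigmoid(r(winner) - r(loser)).
    w = np.zeros(dim)
    lr = 0.05
    for _ in range(200):
        grad = np.zeros(dim)
        for win, lose in prefs:
            diff = features[win] - features[lose]
            p = 1.0 / (1.0 + np.exp(-(w @ diff)))  # P(winner preferred)
            grad += (p - 1.0) * diff               # gradient of -log p
        w -= lr * grad / len(prefs)

    # --- Step 3: fine-tune the policy against the learned reward -----------
    # Softmax policy over the candidate set, updated with REINFORCE using
    # the learned reward. (A real RLHF setup would add a KL penalty toward
    # a reference policy; omitted here for brevity.)
    theta = np.zeros(n_candidates)
    for _ in range(500):
        probs = np.exp(theta - theta.max())
        probs /= probs.sum()
        a = rng.choice(n_candidates, p=probs)
        reward = features[a] @ w
        grad_logp = -probs                 # gradient of log softmax ...
        grad_logp[a] += 1.0                # ... at the sampled action
        theta += 0.1 * reward * grad_logp

    print("policy's favourite candidate:", int(np.argmax(theta)))
    print("true-utility best candidate: ", int(np.argmax(true_utility)))

In practice, as in the 2017 paper listed under gptkbp:notablePublication and the systems listed under gptkbp:notableFor, the reward model is a neural network and the policy update is typically PPO with a KL penalty toward a reference model, which keeps the fine-tuned policy from drifting too far from its pretrained behavior.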