Reinforcement Learning from Human Feedback
GPTKB entity
Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | machine learning technique |
| gptkbp:abbreviation | gptkb:RLHF |
| gptkbp:category | gptkb:artificial_intelligence, gptkb:machine_learning, natural language processing, AI safety |
| gptkbp:firstDescribed | 2017 |
| gptkbp:goal | align AI behavior with human values |
| https://www.w3.org/2000/01/rdf-schema#label | Reinforcement Learning from Human Feedback |
| gptkbp:notableContributor | gptkb:Paul_Christiano, gptkb:Jan_Leike, gptkb:Tom_B._Brown |
| gptkbp:notableFor | gptkb:ChatGPT, gptkb:Bard, gptkb:Claude |
| gptkbp:notablePublication | Deep Reinforcement Learning from Human Preferences |
| gptkbp:processor | collect human feedback, train reward model, fine-tune policy (see sketch below) |
| gptkbp:relatedTo | gptkb:reinforcement_learning, preference modeling, reward modeling, human-in-the-loop |
| gptkbp:usedBy | gptkb:DeepMind, gptkb:OpenAI, gptkb:Anthropic |
| gptkbp:usedIn | chatbots, AI alignment, large language models |
| gptkbp:bfsParent | gptkb:Large_Language_Models |
| gptkbp:bfsLayer | 5 |