Reinforcement Learning from Human Feedback
GPTKB entity
Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:machine_learning_technique |
| gptkbp:abbreviation | gptkb:RLHF |
| gptkbp:category | gptkb:artificial_intelligence, gptkb:machine_learning, natural language processing, AI safety |
| gptkbp:firstDescribed | 2017 |
| gptkbp:goal | align AI behavior with human values |
| gptkbp:notableContributor | gptkb:Paul_Christiano, gptkb:Jan_Leike, gptkb:Tom_B._Brown |
| gptkbp:notableFor | gptkb:ChatGPT, gptkb:Bard, gptkb:Claude |
| gptkbp:notablePublication | Deep Reinforcement Learning from Human Preferences |
| gptkbp:processor | collect human feedback, train reward model, fine-tune policy (see the sketch below) |
| gptkbp:relatedTo | gptkb:reinforcement_learning, preference modeling, reward modeling, human-in-the-loop |
| gptkbp:usedBy | gptkb:DeepMind, gptkb:OpenAI, gptkb:Anthropic |
| gptkbp:usedIn | chatbots, AI alignment, large language models |
| gptkbp:bfsParent | gptkb:Large_Language_Models |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Reinforcement Learning from Human Feedback |
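The `gptkbp:processor` row names the three RLHF stages (collect human feedback, train a reward model, fine-tune the policy). Below is a minimal sketch of those stages under toy assumptions not stated in the entry: synthetic feature vectors stand in for model responses, the reward model is linear and trained with a Bradley-Terry pairwise loss, and the "policy" is a distribution over a fixed candidate set tilted by the closed-form solution of the KL-regularized objective rather than PPO.

```python
# Toy sketch of the three RLHF stages: collect feedback -> reward model -> policy tuning.
# Illustrative only; real RLHF uses neural reward models and RL (e.g. PPO) on an LLM policy.
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# 1) Collect human feedback: pairs of (preferred, rejected) response features.
preferred = rng.normal(0.5, 1.0, size=(100, dim))
rejected = rng.normal(-0.5, 1.0, size=(100, dim))

# 2) Train a linear reward model r(x) = w . x with the Bradley-Terry loss:
#    minimize -log sigmoid(r(preferred) - r(rejected)).
w = np.zeros(dim)
lr = 0.1
for _ in range(200):
    margin = (preferred - rejected) @ w                  # r(pref) - r(rej)
    sig = np.exp(-np.logaddexp(0.0, margin))             # numerically stable 1/(1+exp(margin))
    grad = -(sig[:, None] * (preferred - rejected)).mean(axis=0)
    w -= lr * grad

# 3) "Fine-tune" the policy: the KL-regularized objective
#    max_pi E_pi[r(y)] - beta * KL(pi || pi_ref)
#    has the closed form pi(y) proportional to pi_ref(y) * exp(r(y) / beta).
candidates = rng.normal(0.0, 1.0, size=(5, dim))          # fixed toy candidate responses
ref_probs = np.full(5, 0.2)                               # uniform reference policy
beta = 0.5
rewards = candidates @ w
tuned = ref_probs * np.exp((rewards - rewards.max()) / beta)  # shift for stability
tuned /= tuned.sum()

print("learned reward weights:", np.round(w, 2))
print("reference policy:      ", ref_probs)
print("tuned policy:          ", np.round(tuned, 3))
```

The sketch shows only the shape of the pipeline: preference pairs define the reward signal, the reward model scores responses, and the tuned policy shifts probability toward high-reward candidates while staying close to the reference policy.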