Constitutional AI: Harmlessness from AI Feedback
GPTKB entity
Statements (25)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:academic_journal |
| gptkbp:author | gptkb:Scott_Johnston, gptkb:Andy_Jones, gptkb:Yuntao_Bai, gptkb:Dario_Amodei, gptkb:Jared_Kaplan, gptkb:Nelson_Elhage, gptkb:Tom_Henighan, gptkb:Liane_Lovitt, gptkb:Nicholas_Schiefer, gptkb:Saurav_Kadavath, gptkb:Sheer_El-Showk, gptkb:Tom_Conerly |
| gptkbp:citation | 100+ |
| gptkbp:describes | training language models to be harmless using AI feedback and a set of principles (a constitution) |
| gptkbp:publicationDate | 2022 |
| gptkbp:publishedBy | gptkb:Anthropic |
| gptkbp:topic | gptkb:Constitutional_AI, harmlessness, AI alignment, AI feedback |
| gptkbp:url | https://arxiv.org/abs/2212.08073 |
| gptkbp:bfsParent | gptkb:Constitutional_AI |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Constitutional AI: Harmlessness from AI Feedback |
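
The predicate/object rows above read naturally as subject-predicate-object triples, with one triple per object when a predicate has several values. The following is a minimal sketch of that expansion in plain Python, using only a subset of the rows. The `SUBJECT` key and the helper names `to_triples` and `is_entity` are assumptions made for illustration; the table does not show the entity's actual GPTKB identifier, and this is not GPTKB's own tooling.

```python
# Placeholder subject key (assumption): the table shows only the entity's
# label, not its GPTKB identifier, so the label stands in for it here.
SUBJECT = "Constitutional AI: Harmlessness from AI Feedback"

# A subset of the rows above; each predicate maps to a list of objects.
ROWS = {
    "gptkbp:publicationDate": ["2022"],
    "gptkbp:publishedBy": ["gptkb:Anthropic"],
    "gptkbp:topic": [
        "gptkb:Constitutional_AI",
        "harmlessness",
        "AI alignment",
        "AI feedback",
    ],
    "gptkbp:url": ["https://arxiv.org/abs/2212.08073"],
    "gptkbp:bfsParent": ["gptkb:Constitutional_AI"],
}


def to_triples(subject, rows):
    """Expand each (predicate, [objects]) row into one triple per object."""
    return [(subject, pred, obj) for pred, objs in rows.items() for obj in objs]


def is_entity(obj):
    """Treat objects carrying the gptkb: prefix as links to other entities."""
    return obj.startswith("gptkb:")


triples = to_triples(SUBJECT, ROWS)
# Distinct entity-valued objects, as opposed to plain literals such as "2022".
entity_links = sorted({obj for _, _, obj in triples if is_entity(obj)})

if __name__ == "__main__":
    for triple in triples:
        print(triple)
    print(entity_links)
```

Keeping multi-valued predicates as lists makes the statement count equal to the number of triples, which is how a single row such as `gptkbp:topic` can contribute several statements.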