Constitutional AI: Harmlessness from AI Feedback
GPTKB entity
Statements (33)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:academic_journal |
gptkbp:author | gptkb:Scott_Johnston, gptkb:Andy_Jones, gptkb:Jan_Leike, gptkb:John_Schulman, gptkb:Yuntao_Bai, gptkb:Dario_Amodei, gptkb:Jakub_Pachocki, gptkb:Jared_Kaplan, gptkb:Nelson_Elhage, gptkb:Rewon_Child, gptkb:Tom_Henighan, gptkb:Jeffrey_Wu, gptkb:Andy_H._Zhou, gptkb:Heidy_Khlaaf, gptkb:Kamyar_Ghasemipour, gptkb:Liane_Lovitt, gptkb:Nicholas_Schiefer, gptkb:Saurav_Kadavath, gptkb:Sheer_El-Showk, gptkb:Tom_Conerly |
gptkbp:citation | 100+ |
gptkbp:describes | training language models to be harmless using AI feedback and a set of principles (a constitution) |
https://www.w3.org/2000/01/rdf-schema#label | Constitutional AI: Harmlessness from AI Feedback |
gptkbp:publicationDate | 2022 |
gptkbp:publishedBy | gptkb:Anthropic |
gptkbp:topic | gptkb:Constitutional_AI, harmlessness, AI alignment, AI feedback |
gptkbp:url | https://arxiv.org/abs/2212.08073 |
gptkbp:bfsParent | gptkb:Constitutional_AI |
gptkbp:bfsLayer | 6 |