iterated distillation and amplification
GPTKB entity
Statements (22)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:AI_alignment_method, gptkb:machine_learning_technique |
| gptkbp:abbreviation | gptkb:IDA |
| gptkbp:amplification | uses human overseers to answer complex questions |
| gptkbp:category | AI governance, AI interpretability |
| gptkbp:describedBy | gptkb:AI_Alignment_Forum, gptkb:LessWrong |
| gptkbp:distillation | trains a model to imitate the amplified overseer |
| gptkbp:goal | align powerful AI with human values |
| gptkbp:iteration | repeats amplification and distillation to improve performance |
| gptkbp:processor | alternates amplification and distillation steps |
| gptkbp:proposedBy | gptkb:Paul_Christiano |
| gptkbp:relatedTo | AI safety, distillation, amplification, factored cognition |
| gptkbp:usedFor | scalable oversight, training aligned AI systems |
| gptkbp:bfsParent | gptkb:AI_Alignment |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | iterated distillation and amplification |
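
The amplification, distillation, and iteration statements above can be illustrated with a toy loop. The following is a minimal sketch under strong assumptions, not an implementation from the IDA literature: the "overseer" is a hard-coded function that can only answer trivial questions and split harder ones in half, the "model" is a plain lookup table, and "training" is memorization. All names (`overseer_base`, `amplify`, `distill`, `ida`) are hypothetical and chosen for illustration only.

```python
from typing import Dict, Iterable, Optional, Tuple

Question = Tuple[int, ...]      # toy question: "what is the sum of these numbers?"
Model = Dict[Question, int]     # toy "model": a lookup table of memorized answers


def overseer_base(q: Question) -> Optional[int]:
    """The unaided overseer only answers trivial questions (length <= 1)."""
    if len(q) == 0:
        return 0
    if len(q) == 1:
        return q[0]
    return None


def amplify(model: Model, q: Question) -> Optional[int]:
    """Amplification: the overseer splits q, consults the model on each half,
    and combines the sub-answers. Returns None if a sub-answer is unavailable."""
    direct = overseer_base(q)
    if direct is not None:
        return direct
    mid = len(q) // 2
    answers = []
    for part in (q[:mid], q[mid:]):
        a = model.get(part)
        if a is None:
            a = overseer_base(part)
        if a is None:
            return None
        answers.append(a)
    return answers[0] + answers[1]


def distill(model: Model, questions: Iterable[Question]) -> Model:
    """Distillation: build a new model that imitates the amplified overseer.
    Assumption: 'training' here is just memorizing the amplified answers."""
    new_model = dict(model)
    for q in questions:
        a = amplify(model, q)
        if a is not None:
            new_model[q] = a
    return new_model


def ida(questions: Iterable[Question], rounds: int) -> Model:
    """Iteration: alternate amplification and distillation for a fixed number of rounds."""
    questions = list(questions)
    model: Model = {}
    for _ in range(rounds):
        model = distill(model, questions)
    return model


data = (3, 1, 4, 1, 5, 9, 2, 6)
# Training questions: every contiguous, non-empty slice of `data`.
questions = [data[i:j] for i in range(len(data)) for j in range(i + 1, len(data) + 1)]
model = ida(questions, rounds=3)
print(model[data])  # 31
```

In this toy setting each round roughly doubles the length of question the distilled model can answer: after two rounds the full eight-element question is still out of reach, and after three it is answered. A real system would replace the lookup table with a learned model and the hard-coded overseer with a human, which this sketch does not attempt to capture.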