iterated distillation and amplification

GPTKB entity

Statements (22)
Predicate Object
gptkbp:instanceOf gptkb:AI_alignment_method
gptkb:machine_learning_technique
gptkbp:abbreviation gptkb:IDA
gptkbp:amplification uses a human overseer, assisted by copies of the current model, to answer complex questions
gptkbp:category AI governance
AI interpretability
gptkbp:describedBy gptkb:AI_Alignment_Forum
gptkb:LessWrong
gptkbp:distillation trains a model to imitate the amplified overseer
gptkbp:goal align powerful AI with human values
gptkbp:iteration repeats amplification and distillation to improve performance
gptkbp:processor alternates amplification and distillation steps
gptkbp:proposedBy gptkb:Paul_Christiano
gptkbp:relatedTo AI safety
distillation
amplification
factored cognition
gptkbp:usedFor scalable oversight
training aligned AI systems
gptkbp:bfsParent gptkb:AI_Alignment
gptkbp:bfsLayer 8
https://www.w3.org/2000/01/rdf-schema#label iterated distillation and amplification
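
The amplification, distillation, and iteration statements above can be illustrated with a toy sketch. It is not Paul Christiano's implementation, and every name in it (`amplify`, `distill`, `ida`, `untrained_model`) is hypothetical. The assumptions are: a question is a tuple of integers whose sum is wanted; the overseer can only answer trivial questions directly and otherwise splits the question in half and adds the current model's answers to the two halves; and distillation is stood in for by simple memorization of the amplified overseer's answers rather than real supervised training.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Question = Tuple[int, ...]
# A model returns an answer, or None when it does not know.
Model = Callable[[Question], Optional[int]]


def untrained_model(question: Question) -> Optional[int]:
    """Starting point: a model that cannot answer anything."""
    return None


def amplify(model: Model) -> Model:
    """Amplification: an overseer answers by delegating sub-questions to the model."""

    def amplified(question: Question) -> Optional[int]:
        # Trivial questions are within the overseer's unaided ability.
        if len(question) == 0:
            return 0
        if len(question) == 1:
            return question[0]
        # Otherwise decompose, consult the current model, and combine.
        mid = len(question) // 2
        left = model(question[:mid])
        right = model(question[mid:])
        if left is None or right is None:
            return None  # the model is not yet capable enough
        return left + right

    return amplified


def distill(amplified: Model, questions: Iterable[Question]) -> Model:
    """Distillation: fit a fast model that imitates the amplified overseer.

    Memorization is an assumption made to keep the sketch runnable.
    """
    table: Dict[Question, int] = {}
    for q in questions:
        answer = amplified(q)
        if answer is not None:
            table[q] = answer
    return table.get


def ida(questions: Iterable[Question], rounds: int) -> Model:
    """Iteration: alternate amplification and distillation for a number of rounds."""
    questions = list(questions)
    model: Model = untrained_model
    for _ in range(rounds):
        model = distill(amplify(model), questions)
    return model


# Training distribution: every contiguous slice of one example question.
base: Question = (3, 1, 4, 1, 5, 9, 2, 6)
questions = [base[i:j] for i in range(len(base)) for j in range(i + 1, len(base) + 1)]

if __name__ == "__main__":
    for rounds in range(1, 5):
        print(rounds, ida(questions, rounds)(base))
```

In this toy setting each round roughly doubles the length of question the distilled model can answer, so the full eight-number question only becomes answerable after the fourth round. That mirrors the intended dynamic of each distilled model making the next amplified overseer more capable, but it says nothing about how well the scheme works with learned models.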