iterated distillation and amplification

GPTKB entity

Statements (22)
Predicate Object
gptkbp:instanceOf machine learning technique
AI alignment method
gptkbp:abbreviation gptkb:IDA
gptkbp:amplification pairs a human overseer with copies of the current model to answer questions too complex for either alone
gptkbp:category AI governance
AI interpretability
gptkbp:describedBy gptkb:AI_Alignment_Forum
gptkb:LessWrong
gptkbp:distillation trains a model to imitate the amplified overseer
gptkbp:goal align powerful AI with human values
https://www.w3.org/2000/01/rdf-schema#label iterated distillation and amplification
gptkbp:iteration repeats amplification and distillation, with each distilled model serving as the assistant in the next round of amplification
gptkbp:processor alternates amplification and distillation steps
gptkbp:proposedBy gptkb:Paul_Christiano
gptkbp:relatedTo AI safety
distillation
amplification
factored cognition
gptkbp:usedFor scalable oversight
training aligned AI systems
gptkbp:bfsParent gptkb:AI_Alignment
gptkbp:bfsLayer 7
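
Illustrative sketch (not part of the GPTKB statements)

The amplification, distillation, and iteration statements above can be sketched as a toy loop. This is a minimal illustration under strong assumptions, not Paul Christiano's implementation: the "model" is a plain lookup table, the "human overseer" is a function that can only split a question in two and add two sub-answers, and the task (summing a tuple of integers) is chosen only because it decomposes cleanly. All names (`amplify`, `distill`, `ida`, `contiguous_slices`) are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Assumption: a "question" is a tuple of ints, meaning "what is their sum?"
Question = Tuple[int, ...]
# Assumption: the toy "model" is a lookup table of answers it has learned.
Model = Dict[Question, int]


def model_answer(model: Model, q: Question) -> Optional[int]:
    """The distilled model can only answer questions it was trained on."""
    return model.get(q)


def amplify(model: Model, q: Question) -> Optional[int]:
    """Amplification: the overseer splits the question, delegates both
    halves to copies of the current model, and combines the sub-answers."""
    if len(q) == 1:
        return q[0]  # small enough for the overseer to answer unaided
    mid = len(q) // 2
    left = model_answer(model, q[:mid])
    right = model_answer(model, q[mid:])
    if left is None or right is None:
        return None  # the current model is not yet capable enough to help
    return left + right  # the overseer only ever adds two numbers


def distill(model: Model, questions: Iterable[Question]) -> Model:
    """Distillation: train a new model to imitate the amplified overseer.
    Here "training" is just memorizing the amplified system's answers."""
    new_model = dict(model)
    for q in questions:
        answer = amplify(model, q)
        if answer is not None:
            new_model[q] = answer
    return new_model


def ida(questions: Iterable[Question], rounds: int) -> Model:
    """Iteration: alternate amplification and distillation for `rounds` steps."""
    questions = list(questions)
    model: Model = {}
    for _ in range(rounds):
        model = distill(model, questions)
    return model


def contiguous_slices(data: Question) -> List[Question]:
    """Training distribution: every contiguous, non-empty slice of `data`."""
    n = len(data)
    return [data[i:j] for i in range(n) for j in range(i + 1, n + 1)]
```

With `data = (3, 1, 4, 1, 5, 9, 2, 6)` and `questions = contiguous_slices(data)`, the longest question the model can answer roughly doubles each round: after three rounds `ida(questions, 3)` still has no entry for `data`, while `ida(questions, 4)[data]` gives the full sum. The overseer never adds more than two numbers at a time; the growth in capability comes entirely from delegating to the previous round's distilled model.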