Induction Heads in Transformers
GPTKB entity
Statements (25)
Predicate | Object |
---|---|
gptkbp:instanceOf | Neural Network Mechanism |
gptkbp:activationPattern | Diagonal Pattern in Attention Matrix |
gptkbp:analyzes | Mechanistic Interpretability Researchers |
gptkbp:citation | https://arxiv.org/abs/2211.00593, https://transformer-circuits.pub/2022/induction-heads/index.html |
gptkbp:describedBy | A Mechanistic Interpretability Analysis of Induction Heads in GPT-2 |
gptkbp:discoveredBy | gptkb:Anthropic |
gptkbp:enables | gptkb:Few-Shot_Learning, Copying Subsequent Tokens, Pattern Completion |
gptkbp:firstDescribed | 2022 |
gptkbp:foundIn | Transformer Models |
gptkbp:function | Copying Patterns, Enabling In-Context Learning |
https://www.w3.org/2000/01/rdf-schema#label | Induction Heads in Transformers |
gptkbp:importantFor | Generalization in Language Models |
gptkbp:mechanismOfAction | Attend to Previous Occurrences of Token Sequences |
gptkbp:relatedTo | gptkb:Attention_Heads, gptkb:In-Context_Learning |
gptkbp:studiedIn | gptkb:GPT-2, gptkb:GPT-3, gptkb:LLaMA, gptkb:PaLM |
gptkbp:bfsParent | gptkb:Transformer_Circuits |
gptkbp:bfsLayer | 8 |
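The `gptkbp:mechanismOfAction` statement ("Attend to Previous Occurrences of Token Sequences") can be sketched as code: an idealized induction head at position *i* attends to the position immediately after the most recent earlier occurrence of the current token, and copies that token as its prediction. The following minimal Python sketch (function name and the single-token simplification are illustrative, not from GPTKB) computes those attention targets:

```python
def induction_targets(tokens):
    """For each position i, return the index j an idealized induction
    head would attend to: the position immediately after the most
    recent earlier occurrence of tokens[i].
    Positions whose token has not appeared before map to None."""
    last_seen = {}   # token -> index of its most recent occurrence
    targets = []
    for i, tok in enumerate(tokens):
        targets.append(last_seen[tok] + 1 if tok in last_seen else None)
        last_seen[tok] = i
    return targets

# On a repeated sequence, "copy what followed last time" predicts
# the true next token.
seq = ["A", "B", "C", "A", "B", "C"]
print(induction_targets(seq))  # [None, None, None, 1, 2, 3]
```

On the repeated half of the sequence, each target sits a fixed offset (the repetition period) behind the query position, which is what produces the off-diagonal stripe described by `gptkbp:activationPattern`.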