Induction Heads in Transformers
GPTKB entity
Statements (25)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:Neural_Network_Mechanism |
| gptkbp:activationPattern | Diagonal Pattern in Attention Matrix |
| gptkbp:analyzes | Mechanistic Interpretability Researchers |
| gptkbp:citation | https://arxiv.org/abs/2211.00593, https://transformer-circuits.pub/2022/induction-heads/index.html |
| gptkbp:describedBy | A Mechanistic Interpretability Analysis of Induction Heads in GPT-2 |
| gptkbp:discoveredBy | gptkb:Anthropic |
| gptkbp:enables | gptkb:Few-Shot_Learning, Copying Subsequent Tokens, Pattern Completion |
| gptkbp:firstDescribed | 2022 |
| gptkbp:foundIn | Transformer Models |
| gptkbp:function | Copying Patterns, Enabling In-Context Learning |
| gptkbp:importantFor | Generalization in Language Models |
| gptkbp:mechanismOfAction | Attend to Previous Occurrences of Token Sequences |
| gptkbp:relatedTo | gptkb:Attention_Heads, gptkb:In-Context_Learning |
| gptkbp:studiedIn | gptkb:GPT-2, gptkb:GPT-3, gptkb:LLaMA, gptkb:PaLM |
| gptkbp:bfsParent | gptkb:Transformer_Circuits |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | Induction Heads in Transformers |
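The `mechanismOfAction` and `activationPattern` statements above can be illustrated with a small sketch: an idealized induction head attends from the current position to the token that *followed* the previous occurrence of the current token (prefix matching), then copies that token as its prediction. On repeated sequences this produces the characteristic offset-diagonal attention pattern. The functions below are hypothetical illustrations, not part of GPTKB or any model implementation.

```python
import numpy as np

def induction_attention(tokens):
    """Idealized induction-head attention pattern: position t attends
    uniformly to every earlier position j whose *preceding* token matches
    the current token, i.e. tokens[j-1] == tokens[t] (prefix matching).
    On a repeated sequence this matrix has an offset-diagonal stripe."""
    n = len(tokens)
    attn = np.zeros((n, n))
    for t in range(n):
        matches = [j for j in range(1, t + 1) if tokens[j - 1] == tokens[t]]
        for j in matches:
            attn[t, j] = 1.0 / len(matches)
    return attn

def induction_predict(tokens):
    """Copying step: predict the attended-to token, i.e. the token that
    followed the previous occurrence of the current token."""
    attn = induction_attention(tokens)
    preds = []
    for t in range(len(tokens)):
        if attn[t].sum() > 0:
            preds.append(tokens[int(np.argmax(attn[t]))])
        else:
            preds.append(None)  # no earlier occurrence to copy from
    return preds

# On "A B C A B C", the second "A" (position 3) matches the first "A"
# (position 0), attends to position 1, and predicts "B" -- pattern completion.
tokens = ["A", "B", "C", "A", "B", "C"]
print(induction_predict(tokens))  # [None, None, None, 'B', 'C', 'A']
```

The first pass over the sequence yields no predictions; once the pattern repeats, the head completes it correctly, which is the in-context-learning behaviour the table's `enables` and `function` statements describe.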