Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:Neural_Network_Component |
| gptkbp:aggregatesFrom | Context Vectors |
| gptkbp:allows | Model to focus on different positions |
| gptkbp:canBe | Cross-Attention Heads, Self-Attention Heads |
| gptkbp:enables | Multi-head Attention, Parallelization in Attention Computation |
| gptkbp:function | Attend to different parts of input sequence |
| gptkbp:improves | Model Expressiveness |
| gptkbp:introducedIn | Vaswani et al. 2017 |
| gptkbp:notableFauna | Hyperparameter |
| gptkbp:output | Concatenated Attention Results |
| gptkbp:parameter | Query, Key, Value |
| gptkbp:usedIn | gptkb:BERT, gptkb:GPT, gptkb:Vision_Transformer, Transformer Model |
| gptkbp:visualizes | Attention Maps |
| gptkbp:bfsParent | gptkb:Transformer_Circuits |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | Attention Heads |
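The statements above (Query/Key/Value parameters, parallel heads, concatenated attention results, visualizable attention maps) can be illustrated with a minimal sketch. This is an assumption-laden toy implementation in NumPy, not the code of any system the table refers to; all names (`multi_head_attention`, `split`, the weight matrices) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    """Scaled dot-product attention computed in parallel across heads.

    x:          (seq_len, d_model) input sequence
    wq, wk, wv: (d_model, d_model) Query / Key / Value projections
    wo:         (d_model, d_model) output projection
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project inputs, then split the model dimension into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head)
    def split(w):
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(wq), split(wk), split(wv)

    # Each head can attend to different positions of the input sequence.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn_maps = softmax(scores)                          # the "Attention Maps"
    per_head = attn_maps @ v                             # (heads, seq, d_head)

    # Concatenate the per-head results and project back to d_model.
    concat = per_head.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ wo, attn_maps

rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 16, 5, 4
x = rng.normal(size=(seq_len, d_model))
w = [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4)]
out, maps = multi_head_attention(x, *w, num_heads=num_heads)
print(out.shape, maps.shape)  # (5, 16) (4, 5, 5)
```

The number of heads is the hyperparameter the table mentions; each row of each head's attention map sums to 1, which is what makes the maps directly visualizable.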