Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | Neural Network Component |
| gptkbp:aggregatesFrom | Context Vectors |
| gptkbp:allows | Model to focus on different positions |
| gptkbp:canBe | Cross-Attention Heads, Self-Attention Heads |
| gptkbp:enables | Multi-head Attention, Parallelization in Attention Computation |
| gptkbp:function | Attend to different parts of input sequence |
| https://www.w3.org/2000/01/rdf-schema#label | Attention Heads |
| gptkbp:improves | Model Expressiveness |
| gptkbp:introducedIn | Vaswani et al. 2017 |
| gptkbp:notableFauna | Hyperparameter |
| gptkbp:output | Concatenated Attention Results |
| gptkbp:parameter | Query, Key, Value |
| gptkbp:usedIn | gptkb:BERT, gptkb:GPT, gptkb:Vision_Transformer, Transformer Model |
| gptkbp:visualizes | Attention Maps |
| gptkbp:bfsParent | gptkb:Transformer_Circuits |
| gptkbp:bfsLayer | 8 |