Attention Heads

GPTKB entity

Statements (23)
Predicate Object
gptkbp:instanceOf Neural Network Component
gptkbp:aggregatesFrom Context Vectors
gptkbp:allows Model to focus on different positions
gptkbp:canBe Cross-Attention Heads
gptkbp:canBe Self-Attention Heads
gptkbp:enables Multi-head Attention
gptkbp:enables Parallelization in Attention Computation
gptkbp:function Attend to different parts of input sequence
https://www.w3.org/2000/01/rdf-schema#label Attention Heads
gptkbp:improves Model Expressiveness
gptkbp:introducedIn Vaswani et al. 2017
gptkbp:hyperparameter Number of Heads
gptkbp:output Concatenated Attention Results
gptkbp:parameter Query
gptkbp:parameter Key
gptkbp:parameter Value
gptkbp:usedIn gptkb:BERT
gptkbp:usedIn gptkb:GPT
gptkbp:usedIn gptkb:Vision_Transformer
gptkbp:usedIn Transformer Model
gptkbp:visualizes Attention Maps
gptkbp:bfsParent gptkb:Transformer_Circuits
gptkbp:bfsLayer 8
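The statements above describe the mechanics of multi-head attention: each head has Query, Key, and Value parameters, heads attend over the input sequence in parallel, and their per-head results are concatenated into the output. A minimal NumPy sketch of that computation follows; all shapes, weight names, and the function signature are illustrative assumptions, not part of the GPTKB entity.

```python
# Minimal multi-head self-attention sketch (NumPy).
# Assumptions: single sequence (no batch), d_model divisible by num_heads,
# weight matrices w_q/w_k/w_v/w_o are hypothetical names for this example.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # Project input into per-head Query/Key/Value parameters.
    q = (x @ w_q).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ w_k).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ w_v).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product attention, computed for all heads in parallel;
    # `attn` holds the attention maps that are often visualized.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    heads = attn @ v                              # (num_heads, seq_len, d_head)
    # Concatenate per-head results back to (seq_len, d_model), then project.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 8, 5, 2
w = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
out = multi_head_self_attention(rng.standard_normal((seq_len, d_model)), *w, num_heads)
print(out.shape)  # (5, 8)
```

The head count (`num_heads` here) is the hyperparameter the entity refers to; this is the mechanism introduced in Vaswani et al. 2017 and used in BERT, GPT, and the Vision Transformer.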