Statements (43)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:Neural_Network_Component |
| gptkbp:addressedTo | Efficient Attention Variants, Sparse Attention |
| gptkbp:appliesTo | gptkb:Computer_Vision, gptkb:Natural_Language_Processing, gptkb:Speech_Recognition, Time Series Analysis |
| gptkbp:complexity | O(n^2) with respect to sequence length |
| gptkbp:component | Multi-Head Attention |
| gptkbp:computes | Attention scores |
| gptkbp:coreOperation | Weighted sum of values |
| gptkbp:enables | Parallel computation, Bidirectional context modeling, Contextual representation, Long-range dependency modeling, Self-contextualization of tokens |
| gptkbp:influenced | gptkb:actor, gptkb:Reformer, gptkb:Longformer, gptkb:Vision_Transformer, Linformer |
| gptkbp:input | Sequence of vectors |
| gptkbp:introduced | gptkb:Ashish_Vaswani |
| gptkbp:introducedIn | gptkb:Attention_Is_All_You_Need, 2017 |
| gptkbp:limitation | Quadratic memory usage, Scalability to long sequences |
| gptkbp:output | Sequence of vectors |
| gptkbp:purpose | Capture dependencies between input tokens |
| gptkbp:relatedTo | Cross-Attention, Scaled Dot-Product Attention |
| gptkbp:replacedBy | Recurrent Neural Networks in NLP |
| gptkbp:requires | Positional Encoding |
| gptkbp:usedIn | gptkb:BERT, gptkb:GPT, Transformer Model |
| gptkbp:uses | Query, Key, Value |
| gptkbp:variant | Attention Mechanism |
| gptkbp:bfsParent | gptkb:Large_Language_Models |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Self-Attention Mechanism |
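
The statements above about Query/Key/Value, attention scores, the weighted sum of values, and O(n^2) complexity can be tied together in a short sketch. The NumPy code below is a minimal illustration, not part of the KB entry: the dimensions, weight matrices, and function names are assumptions chosen for demonstration.

```python
# Minimal sketch of (single-head, unmasked) scaled dot-product self-attention.
# All shapes and weights here are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (n, d_model) sequence of vectors -> (n, d_v) sequence of contextualized vectors."""
    Q = X @ W_q                      # queries
    K = X @ W_k                      # keys
    V = X @ W_v                      # values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n, n) attention scores: O(n^2) in sequence length
    weights = softmax(scores, axis=-1)
    return weights @ V               # weighted sum of values

# Toy usage: a sequence of 4 token vectors with d_model = 8 (assumed sizes).
rng = np.random.default_rng(0)
n, d_model, d_k = 4, 8, 8
X = rng.normal(size=(n, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8): a sequence of vectors in, a sequence of vectors out
```

Because every token attends to every other token, the score matrix is n x n, which is the quadratic memory limitation listed above. The operation itself is permutation-invariant, which is why the entry lists Positional Encoding as a requirement: position information must be added to the input vectors before attention is applied.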