|
gptkbp:instanceOf
|
gptkb:neural_network_architecture
|
|
gptkbp:architecture
|
encoder-decoder
|
|
gptkbp:component
|
feed-forward neural network
positional encoding
self-attention mechanism
|
|
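The components listed above (self-attention, positional encoding, a position-wise feed-forward network) can be illustrated with a minimal NumPy sketch. This is a hypothetical single-head illustration, not an entry from the knowledge base: the function names and random projection matrices are placeholders.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: sin on even dims, cos on odd dims."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention for a single head."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d = 4, 8
# Token embeddings plus positional encoding, as in the original design.
x = rng.normal(size=(seq_len, d)) + positional_encoding(seq_len, d)
out = self_attention(x, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (4, 8)
```

Each output row is a weighted mixture of the value vectors of all positions, which is how self-attention models long-range dependencies in one step.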
gptkbp:decoderComponent
|
decoder
|
|
gptkbp:encoderComponent
|
encoder
|
|
gptkbp:features
|
scalability
flexibility
parallelization
long-range dependency modeling
|
|
gptkbp:inspired
|
gptkb:T5
gptkb:BERT
gptkb:GPT
gptkb:Vision_Transformer
|
|
gptkbp:introduced
|
gptkb:Attention_Is_All_You_Need
gptkb:Vaswani_et_al.
|
|
gptkbp:introducedIn
|
2017
|
|
gptkbp:limitation
|
quadratic memory complexity
expensive training
large data requirements
|
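The quadratic memory complexity listed above comes from self-attention materializing an n-by-n weight matrix over the sequence. A small back-of-the-envelope sketch (hypothetical helper name, single head for simplicity):

```python
def attention_matrix_entries(n_tokens):
    """Entries in one layer's attention weight matrix: O(n^2) in sequence length."""
    return n_tokens * n_tokens

for n in (512, 1024, 2048):
    print(n, attention_matrix_entries(n))
# Doubling the sequence length quadruples the attention matrix,
# which motivates variants such as Longformer and BigBird below.
```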
|
gptkbp:notableFor
|
gptkb:Google_Translate
gptkb:OpenAI_GPT
gptkb:AlphaFold
gptkb:ChatGPT
gptkb:DALL-E
gptkb:Stable_Diffusion
gptkb:BERT
|
|
gptkbp:openSource
|
gptkb:TensorFlow
gptkb:PyTorch
gptkb:Hugging_Face_Transformers
|
|
gptkbp:parameter
|
scalable
|
|
gptkbp:replaced
|
gptkb:long_short-term_memory
gptkb:recurrent_neural_network
convolutional neural network (for some tasks)
|
|
gptkbp:trainer
|
supervised learning
|
|
gptkbp:attentionMechanism
|
multi-head attention
|
|
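Multi-head attention projects the input into several lower-dimensional heads, attends in each head independently, and concatenates the results. A minimal sketch, assuming random placeholder projections rather than learned weights:

```python
import numpy as np

def multi_head_attention(x, n_heads):
    """Split d_model into n_heads heads, attend per head, concatenate."""
    seq_len, d_model = x.shape
    assert d_model % n_heads == 0
    d_head = d_model // n_heads
    rng = np.random.default_rng(1)  # placeholder for learned projections
    heads = []
    for _ in range(n_heads):
        Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        s = q @ k.T / np.sqrt(d_head)
        w = np.exp(s - s.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)  # softmax over keys
        heads.append(w @ v)
    return np.concatenate(heads, axis=-1)  # back to (seq_len, d_model)

out = multi_head_attention(np.ones((5, 16)), n_heads=4)
print(out.shape)  # (5, 16)
```

Each head can specialize in a different relation between positions; the concatenated output has the same shape as the input, so layers stack cleanly.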
gptkbp:usedFor
|
machine translation
natural language processing
image processing
speech processing
text generation
|
|
gptkbp:variant
|
gptkb:Performer
gptkb:Switch_Transformer
gptkb:Reformer
gptkb:ALBERT
gptkb:BigBird
gptkb:DistilBERT
gptkb:Longformer
gptkb:Sparse_Transformer
gptkb:Vision_Transformer
gptkb:Linformer
|
|
gptkbp:bfsParent
|
gptkb:Niki_Parmar
gptkb:Neural_Machine_Translation_by_Jointly_Learning_to_Align_and_Translate
|
|
gptkbp:bfsLayer
|
8
|
|
https://www.w3.org/2000/01/rdf-schema#label
|
Transformer model
|