T5 architecture

GPTKB entity

Statements (53)
Predicate Object
gptkbp:instanceOf gptkb:language_model
gptkbp:activatedBy gptkb:ReLU
gptkbp:architecture encoder-decoder
gptkbp:availableOn gptkb:TensorFlow
gptkb:Hugging_Face_Transformers
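Both platforms listed above expose the released pretrained checkpoints. A minimal loading sketch, assuming the public `t5-small` checkpoint name (any released T5 size works the same way):

```python
# Minimal sketch: load a pretrained T5 checkpoint via Hugging Face
# Transformers. The checkpoint name "t5-small" is an assumption; any
# released T5 size can be substituted.
from transformers import T5Tokenizer, T5ForConditionalGeneration   # PyTorch
# from transformers import TFT5ForConditionalGeneration            # TensorFlow

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
```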
gptkbp:basedOn Transformer architecture
gptkbp:developedBy gptkb:Google_Research
gptkbp:encoderDecoder yes
gptkbp:format text-to-text
gptkbp:fullName gptkb:Text-To-Text_Transfer_Transformer
rdfs:label T5 architecture
gptkbp:inputSequence text string
gptkbp:introduced gptkb:Exploring_the_Limits_of_Transfer_Learning_with_a_Unified_Text-to-Text_Transformer
gptkbp:introducedIn 2019
gptkbp:numberOfLayers varies by model size
gptkbp:license Apache 2.0
gptkbp:notableFeature unified text-to-text framework
flexible for various NLP tasks
pretrained on large web corpus
scalable to large models
span-masked language modeling
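The "unified text-to-text framework" feature above means every task, classification included, is cast as mapping an input text string to an output text string, with the task selected purely by a text prefix. A short sketch using task prefixes from the original T5 setup (the `t5-small` checkpoint is again an assumption):

```python
# Every task is text in, text out; only the prefix changes.
from transformers import pipeline

t5 = pipeline("text2text-generation", model="t5-small")  # checkpoint assumed

print(t5("translate English to German: The house is wonderful."))
print(t5("summarize: The Transformer dispenses with recurrence and "
         "convolutions, relying entirely on attention mechanisms."))
print(t5("cola sentence: The course is jumping well."))  # acceptability task
```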
gptkbp:notableFor gptkb:GLUE_benchmark
gptkb:SuperGLUE_benchmark
CNN/Daily Mail summarization
SQuAD question answering
gptkbp:notableVariant gptkb:mT5
gptkb:Flan-T5
gptkb:ByT5
gptkbp:openSource yes
gptkbp:optimizer Adafactor
gptkbp:outputSequence text string
gptkbp:pretrainedModelsAvailable yes
gptkbp:pretrainingObjective span corruption
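"Span corruption" (the span-masked language modeling listed under notable features) drops contiguous input spans, replaces each with a fresh sentinel token, and trains the model to emit the dropped spans after the matching sentinels. A toy word-level sketch of the idea; the real preprocessing operates on SentencePiece token ids, samples spans with mean length 3, and corrupts roughly 15% of tokens:

```python
import random

def span_corrupt(tokens, corrupt_rate=0.15, mean_span_len=3, seed=0):
    """Toy span corruption: mask random contiguous spans with sentinels
    <extra_id_0>, <extra_id_1>, ... and build the matching target."""
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * corrupt_rate))
    masked = set()
    while len(masked) < n_to_mask:
        span_len = max(1, round(rng.gauss(mean_span_len, 1)))
        start = rng.randrange(len(tokens))
        masked.update(range(start, min(start + span_len, len(tokens))))

    inputs, targets, sentinel = [], [], 0
    i = 0
    while i < len(tokens):
        if i in masked:
            inputs.append(f"<extra_id_{sentinel}>")
            targets.append(f"<extra_id_{sentinel}>")
            while i < len(tokens) and i in masked:   # consume the span
                targets.append(tokens[i])
                i += 1
            sentinel += 1
        else:
            inputs.append(tokens[i])
            i += 1
    targets.append(f"<extra_id_{sentinel}>")         # closing sentinel
    return " ".join(inputs), " ".join(targets)

src, tgt = span_corrupt("Thank you for inviting me to your party".split())
print(src)   # e.g. "Thank you <extra_id_0> me to your party"
print(tgt)   # e.g. "<extra_id_0> for inviting <extra_id_1>"
```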
gptkbp:parameterCount 60 million to 11 billion parameters
gptkbp:supports transfer learning
few-shot learning
multi-task learning
zero-shot learning
fine-tuning
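Tying together the fine-tuning support listed above and the Adafactor optimizer row earlier: a compressed single training step, as a sketch. The checkpoint name, learning rate, and the toy example pair are assumptions; `Adafactor` ships in `transformers.optimization`, and a fixed learning rate requires disabling its relative-step schedule:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.optimization import Adafactor

tokenizer = T5Tokenizer.from_pretrained("t5-small")      # checkpoint assumed
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = Adafactor(model.parameters(), lr=1e-3,
                      scale_parameter=False, relative_step=False)

# One supervised text-to-text training step (toy example pair).
batch = tokenizer("translate English to German: Good morning.",
                  return_tensors="pt")
labels = tokenizer("Guten Morgen.", return_tensors="pt").input_ids

loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```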
gptkbp:tokenizerType gptkb:SentencePiece
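The SentencePiece row is easy to inspect directly; T5's vocabulary also carries the `<extra_id_*>` sentinels used by the span-corruption objective. Checkpoint name assumed as before:

```python
from transformers import T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")  # checkpoint assumed
print(tok.tokenize("Transfer learning is powerful."))
# SentencePiece subword pieces, e.g. ['▁Transfer', '▁learning', ...]
print([t for t in tok.additional_special_tokens
       if t.startswith("<extra_id")][:3])
# Sentinel placeholders such as '<extra_id_0>'
```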
gptkbp:trainingData gptkb:Colossal_Clean_Crawled_Corpus_(C4)
gptkbp:attentionType self-attention
gptkbp:usedFor natural language processing
translation
question answering
summarization
text generation
text classification
gptkbp:uses dropout
layer normalization
relative position embeddings
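On the relative position embeddings row: instead of absolute position vectors, T5 adds a learned scalar bias per attention head, indexed by a bucketed relative offset between query and key positions; nearby offsets get exact buckets, distant offsets share logarithmically coarser ones. A sketch of the bucketing with the defaults described in the T5 paper (32 buckets, max distance 128), for the bidirectional encoder case:

```python
import math

def relative_position_bucket(rel_pos, num_buckets=32, max_distance=128):
    """Map a relative position (key_pos - query_pos) to a bucket id,
    following the T5 scheme: half the buckets per direction, exact
    buckets for small offsets, log-spaced buckets beyond."""
    num_buckets //= 2                      # split buckets by direction
    bucket = num_buckets if rel_pos > 0 else 0
    n = abs(rel_pos)
    max_exact = num_buckets // 2           # first half: exact offsets
    if n < max_exact:
        return bucket + n
    # Second half: logarithmically larger bins up to max_distance.
    log_bucket = max_exact + int(
        math.log(n / max_exact) / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    return bucket + min(log_bucket, num_buckets - 1)

# Nearby offsets get their own bucket; distant ones share coarse buckets.
print([relative_position_bucket(d) for d in (1, 2, 8, 50, 500)])
```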
gptkbp:bfsParent gptkb:T0-B1
gptkbp:bfsLayer 7