T5 architecture

GPTKB entity

Statements (53)
Predicate Object
gptkbp:instanceOf gptkb:language_model
gptkbp:activatedBy gptkb:ReLU
gptkbp:architecture encoder-decoder
gptkbp:availableOn gptkb:TensorFlow
gptkb:Hugging_Face_Transformers
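Both platforms listed above expose the released pretrained checkpoints. A minimal loading sketch, assuming the public `t5-small` checkpoint name (any released T5 size works the same way):

```python
# Minimal sketch: load a pretrained T5 checkpoint via Hugging Face
# Transformers. The checkpoint name "t5-small" is an assumption; any
# released T5 size can be substituted.
from transformers import T5Tokenizer, T5ForConditionalGeneration   # PyTorch
# from transformers import TFT5ForConditionalGeneration            # TensorFlow

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
```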
gptkbp:basedOn Transformer architecture
gptkbp:developedBy gptkb:Google_Research
gptkbp:encoderDecoder yes
gptkbp:format text-to-text
gptkbp:fullName gptkb:Text-To-Text_Transfer_Transformer
rdfs:label T5 architecture
gptkbp:inputSequence text string
gptkbp:introduced gptkb:Exploring_the_Limits_of_Transfer_Learning_with_a_Unified_Text-to-Text_Transformer
gptkbp:introducedIn 2019
gptkbp:numberOfLayers varies by model size
gptkbp:license Apache 2.0
gptkbp:notableFeature unified text-to-text framework
flexible for various NLP tasks
pretrained on large web corpus
scalable to large models
span-masked language modeling
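The "unified text-to-text framework" feature above means every task, classification included, is cast as mapping an input text string to an output text string, with the task selected purely by a text prefix. A short sketch using task prefixes from the original T5 setup (the `t5-small` checkpoint is again an assumption):

```python
# Every task is text in, text out; only the prefix changes.
from transformers import pipeline

t5 = pipeline("text2text-generation", model="t5-small")  # checkpoint assumed

print(t5("translate English to German: The house is wonderful."))
print(t5("summarize: The Transformer dispenses with recurrence and "
         "convolutions, relying entirely on attention mechanisms."))
print(t5("cola sentence: The course is jumping well."))  # acceptability task
```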
gptkbp:notableFor gptkb:GLUE_benchmark
gptkb:SuperGLUE_benchmark
CNN/Daily Mail summarization
SQuAD question answering
gptkbp:notableVariant gptkb:mT5
gptkb:Flan-T5
gptkb:ByT5
gptkbp:openSource yes
gptkbp:optimizer Adafactor
gptkbp:outputSequence text string
gptkbp:pretrainedModelsAvailable yes
gptkbp:pretrainingObjective span corruption
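"Span corruption" (the span-masked language modeling listed under notable features) drops contiguous input spans, replaces each with a fresh sentinel token, and trains the model to emit the dropped spans after the matching sentinels. A toy word-level sketch of the idea; the real preprocessing operates on SentencePiece token ids, samples spans with mean length 3, and corrupts roughly 15% of tokens:

```python
import random

def span_corrupt(tokens, corrupt_rate=0.15, mean_span_len=3, seed=0):
    """Toy span corruption: mask random contiguous spans with sentinels
    <extra_id_0>, <extra_id_1>, ... and build the matching target."""
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * corrupt_rate))
    masked = set()
    while len(masked) < n_to_mask:
        span_len = max(1, round(rng.gauss(mean_span_len, 1)))
        start = rng.randrange(len(tokens))
        masked.update(range(start, min(start + span_len, len(tokens))))

    inputs, targets, sentinel = [], [], 0
    i = 0
    while i < len(tokens):
        if i in masked:
            inputs.append(f"<extra_id_{sentinel}>")
            targets.append(f"<extra_id_{sentinel}>")
            while i < len(tokens) and i in masked:   # consume the span
                targets.append(tokens[i])
                i += 1
            sentinel += 1
        else:
            inputs.append(tokens[i])
            i += 1
    targets.append(f"<extra_id_{sentinel}>")         # closing sentinel
    return " ".join(inputs), " ".join(targets)

src, tgt = span_corrupt("Thank you for inviting me to your party".split())
print(src)   # e.g. "Thank you <extra_id_0> me to your party"
print(tgt)   # e.g. "<extra_id_0> for inviting <extra_id_1>"
```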
gptkbp:parameterCount 60 million to 11 billion parameters
gptkbp:supports transfer learning
few-shot learning
multi-task learning
zero-shot learning
fine-tuning
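Tying together the fine-tuning support listed above and the Adafactor optimizer row earlier: a compressed single training step, as a sketch. The checkpoint name, learning rate, and the toy example pair are assumptions; `Adafactor` ships in `transformers.optimization`, and a fixed learning rate requires disabling its relative-step schedule:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.optimization import Adafactor

tokenizer = T5Tokenizer.from_pretrained("t5-small")      # checkpoint assumed
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = Adafactor(model.parameters(), lr=1e-3,
                      scale_parameter=False, relative_step=False)

# One supervised text-to-text training step (toy example pair).
batch = tokenizer("translate English to German: Good morning.",
                  return_tensors="pt")
labels = tokenizer("Guten Morgen.", return_tensors="pt").input_ids

loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```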
gptkbp:tokenizerType gptkb:SentencePiece
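The SentencePiece row is easy to inspect directly; T5's vocabulary also carries the `<extra_id_*>` sentinels used by the span-corruption objective. Checkpoint name assumed as before:

```python
from transformers import T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")  # checkpoint assumed
print(tok.tokenize("Transfer learning is powerful."))
# SentencePiece subword pieces, e.g. ['▁Transfer', '▁learning', ...]
print([t for t in tok.additional_special_tokens
       if t.startswith("<extra_id")][:3])
# Sentinel placeholders such as '<extra_id_0>'
```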
gptkbp:trainingData gptkb:Colossal_Clean_Crawled_Corpus_(C4)
gptkbp:attentionType self-attention
gptkbp:usedFor natural language processing
translation
question answering
summarization
text generation
text classification
gptkbp:uses dropout
layer normalization
relative position embeddings
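On the relative position embeddings row: instead of absolute position vectors, T5 adds a learned scalar bias per attention head, indexed by a bucketed relative offset between query and key positions; nearby offsets get exact buckets, distant offsets share logarithmically coarser ones. A sketch of the bucketing with the defaults described in the T5 paper (32 buckets, max distance 128), for the bidirectional encoder case:

```python
import math

def relative_position_bucket(rel_pos, num_buckets=32, max_distance=128):
    """Map a relative position (key_pos - query_pos) to a bucket id,
    following the T5 scheme: half the buckets per direction, exact
    buckets for small offsets, log-spaced buckets beyond."""
    num_buckets //= 2                      # split buckets by direction
    bucket = num_buckets if rel_pos > 0 else 0
    n = abs(rel_pos)
    max_exact = num_buckets // 2           # first half: exact offsets
    if n < max_exact:
        return bucket + n
    # Second half: logarithmically larger bins up to max_distance.
    log_bucket = max_exact + int(
        math.log(n / max_exact) / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    return bucket + min(log_bucket, num_buckets - 1)

# Nearby offsets get their own bucket; distant ones share coarse buckets.
print([relative_position_bucket(d) for d in (1, 2, 8, 50, 500)])
```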
gptkbp:bfsParent gptkb:T0-B1
gptkbp:bfsLayer 7