gptkbp:instance_of
|
gptkb:Model
|
gptkbp:application
|
natural language processing
|
gptkbp:architecture
|
gptkb:T5
|
gptkbp:attention_mechanism
|
self-attention
|
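The self-attention listed above is the scaled dot-product form standard in Transformer models. A minimal single-head sketch for illustration (dimensions and weights are arbitrary, not taken from this entry):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one head."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Attention scores, scaled by sqrt of the key dimension.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                  # 4 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)   # (4, 8)
```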
gptkbp:batch_size
|
large
|
gptkbp:community_support
|
active
|
gptkbp:developed_by
|
gptkb:Google_Research
|
gptkbp:dropout_rate
|
0.1
|
gptkbp:embedding_size
|
768
|
gptkbp:evaluates
|
gptkb:GLUE
gptkb:SQuAD
gptkb:MNLI
gptkb:RACE
QQP
CoNLL
|
gptkbp:fine_tuning_data
|
domain-specific datasets
|
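The fine-tuning entry above is terse, so here is a minimal sketch of what fine-tuning on a domain-specific dataset can look like. It assumes the Hugging Face `transformers` port of the public `t5-3b` checkpoint, AdamW, and a toy in-memory dataset; none of these come from this entry, which only names TensorFlow as the framework.

```python
# Minimal fine-tuning sketch. Assumptions (not from this entry):
# Hugging Face `transformers`, the public "t5-3b" checkpoint,
# AdamW, and a toy in-memory dataset.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-3b")
model = T5ForConditionalGeneration.from_pretrained("t5-3b")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Hypothetical domain-specific (input, target) pairs in T5's
# text-to-text format.
pairs = [
    ("summarize: The quarterly report shows revenue grew 12 percent ...",
     "Revenue grew 12 percent this quarter."),
]

for src, tgt in pairs:
    enc = tokenizer(src, return_tensors="pt", truncation=True, max_length=512)
    labels = tokenizer(tgt, return_tensors="pt", truncation=True,
                       max_length=512).input_ids
    # Passing `labels` makes the model return the seq2seq
    # cross-entropy loss directly.
    loss = model(input_ids=enc.input_ids,
                 attention_mask=enc.attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```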
gptkbp:has_publications
|
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
|
https://www.w3.org/2000/01/rdf-schema#label
|
T5-3B-FT
|
gptkbp:impact
|
significant in NLP research
|
gptkbp:activation_function
|
ReLU
|
gptkbp:input_output
|
gptkb:text
|
gptkbp:framework
|
gptkb:TensorFlow
|
gptkbp:is_adopted_by
|
widely adopted
|
gptkbp:is_cited_in
|
high
|
gptkbp:is_compared_to
|
ALBERT
DistilBERT
GPT-3
RoBERTa
XLNet
|
gptkbp:is_tasked_with
|
text-to-text transfer
|
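The text-to-text formulation means every task is expressed as "task prefix + input text", and the answer is read back as generated text. A small illustration of the input format (the prefixes follow the conventions in the T5 paper; the example sentences are illustrative):

```python
# Every task becomes "prefix + input" -> "output", so one seq2seq
# model can serve translation, summarization, and classification.
def to_text_to_text(task_prefix: str, text: str) -> str:
    return f"{task_prefix} {text}"

print(to_text_to_text("translate English to German:", "That is good."))
print(to_text_to_text("summarize:", "state authorities dispatched ..."))
print(to_text_to_text("cola sentence:", "The course is jumping well."))
```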
gptkbp:is_taught_in
|
variable
|
gptkbp:language
|
English
|
gptkbp:license
|
Apache License 2.0
|
gptkbp:max_input_length
|
512
|
gptkbp:max_output_length
|
512
|
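Both limits above are counted in tokens, not characters. Assuming the Hugging Face tokenizer for the public `t5-3b` checkpoint (not named in this entry), over-long inputs are typically truncated to the 512-token window:

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-3b")

# ~2000 words, well past the 512-token input window.
long_text = "summarize: " + "word " * 2000
enc = tokenizer(long_text, truncation=True, max_length=512,
                return_tensors="pt")
print(enc.input_ids.shape)  # torch.Size([1, 512])
```

The output limit is enforced at generation time, e.g. via `model.generate(..., max_length=512)`.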
gptkbp:pretraining_objective
|
denoising autoencoder
|
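The denoising objective corrupts spans of the input and trains the model to reconstruct them, with each dropped span replaced by a sentinel token. A toy word-level sketch of the span-corruption format (the sentence is the example from the T5 paper; the real objective works on SentencePiece tokens, corrupting about 15% of them in spans of mean length 3):

```python
def span_corrupt(words, start, length):
    """Toy single-span corruption: drop one span, mark it with a
    sentinel, and spell the dropped text out in the target."""
    dropped = words[start:start + length]
    inp = words[:start] + ["<extra_id_0>"] + words[start + length:]
    tgt = ["<extra_id_0>"] + dropped + ["<extra_id_1>"]
    return " ".join(inp), " ".join(tgt)

words = "Thank you for inviting me to your party last week .".split()
inp, tgt = span_corrupt(words, start=2, length=2)
print(inp)  # Thank you <extra_id_0> me to your party last week .
print(tgt)  # <extra_id_0> for inviting <extra_id_1>
```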
gptkbp:num_heads
|
16
|
gptkbp:num_layers
|
24
|
gptkbp:parameter_count
|
3 billion
|
gptkbp:performance
|
state-of-the-art
|
gptkbp:predecessor
|
gptkb:BERT
|
gptkbp:pretraining_data
|
C4 dataset
|
gptkbp:release_date
|
2020
|
gptkbp:repository
|
gptkb:GitHub
|
gptkbp:successor
|
gptkb:T5-11B
|
gptkbp:tokenization
|
SentencePiece
|
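The released T5 checkpoints ship a SentencePiece subword vocabulary of roughly 32k pieces. Inspecting the pieces makes the subword behaviour concrete (Hugging Face wrapper assumed; the exact split is illustrative):

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-3b")
# SentencePiece marks word starts with "▁"; rare words split
# into multiple pieces.
print(tokenizer.tokenize("Transfer learning is effective."))
# e.g. ['▁Transfer', '▁learning', '▁is', '▁effective', '.']
```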
gptkbp:optimizer
|
gptkb:Adam
|
gptkbp:type
|
gptkb:Transformers
|
gptkbp:use_case
|
translation
question answering
text generation
text summarization
text classification
|
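A single checkpoint covers all of the use cases above just by switching the task prefix at inference time. A hedged end-to-end sketch (Hugging Face `transformers` and the public `t5-3b` checkpoint assumed; outputs are illustrative, not recorded model outputs):

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-3b")
model = T5ForConditionalGeneration.from_pretrained("t5-3b")

prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: state authorities dispatched emergency crews tuesday to "
    "survey the damage after an onslaught of severe weather ...",
    "cola sentence: The course is jumping well.",  # acceptability check
]
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=512)
    outputs = model.generate(**inputs, max_length=512)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```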
gptkbp:bfsParent
|
gptkb:Noam_Shazeer
|
gptkbp:bfsLayer
|
6
|