gptkbp:instanceOf
|
Transformer model
|
gptkbp:application
|
Question answering
Text classification
Translation
Text summarization
|
gptkbp:architect
|
Transformer
|
gptkbp:availableIn
|
Hugging Face Model Hub
|
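
A minimal loading sketch with the transformers library, assuming the standard "t5-base" Hub id:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Download the tokenizer and checkpoint from the Hugging Face Model Hub.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
```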
gptkbp:availableSizes
|
768
3072
|
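
These two numbers correspond to T5-Base's hidden size (d_model = 768) and feed-forward inner size (d_ff = 3072) rather than separate model sizes; both can be read off the published config:

```python
from transformers import T5Config

config = T5Config.from_pretrained("t5-base")
print(config.d_model)  # 768  (hidden / embedding dimension)
print(config.d_ff)     # 3072 (feed-forward inner dimension)
```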
gptkbp:communitySupport
|
Active community
|
gptkbp:developedBy
|
gptkb:Google_Research
|
gptkbp:evaluates
|
gptkb:BLEU
F1 Score
ROUGE
|
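
A sketch of computing one of these metrics with the Hugging Face evaluate library (one common choice; the sacrebleu and f1 loaders follow the same pattern):

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat sat on the mat"],
)
print(scores["rouge1"])
```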
https://www.w3.org/2000/01/rdf-schema#label
|
T5-Base
|
gptkbp:improves
|
Supervised learning
|
gptkbp:influencedBy
|
Attention mechanism
Seq2Seq models
|
gptkbp:inputOutput
|
Variable length
|
gptkbp:is_a_platform_for
|
gptkb:PyTorch
TensorFlow
|
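
The checkpoint ships with both PyTorch and TensorFlow weights; in transformers the TensorFlow class is a TF-prefixed mirror of the PyTorch one (a sketch, assuming a transformers install with TensorFlow support):

```python
from transformers import TFT5ForConditionalGeneration

tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-base")
```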
gptkbp:keyIssues
|
220 million
|
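
The "220 million" recorded here is the usual rounded parameter count for T5-Base, which can be verified directly (the exact figure is closer to 223M):

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-base")
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")  # ~223M, commonly quoted as 220 million
```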
gptkbp:language
|
English
|
gptkbp:learnsMove
|
5e-5
|
gptkbp:length
|
Variable length
|
gptkbp:losses
|
Cross-entropy loss
|
gptkbp:model
|
gptkb:T5-3B
gptkb:T5-Large
gptkb:T5-11B
Encoder-Decoder
T5-Small
|
gptkbp:outflow
|
0.1
|
gptkbp:pageCount
|
12
|
gptkbp:performance
|
State-of-the-art performance
|
gptkbp:powerOutput
|
Text
|
gptkbp:primaryFunction
|
GeLU
|
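
If GeLU is the activation meant here, the widely used tanh approximation is easy to state (for reference, the originally released T5 checkpoints used ReLU in the feed-forward blocks; gated GeLU appears in T5 v1.1):

```python
import math

def gelu(x: float) -> float:
    # tanh approximation of the Gaussian Error Linear Unit
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))
```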
gptkbp:produces
|
Softmax
|
gptkbp:project
|
Natural Language Processing
|
gptkbp:prominence
|
12
|
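
The 0.1 under gptkbp:outflow and the two 12s under gptkbp:pageCount and gptkbp:prominence line up with T5-Base's dropout rate, its 12 encoder/decoder layers, and its 12 attention heads; all three are visible on the config:

```python
from transformers import T5Config

config = T5Config.from_pretrained("t5-base")
print(config.dropout_rate)  # 0.1
print(config.num_layers)    # 12 encoder layers (the decoder matches)
print(config.num_heads)     # 12 attention heads
```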
gptkbp:providesTrainingFor
|
C4 dataset
|
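
C4 (the Colossal Clean Crawled Corpus) is the pre-training dataset, roughly 750 GB of cleaned Common Crawl text, which matches the size recorded under gptkbp:trainingPrograms below. A streaming peek via the datasets library, assuming the "allenai/c4" Hub id:

```python
from datasets import load_dataset

c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)
print(next(iter(c4))["text"][:200])  # stream one document rather than downloading 750 GB
```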
gptkbp:relatedTo
|
gptkb:BART
GPT-2
BERT
XLNet
|
gptkbp:releaseYear
|
2020
|
gptkbp:researchInterest
|
High
|
gptkbp:supports
|
Fine-tuning
Few-shot learning
Multi-task learning
Zero-shot learning
|
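
A minimal fine-tuning step tying together several entries here: the 5e-5 under gptkbp:learnsMove reads as a learning rate, the cross-entropy loss under gptkbp:losses is what the model returns when labels are passed, and Adam (gptkbp:tuning below) is the optimizer. The input/output texts are illustrative placeholders:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)

inputs = tokenizer("summarize: a long illustrative article ...", return_tensors="pt")
labels = tokenizer("a short summary", return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss  # cross-entropy over output tokens
loss.backward()
optimizer.step()
optimizer.zero_grad()
```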
gptkbp:training
|
Several days
Transfer learning
|
gptkbp:trainingPrograms
|
750 GB
|
gptkbp:tuning
|
gptkb:Adam
|
gptkbp:uses
|
text-to-text framework
|
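
In the text-to-text framework every task, classification included, is cast as text in, text out, selected by a task prefix. A generation sketch using one of the paper's standard prefixes:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# The prefix tells the model which task to perform.
ids = tokenizer("translate English to German: The house is wonderful.",
                return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```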
gptkbp:utilizes
|
Self-attention
Cross-attention
|
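
Self-attention and cross-attention share the same core computation; only the source of keys and values differs (in cross-attention the decoder attends to encoder states). A generic scaled dot-product sketch, not T5's exact variant, which adds learned relative position biases and folds the scaling into its initialization:

```python
import torch
import torch.nn.functional as F

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q, k, v: (batch, heads, seq_len, head_dim); the softmax here produces
    # the attention weights referred to under gptkbp:produces.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v
```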
gptkbp:weight
|
gptkb:Xavier_initialization
|
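
A sketch of Xavier (Glorot) initialization as recorded here, applied to one linear layer with T5-Base's dimensions (for reference, the released T5 code uses its own scaled initializers, so treat this as illustrative):

```python
import torch.nn as nn

layer = nn.Linear(768, 3072)           # d_model -> d_ff, T5-Base dimensions
nn.init.xavier_uniform_(layer.weight)  # Xavier/Glorot uniform initialization
nn.init.zeros_(layer.bias)
```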