gptkbp:publishes
|
gptkb:Attention_Is_All_You_Need
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
XLNet: Generalized Autoregressive Pretraining for Language Understanding
|