gptkb:Perceiver_IO_Transformer | cross-attention
gptkb:DeBERTa-XLarge | disentangled attention
gptkb:DeBERTa-XXLarge | disentangled attention
gptkb:DeBERTa-Base | disentangled attention
gptkb:Pointer-Generator_Networks | yes
gptkb:Perceiver_IO_Transformer | self-attention
gptkb:ViT-L | self-attention
gptkb:ViT-B | self-attention
gptkb:large_language_model | self-attention
gptkb:DeBERTa-Large | disentangled attention
gptkb:Transformer_models | self-attention
gptkb:Google_BERT | self-attention
gptkb:BERT | self-attention
gptkb:Transformer_models | multi-head attention
gptkb:Graph_Attention_Networks | learns weights for neighbors
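The rows above name attention mechanisms without describing them. As a rough illustration only, the sketch below shows scaled dot-product attention in NumPy and how the same primitive gives self-attention (queries, keys, and values from one sequence) versus cross-attention (queries from a separate latent array, as in Perceiver IO). All function and variable names are illustrative assumptions, not drawn from the source data.

```python
# Minimal sketch, assuming plain scaled dot-product attention; not taken from the source.
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """q: (n_q, d); k, v: (n_kv, d). Returns (n_q, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                          # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))        # hypothetical input: 5 tokens, dim 8
latents = rng.normal(size=(2, 8))  # hypothetical latent array

# Self-attention: queries, keys, and values all come from the same sequence.
self_out = scaled_dot_product_attention(x, x, x)

# Cross-attention: queries come from the latent array, keys/values from the input.
cross_out = scaled_dot_product_attention(latents, x, x)

print(self_out.shape, cross_out.shape)  # (5, 8) (2, 8)
```

Multi-head attention runs several such attention maps in parallel over learned projections and concatenates the results; disentangled attention (DeBERTa) additionally separates content and relative-position terms in the score computation.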