Mixture of Experts Transformer

GPTKB entity

Statements (23)
Predicate Object
gptkbp:instanceOf gptkb:model
gptkbp:instanceOf gptkb:convolutional_neural_network
gptkbp:enables parameter efficiency
gptkbp:enables efficient training of large models
gptkbp:enables scaling to billions of parameters
gptkbp:feature scalability
gptkbp:feature conditional computation
gptkbp:feature sparse activation
gptkbp:feature multiple expert networks
gptkbp:feature routing mechanism
gptkbp:field gptkb:artificial_intelligence
gptkbp:field gptkb:machine_learning
gptkbp:field deep learning
gptkbp:firstPublished 2021
https://www.w3.org/2000/01/rdf-schema#label Mixture of Experts Transformer
gptkbp:notablePublication Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (2021)
gptkbp:relatedTo gptkb:Google
gptkbp:relatedTo gptkb:Switch_Transformer
gptkbp:relatedTo gptkb:GShard
gptkbp:usedIn natural language processing
gptkbp:usedIn large language models
gptkbp:bfsParent gptkb:MoE_Transformer
gptkbp:bfsLayer 6
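The gptkbp:feature statements above (conditional computation, sparse activation, multiple expert networks, routing mechanism) together describe switch-style top-1 routing of the kind named in the notable publication. The following is a minimal NumPy sketch of that idea, not a reference implementation: all dimensions, weight names, and the ReLU expert MLPs are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, chosen only for illustration.
d_model, d_ff, n_experts, n_tokens = 8, 16, 4, 10

# Multiple expert networks: each expert is an independent
# two-layer feed-forward network.
W1 = rng.normal(size=(n_experts, d_model, d_ff)) * 0.1
W2 = rng.normal(size=(n_experts, d_ff, d_model)) * 0.1

# Routing mechanism: a linear layer producing one logit per expert.
W_router = rng.normal(size=(d_model, n_experts)) * 0.1

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_layer(tokens):
    """Switch-style top-1 routing: each token is dispatched to exactly
    one expert, so only that expert's weights are active for the token
    (conditional computation / sparse activation)."""
    gate_probs = softmax(tokens @ W_router)        # (n_tokens, n_experts)
    expert_idx = gate_probs.argmax(axis=-1)        # top-1 expert per token
    out = np.zeros_like(tokens)
    for e in range(n_experts):
        mask = expert_idx == e
        if not mask.any():
            continue                               # this expert stays idle
        h = np.maximum(tokens[mask] @ W1[e], 0.0)  # expert FFN with ReLU
        # Scale by the gate probability so the router would receive
        # gradient in a trained model.
        out[mask] = (h @ W2[e]) * gate_probs[mask, e:e + 1]
    return out

tokens = rng.normal(size=(n_tokens, d_model))
print(moe_layer(tokens).shape)                     # (10, 8)

# Parameter efficiency: total parameters grow with n_experts, but the
# count active per token stays roughly constant (one expert + router).
total_params = W1.size + W2.size + W_router.size
active_per_token = W1[0].size + W2[0].size + W_router.size
print(total_params, active_per_token)              # 1056 vs. 288 here
```

In this sketch the expert weights scale linearly with n_experts while per-token compute does not, which is the sense in which the gptkbp:enables statements (parameter efficiency, scaling to billions of parameters) apply.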