Mixture of experts

GPTKB entity

Statements (41)
Predicate Object
gptkbp:instanceOf machine learning model architecture
gptkbp:advantage scalability
    parameter efficiency
    specialization
gptkbp:application computer vision
    natural language processing
    speech recognition
    recommendation systems
gptkbp:architecture modular neural network
    conditional computation model
gptkbp:challenge load balancing
    routing instability
    training complexity
gptkbp:citation gptkb:Jacobs,_R._A.,_Jordan,_M._I.,_Nowlan,_S._J.,_&_Hinton,_G._E._(1991)._Adaptive_mixtures_of_local_experts._Neural_Computation,_3(1),_79-87.
gptkbp:component expert networks
    gating network
gptkbp:field gptkb:artificial_intelligence
    gptkb:machine_learning
gptkbp:hasComponent gating function
    multiple expert models
https://www.w3.org/2000/01/rdf-schema#label Mixture of experts
gptkbp:introduced gptkb:Michael_I._Jordan
    gptkb:Robert_A._Jacobs
gptkbp:introducedIn 1991
gptkbp:notableFor gptkb:GShard
    gptkb:GPT-4_MoE_variant
    gptkb:Google_Switch_Transformer
gptkbp:purpose divide and conquer learning
    enable specialization of sub-models
    improve model scalability
gptkbp:relatedTo ensemble methods
    transformer models
    conditional computation
    sparse activation
gptkbp:trainer backpropagation
    expectation-maximization
gptkbp:usedIn deep learning
    ensemble learning
    large language models
gptkbp:bfsParent gptkb:Gaussian_mixture_models
gptkbp:bfsLayer 6
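The components listed above (a gating network routing each input to multiple expert networks, with sparse activation so only a few experts run per input) can be sketched as follows. This is a minimal illustrative example, not part of the GPTKB entry; all class and variable names are hypothetical, and real experts are typically full MLPs rather than single linear maps.

```python
import numpy as np

rng = np.random.default_rng(0)

class MixtureOfExperts:
    """Minimal sparse mixture-of-experts layer (illustrative sketch)."""

    def __init__(self, d_in, d_out, n_experts=4, top_k=2):
        self.top_k = top_k
        # Each expert is a single linear map here; in practice an MLP.
        self.experts = [rng.standard_normal((d_in, d_out)) * 0.1
                        for _ in range(n_experts)]
        # Gating network: produces one score per expert.
        self.gate = rng.standard_normal((d_in, n_experts)) * 0.1

    def __call__(self, x):
        scores = x @ self.gate                  # (n_experts,) gating scores
        top = np.argsort(scores)[-self.top_k:]  # indices of the top-k experts
        w = np.exp(scores[top] - scores[top].max())
        w /= w.sum()                            # softmax over selected experts
        # Conditional computation: only the top-k experts are evaluated.
        return sum(wi * (x @ self.experts[i]) for wi, i in zip(w, top))

moe = MixtureOfExperts(d_in=8, d_out=4, n_experts=4, top_k=2)
y = moe(rng.standard_normal(8))
print(y.shape)  # (4,)
```

The top-k selection is what distinguishes a sparse MoE from a plain ensemble: the gating network decides which specialists to consult, so compute grows with k rather than with the total number of experts.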