Statements (41)
Predicate | Object |
---|---|
gptkbp:instanceOf | machine learning model architecture |
gptkbp:advantage | scalability, parameter efficiency, specialization |
gptkbp:application | computer vision, natural language processing, speech recognition, recommendation systems |
gptkbp:architecture | modular neural network, conditional computation model |
gptkbp:challenge | load balancing, routing instability, training complexity |
gptkbp:citation | gptkb:Jacobs,_R._A.,_Jordan,_M._I.,_Nowlan,_S._J.,_&_Hinton,_G._E._(1991)._Adaptive_mixtures_of_local_experts._Neural_Computation,_3(1),_79-87. |
gptkbp:component | expert networks, gating network |
gptkbp:field | gptkb:artificial_intelligence, gptkb:machine_learning |
gptkbp:hasComponent | gating function, multiple expert models |
https://www.w3.org/2000/01/rdf-schema#label | Mixture of experts |
gptkbp:introduced | gptkb:Michael_I._Jordan, gptkb:Ronald_A._Jacobs |
gptkbp:introducedIn | 1991 |
gptkbp:notableFor | gptkb:GShard, gptkb:GPT-4_MoE_variant, gptkb:Google_Switch_Transformer |
gptkbp:purpose | divide and conquer learning, enable specialization of sub-models, improve model scalability |
gptkbp:relatedTo | ensemble methods, transformer models, conditional computation, sparse activation |
gptkbp:trainer | backpropagation, expectation-maximization |
gptkbp:usedIn | deep learning, ensemble learning, large language models |
gptkbp:bfsParent | gptkb:Gaussian_mixture_models |
gptkbp:bfsLayer | 6 |
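The `component` and `hasComponent` statements above describe the basic structure: a gating network weights the outputs of multiple expert networks. A minimal dense sketch in NumPy, purely illustrative (the linear experts, gate weights, and dimensions are hypothetical toy values, not from the source; sparse/conditional variants such as Switch Transformer would evaluate only the top-scoring experts):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_experts = 4, 3, 2

# Expert networks: here, plain linear maps (toy parameters).
experts = [rng.standard_normal((d_in, d_out)) for _ in range(n_experts)]
# Gating network: linear scores followed by a softmax.
W_gate = rng.standard_normal((d_in, n_experts))

def moe_forward(x):
    scores = x @ W_gate
    gates = np.exp(scores - scores.max())
    gates /= gates.sum()                          # softmax gate weights, sum to 1
    outputs = np.stack([x @ W for W in experts])  # shape: (n_experts, d_out)
    return gates @ outputs                        # gate-weighted combination

y = moe_forward(rng.standard_normal(d_in))        # y has shape (d_out,)
```

In a dense mixture every expert is evaluated and the gate only reweights them; the scalability advantage listed in the table comes from the conditional-computation variants, where the gate routes each input to a small subset of experts so compute grows far slower than parameter count.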