| predicate | object |
| --- | --- |
| gptkbp:instanceOf | large language model |
| gptkbp:activatedBy | gptkb:SwiGLU |
| gptkbp:architecture | gptkb:Mixture_of_Experts (routing sketched below the table) |
| gptkbp:context | 32,768 tokens |
| gptkbp:developedBy | gptkb:Mistral_AI |
| gptkbp:expertModelSize | 7 billion parameters |
| gptkbp:github | https://github.com/mistralai/Mixtral-8x7B |
| gptkbp:hasModel | decoder-only transformer |
| https://www.w3.org/2000/01/rdf-schema#label | Mixtral-8x7B |
| gptkbp:inferenceSpeed | roughly 6x faster inference than Llama 2 70B (per Mistral AI's release announcement) |
| gptkbp:instructVersion | Mixtral-8x7B-Instruct-v0.1, Mixtral-8x7B-Instruct-v0.2 |
| gptkbp:language | gptkb:French, gptkb:German, gptkb:Italian, gptkb:Romanian, gptkb:Spanish, Dutch, English, Polish, Portuguese |
| gptkbp:license | Apache 2.0 |
| gptkbp:modelCard | https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 |
| gptkbp:notableFor | open-weights release, high-efficiency mixture-of-experts architecture |
| gptkbp:numberOfExperts | 8 (top-2 routing: 2 experts active per token) |
| gptkbp:openSource | true |
| gptkbp:parameter | 46.7 billion total, roughly 12.9 billion active per token (estimate below the table) |
| gptkbp:performanceContext | competitive with GPT-3.5 |
| gptkbp:platform | gptkb:microprocessor, gptkb:graphics_card, gptkb:TPU |
| gptkbp:quantizationSupport | 4-bit, 8-bit, 16-bit (loading example below the table) |
| gptkbp:releaseDate | 2023-12-08 |
| gptkbp:supports | translation, reasoning, code generation, question answering, summarization, text generation |
| gptkbp:tokenizer | gptkb:SentencePiece |
| gptkbp:trainer | gptkb:law, gptkb:Wikipedia, gptkb:Common_Crawl, books, web data |
| gptkbp:trainingDataCutoff | 2023 |
| gptkbp:type | multi-head attention |
| gptkbp:bfsParent | gptkb:Hugging_Face_Hub, gptkb:Mistral_API |
| gptkbp:bfsLayer | 7 |
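
The gptkbp:architecture, gptkbp:numberOfExperts and gptkbp:activatedBy rows together describe a sparse mixture-of-experts feed-forward layer: a linear router scores 8 SwiGLU experts and each token is processed by the 2 highest-scoring ones. The sketch below illustrates that routing step; the class names and dimensions are illustrative toy values, not Mixtral's actual configuration or implementation.

```python
# Minimal sparse mixture-of-experts block in the Mixtral style: a linear router
# scores the experts and each token is handled by its top-2 SwiGLU experts.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        # SwiGLU: silu(gate(x)) * up(x), projected back to the model dimension
        return self.down(F.silu(self.gate(x)) * self.up(x))


class Top2MoE(nn.Module):
    def __init__(self, dim: int, hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList([SwiGLUExpert(dim, hidden) for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, dim)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)       # renormalise over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e           # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out


tokens = torch.randn(4, 64)                     # 4 tokens, model dim 64 (toy sizes)
print(Top2MoE(dim=64, hidden=256)(tokens).shape)    # torch.Size([4, 64])
```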
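
The gptkbp:parameter row (46.7 billion total) is smaller than 8 x 7 billion because the experts replicate only the feed-forward blocks; attention, router and embedding weights are shared across experts. Below is a back-of-the-envelope estimate using configuration values as reported in the "Mixtral of Experts" paper (hidden size 4096, feed-forward size 14336, 32 layers, 8 key/value heads, 32,000-token vocabulary); treat these values as assumptions, not a re-derivation of the official count.

```python
# Back-of-the-envelope parameter count for a Mixtral-8x7B-shaped model.
# Config values are assumptions taken from the "Mixtral of Experts" paper.
dim, ffn, layers, experts, top_k = 4096, 14336, 32, 8, 2
heads, kv_heads, head_dim, vocab = 32, 8, 128, 32000

expert_ffn = 3 * dim * ffn                         # gate, up, down projections per expert
attention = (dim * heads * head_dim                # query projection
             + 2 * dim * kv_heads * head_dim       # key and value projections
             + heads * head_dim * dim)             # output projection
shared_per_layer = attention + dim * experts       # attention + router (norms omitted)
embeddings = 2 * vocab * dim                       # input embedding + output head

total = layers * (shared_per_layer + experts * expert_ffn) + embeddings
active = layers * (shared_per_layer + top_k * expert_ffn) + embeddings

print(f"total  ~ {total / 1e9:.1f} B")   # ~46.7 B, matching the figure in the table
print(f"active ~ {active / 1e9:.1f} B")  # ~12.9 B per token with top-2 routing
```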
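
The gptkbp:quantizationSupport and gptkbp:modelCard rows point to 4-, 8- and 16-bit use through the Hugging Face ecosystem. Below is a minimal loading sketch with transformers and bitsandbytes, assuming a CUDA GPU with enough memory for the 4-bit weights (roughly 25-30 GB) and that transformers, accelerate and bitsandbytes are installed; the repository id is taken from the model-card URL above.

```python
# Sketch: load Mixtral-8x7B-Instruct-v0.1 with 4-bit weights and run one prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"   # from the model card above

quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant, device_map="auto"
)

messages = [{"role": "user", "content": "Summarise the Apache 2.0 licence in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```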