Mixtral-8x7B

GPTKB entity

Statements (53)
Predicate Object
gptkbp:instanceOf large language model
gptkbp:activatedBy gptkb:SwiGLU
gptkbp:architecture gptkb:Mixture_of_Experts
gptkbp:context 32,768 tokens
gptkbp:developedBy gptkb:Mistral_AI
gptkbp:expertModelSize 7 billion
gptkbp:github https://github.com/mistralai/Mixtral-8x7B
gptkbp:hasModel decoder-only transformer
https://www.w3.org/2000/01/rdf-schema#label Mixtral-8x7B
gptkbp:inferenceSpeed faster than Llama 2 70B
gptkbp:instructVersion Mixtral-8x7B-Instruct-v0.1, Mixtral-8x7B-Instruct-v0.2
gptkbp:language gptkb:French, gptkb:German, gptkb:Italian, gptkb:Romanian, gptkb:Spanish, Dutch, English, Polish, Portuguese
gptkbp:license Apache 2.0
gptkbp:modelCard https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
gptkbp:notableFor open weights release, high-efficiency mixture-of-experts architecture
gptkbp:numberOfExperts 8
gptkbp:openSource true
gptkbp:parameter 46.7 billion
gptkbp:performanceContext competitive with GPT-3.5
gptkbp:platform gptkb:microprocessor, gptkb:graphics_card, gptkb:TPU
gptkbp:quantizationSupport 4-bit, 8-bit, 16-bit
gptkbp:releaseDate 2023-12-08
gptkbp:supports translation, reasoning, code generation, question answering, summarization, text generation
gptkbp:tokenizer gptkb:SentencePiece
gptkbp:trainer gptkb:law, gptkb:Wikipedia, gptkb:Common_Crawl, books, web data
gptkbp:trainingDataCutoff 2023
gptkbp:type multi-head attention
gptkbp:bfsParent gptkb:Hugging_Face_Hub, gptkb:Mistral_API
gptkbp:bfsLayer 7
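The architecture statements above (Mixture_of_Experts, 8 experts of 7 billion parameters each, faster inference than a dense model of comparable quality) come from sparse top-2 routing: a router scores all experts per token but only the two highest-scoring ones are evaluated. The sketch below is a toy illustration of that routing, not Mixtral's implementation; the scalar "experts" and random router weights are invented, whereas in Mixtral each expert is a full SwiGLU feed-forward block inside every transformer layer.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def moe_forward(x, experts, gates, top_k=2):
    # Router: a linear layer scores every expert for this token.
    logits = [sum(w * xi for w, xi in zip(g, x)) for g in gates]
    probs = softmax(logits)
    # Keep only the top_k experts (Mixtral routes each token to 2 of 8).
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    # Weighted sum of the selected experts' outputs; the other six experts
    # are never evaluated, which is where the inference speedup comes from.
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)
        for j, yj in enumerate(y):
            out[j] += (probs[i] / norm) * yj
    return out, chosen

# Toy setup: 8 "experts" that just scale the input, and a random router.
random.seed(0)
dim, n_experts = 4, 8
experts = [lambda v, k=k: [(k + 1) * vi for vi in v] for k in range(n_experts)]
gates = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]

out, chosen = moe_forward([0.5, -1.0, 0.25, 2.0], experts, gates)
```

Only 2 of the 8 experts run per token, which is why active parameters per token are far below the 46.7 billion total listed above.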
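The quantizationSupport statement (4-, 8-, and 16-bit) determines the deployment footprint. A back-of-envelope sketch of weight memory for the 46.7 billion parameters listed above, ignoring activations, KV cache, and per-block quantization overhead such as scales and zero points:

```python
PARAMS = 46.7e9  # total parameter count from the statements above

def weight_memory_gb(params, bits):
    """Approximate weight storage only, in gigabytes."""
    return params * bits / 8 / 1e9

sizes = {bits: weight_memory_gb(PARAMS, bits) for bits in (16, 8, 4)}
# roughly 93.4 GB at 16-bit, 46.7 GB at 8-bit, 23.4 GB at 4-bit
```

4-bit quantization is what brings the model within reach of a single high-memory accelerator.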