| predicate | object |
| --- | --- |
| gptkbp:instanceOf | large language model |
| gptkbp:activatedBy | gptkb:SwiGLU |
| gptkbp:architecture | gptkb:Mixture_of_Experts (routing sketched below the table) |
| gptkbp:context | 32,768 tokens |
| gptkbp:developedBy | gptkb:Mistral_AI |
| gptkbp:expertModelSize | 7 billion parameters |
| gptkbp:github | https://github.com/mistralai/Mixtral-8x7B |
| gptkbp:hasModel | decoder-only transformer |
| https://www.w3.org/2000/01/rdf-schema#label | Mixtral-8x7B |
| gptkbp:inferenceSpeed | roughly 6x faster inference than Llama 2 70B (per Mistral AI's release announcement) |
| gptkbp:instructVersion | Mixtral-8x7B-Instruct-v0.1, Mixtral-8x7B-Instruct-v0.2 |
| gptkbp:language | gptkb:French, gptkb:German, gptkb:Italian, gptkb:Romanian, gptkb:Spanish, Dutch, English, Polish, Portuguese |
| gptkbp:license | Apache 2.0 |
| gptkbp:modelCard | https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 |
| gptkbp:notableFor | open-weights release, high-efficiency mixture-of-experts architecture |
| gptkbp:numberOfExperts | 8 (top-2 routing: 2 experts active per token) |
| gptkbp:openSource | true |
| gptkbp:parameter | 46.7 billion total, roughly 12.9 billion active per token (estimate below the table) |
| gptkbp:performanceContext | competitive with GPT-3.5 |
| gptkbp:platform | gptkb:microprocessor, gptkb:graphics_card, gptkb:TPU |
| gptkbp:quantizationSupport | 4-bit, 8-bit, 16-bit (loading example below the table) |
| gptkbp:releaseDate | 2023-12-08 |
| gptkbp:supports | translation, reasoning, code generation, question answering, summarization, text generation |
| gptkbp:tokenizer | gptkb:SentencePiece |
| gptkbp:trainer | gptkb:law, gptkb:Wikipedia, gptkb:Common_Crawl, books, web data |
| gptkbp:trainingDataCutoff | 2023 |
| gptkbp:type | multi-head attention |
| gptkbp:bfsParent | gptkb:Hugging_Face_Hub, gptkb:Mistral_API |
| gptkbp:bfsLayer | 7 |
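
The gptkbp:architecture, gptkbp:numberOfExperts and gptkbp:activatedBy rows together describe a sparse mixture-of-experts feed-forward layer: a linear router scores 8 SwiGLU experts and each token is processed by the 2 highest-scoring ones. The sketch below illustrates that routing step; the class names and dimensions are illustrative toy values, not Mixtral's actual configuration or implementation.

```python
# Minimal sparse mixture-of-experts block in the Mixtral style: a linear router
# scores the experts and each token is handled by its top-2 SwiGLU experts.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        # SwiGLU: silu(gate(x)) * up(x), projected back to the model dimension
        return self.down(F.silu(self.gate(x)) * self.up(x))


class Top2MoE(nn.Module):
    def __init__(self, dim: int, hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList([SwiGLUExpert(dim, hidden) for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, dim)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)       # renormalise over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e           # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out


tokens = torch.randn(4, 64)                     # 4 tokens, model dim 64 (toy sizes)
print(Top2MoE(dim=64, hidden=256)(tokens).shape)    # torch.Size([4, 64])
```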
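
The gptkbp:parameter row (46.7 billion total) is smaller than 8 x 7 billion because the experts replicate only the feed-forward blocks; attention, router and embedding weights are shared across experts. Below is a back-of-the-envelope estimate using configuration values as reported in the "Mixtral of Experts" paper (hidden size 4096, feed-forward size 14336, 32 layers, 8 key/value heads, 32,000-token vocabulary); treat these values as assumptions, not a re-derivation of the official count.

```python
# Back-of-the-envelope parameter count for a Mixtral-8x7B-shaped model.
# Config values are assumptions taken from the "Mixtral of Experts" paper.
dim, ffn, layers, experts, top_k = 4096, 14336, 32, 8, 2
heads, kv_heads, head_dim, vocab = 32, 8, 128, 32000

expert_ffn = 3 * dim * ffn                         # gate, up, down projections per expert
attention = (dim * heads * head_dim                # query projection
             + 2 * dim * kv_heads * head_dim       # key and value projections
             + heads * head_dim * dim)             # output projection
shared_per_layer = attention + dim * experts       # attention + router (norms omitted)
embeddings = 2 * vocab * dim                       # input embedding + output head

total = layers * (shared_per_layer + experts * expert_ffn) + embeddings
active = layers * (shared_per_layer + top_k * expert_ffn) + embeddings

print(f"total  ~ {total / 1e9:.1f} B")   # ~46.7 B, matching the figure in the table
print(f"active ~ {active / 1e9:.1f} B")  # ~12.9 B per token with top-2 routing
```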
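
The gptkbp:quantizationSupport and gptkbp:modelCard rows point to 4-, 8- and 16-bit use through the Hugging Face ecosystem. Below is a minimal loading sketch with transformers and bitsandbytes, assuming a CUDA GPU with enough memory for the 4-bit weights (roughly 25-30 GB) and that transformers, accelerate and bitsandbytes are installed; the repository id is taken from the model-card URL above.

```python
# Sketch: load Mixtral-8x7B-Instruct-v0.1 with 4-bit weights and run one prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"   # from the model card above

quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant, device_map="auto"
)

messages = [{"role": "user", "content": "Summarise the Apache 2.0 licence in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```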