gptkbp:instanceOf
|
large language model
|
gptkbp:activeExpertsPerToken
|
4
|
gptkbp:architecture
|
decoder-only transformer
|
gptkbp:availableOn
|
gptkb:Hugging_Face
gptkb:Databricks_Model_Serving
|
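Both released variants are published on Hugging Face under the databricks organization (databricks/dbrx-base and databricks/dbrx-instruct). A minimal sketch of loading the instruction-tuned variant with the transformers library; it assumes a transformers release with native DBRX support (earlier releases required trust_remote_code=True) and enough GPU memory for the full checkpoint:

```python
# Minimal sketch: load DBRX Instruct from Hugging Face and generate.
# Assumes a recent transformers version with DBRX support and enough
# accelerator memory for the 132B-parameter checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,  # weights are distributed in bf16
    device_map="auto",           # shard across available GPUs
)

messages = [{"role": "user", "content": "What is a Mixture-of-Experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```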
gptkbp:contextLength
|
32K tokens (32,768)
|
gptkbp:contrastsWith
|
gptkb:Llama_2_70B
gptkb:GPT-3.5
gptkb:Mixtral_8x7B
|
gptkbp:developedBy
|
gptkb:Databricks
|
gptkbp:github
|
https://github.com/databricks/dbrx
|
gptkbp:hasModel
|
DBRX Base (base model)
DBRX Instruct (instruction-tuned model)
|
https://www.w3.org/2000/01/rdf-schema#label
|
DBRX
|
gptkbp:improves
|
gptkb:Llama_2_70B
gptkb:GPT-3.5
gptkb:Mixtral_8x7B
|
gptkbp:intendedUse
|
research
commercial
|
gptkbp:license
|
gptkb:Databricks_Open_Model_License
|
gptkbp:notableFor
|
scalable architecture
high performance on benchmarks
open weights
efficient inference
|
gptkbp:notablePublication
|
gptkb:DBRX:_A_Large,_Open,_Mixture-of-Experts_Language_Model
https://arxiv.org/abs/2403.13485
|
gptkbp:numberOfExperts
|
16
|
gptkbp:openSource
|
true
|
gptkbp:parameter
|
132B total (36B active on any input)
|
gptkbp:releaseDate
|
2024-03-27
|
gptkbp:supports
|
gptkb:law
English
reasoning
question answering
summarization
text generation
math
instruction following
|
gptkbp:trainingDataSize
|
12T tokens
|
gptkbp:trainingDataSource
|
gptkb:Databricks_curated_datasets
public datasets
|
gptkbp:trainingDataType
|
gptkb:text
|
gptkbp:uses
|
fine-grained Mixture-of-Experts (MoE) architecture
|
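The numbers above fit together: each MoE layer holds 16 experts and routes every token to 4 of them, so only a fraction of the 132B total parameters (about 36B) participates in any forward pass, which is what the "efficient inference" entry refers to. A toy sketch of top-4-of-16 routing with a softmax gate follows; the shapes, gating details, and expert networks are illustrative assumptions, not DBRX's actual implementation:

```python
# Toy sketch of top-4-of-16 expert routing in a fine-grained MoE layer.
# Dimensions and the single-Linear "experts" are made up for brevity.
import torch
import torch.nn.functional as F

num_experts, top_k, d_model = 16, 4, 64
x = torch.randn(10, d_model)                  # 10 token embeddings
router = torch.nn.Linear(d_model, num_experts)
experts = torch.nn.ModuleList(
    [torch.nn.Linear(d_model, d_model) for _ in range(num_experts)]
)

logits = router(x)                            # (10, 16) routing scores
weights, idx = logits.topk(top_k, dim=-1)     # pick the best 4 experts per token
weights = F.softmax(weights, dim=-1)          # normalize over the 4 chosen

out = torch.zeros_like(x)
for t in range(x.size(0)):                    # dispatch token-by-token
    for w, e in zip(weights[t], idx[t].tolist()):
        out[t] += w * experts[e](x[t])        # weighted sum of 4 expert outputs
```

Real implementations batch tokens per expert rather than looping, but the routing logic, a learned gate scoring all 16 experts and keeping the top 4, is the core of the technique.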
gptkbp:bfsParent
|
gptkb:transformer_(machine_learning_model)
|
gptkbp:bfsLayer
|
6
|