Megatron-LM

Statements (52)
Predicate Object
gptkbp:instanceOf large language model
gptkbp:citation https://arxiv.org/abs/1909.08053
gptkbp:developedBy gptkb:NVIDIA
gptkbp:firstReleased 2019
gptkbp:hasFeature fast training, mixed precision support, checkpointing, efficient memory usage, gradient accumulation, custom datasets, activation checkpointing, custom tokenization, distributed optimizer, flexible model configuration
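Note: several of the features above (mixed precision support, gradient accumulation, activation checkpointing) are general training techniques. A minimal sketch of how they combine in plain PyTorch is given below; it is not Megatron-LM's own training loop, the layer sizes, ACCUM_STEPS, and random data are hypothetical, and a CUDA-capable GPU is assumed.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

device = "cuda"  # assumes a CUDA-capable GPU
block = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).to(device)
head = nn.Linear(1024, 1024).to(device)
optimizer = torch.optim.AdamW(list(block.parameters()) + list(head.parameters()), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()   # loss scaling for FP16 training
ACCUM_STEPS = 8                        # gradient accumulation factor (hypothetical)

optimizer.zero_grad(set_to_none=True)
for step in range(32):
    x = torch.randn(4, 1024, device=device)           # made-up data
    y = torch.randn(4, 1024, device=device)
    with torch.autocast(device_type="cuda", dtype=torch.float16):   # mixed precision
        h = checkpoint(block, x, use_reentrant=False)  # activation checkpointing: recompute in backward
        loss = nn.functional.mse_loss(head(h), y) / ACCUM_STEPS
    scaler.scale(loss).backward()                      # accumulate scaled gradients
    if (step + 1) % ACCUM_STEPS == 0:
        scaler.step(optimizer)                         # unscale and apply the update
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```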
https://www.w3.org/2000/01/rdf-schema#label Megatron-LM
gptkbp:license gptkb:Apache_License_2.0
gptkbp:openSource true
gptkbp:optimizedFor gptkb:NVIDIA_GPUs, distributed training
gptkbp:programmingLanguage gptkb:Python
gptkbp:purpose training large transformer models
gptkbp:relatedTo gptkb:PyTorch, deep learning, transformer architecture
gptkbp:repository https://github.com/NVIDIA/Megatron-LM
gptkbp:scalableTo trillions of parameters
gptkbp:supports gptkb:T5, gptkb:GPT-2, gptkb:GPT-3, gptkb:BERT, FP16, data parallelism, mixed precision training, multi-GPU training, pipeline parallelism, tensor parallelism, bfloat16, multi-node training
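Note on the tensor parallelism entry above: the technique introduced in the Megatron-LM paper splits a transformer MLP's two weight matrices across GPUs, the first by columns and the second by rows, so a single all-reduce after the second matmul recovers the unsharded result. The sketch below simulates that idea in one process with two hypothetical ranks and tiny made-up shapes; it is a conceptual illustration, not Megatron-LM's actual code.

```python
import torch

torch.manual_seed(0)
hidden, ffn, world = 8, 32, 2                 # tiny, made-up sizes; 2 simulated "GPUs"
x = torch.randn(4, hidden)                    # activations, replicated on every rank

A = torch.randn(hidden, ffn)                  # first MLP weight, split by columns
B = torch.randn(ffn, hidden)                  # second MLP weight, split by rows
A_shards = A.chunk(world, dim=1)              # each rank holds hidden x (ffn / world)
B_shards = B.chunk(world, dim=0)              # each rank holds (ffn / world) x hidden

# Each simulated rank computes its partial output with no communication.
partials = [torch.nn.functional.gelu(x @ A_i) @ B_i
            for A_i, B_i in zip(A_shards, B_shards)]

# The sum below is what an all-reduce across GPUs would perform.
y_parallel = sum(partials)
y_reference = torch.nn.functional.gelu(x @ A) @ B
print(torch.allclose(y_parallel, y_reference, atol=1e-5))   # True
```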
gptkbp:usedBy gptkb:industry, gptkb:researchers
gptkbp:usedFor natural language processing, text generation, language modeling, fine-tuning, pretraining
gptkbp:uses data parallelism, model parallelism, pipeline parallelism
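Note on how these compose: Megatron-LM combines the three parallelism dimensions multiplicatively, with the data-parallel size typically derived from the total GPU count and the chosen tensor- and pipeline-parallel sizes. The configuration below is hypothetical.

```python
# Hypothetical example of how the parallelism degrees compose; the sizes are
# illustrative, not a recommendation.
world_size = 64            # total number of GPUs across all nodes
tensor_parallel = 8        # ranks that jointly hold each layer's weight matrices
pipeline_parallel = 4      # ranks that each hold a contiguous group of layers

assert world_size % (tensor_parallel * pipeline_parallel) == 0
data_parallel = world_size // (tensor_parallel * pipeline_parallel)
print(data_parallel)       # 2 model replicas, each fed a different slice of the batch
```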
gptkbp:bfsParent gptkb:NVIDIA_AI_Research, gptkb:Megatron-Turing_NLG, gptkb:GPT-NeoX
gptkbp:bfsLayer 6