Statements (52)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:large_language_model |
| gptkbp:citation | https://arxiv.org/abs/1909.08053 |
| gptkbp:developedBy | gptkb:NVIDIA |
| gptkbp:firstReleased | 2019 |
| gptkbp:hasFeature | fast training, mixed precision support, checkpointing, efficient memory usage, gradient accumulation, custom datasets, activation checkpointing, custom tokenization, distributed optimizer, flexible model configuration |
| gptkbp:license | gptkb:Apache_License_2.0 |
| gptkbp:openSource | true |
| gptkbp:optimizedFor | gptkb:NVIDIA_GPUs, distributed training |
| gptkbp:programmingLanguage | gptkb:Python |
| gptkbp:purpose | training large transformer models |
| gptkbp:relatedTo | gptkb:PyTorch, deep learning, transformer architecture |
| gptkbp:repository | https://github.com/NVIDIA/Megatron-LM |
| gptkbp:scalableTo | trillions of parameters |
| gptkbp:supports | gptkb:T5, gptkb:GPT-2, gptkb:GPT-3, gptkb:BERT, FP16, data parallelism, mixed precision training, multi-GPU training, pipeline parallelism, tensor parallelism, bfloat16, multi-node training |
| gptkbp:usedBy | gptkb:industry, gptkb:researchers |
| gptkbp:usedFor | natural language processing, text generation, language modeling, fine-tuning, pretraining |
| gptkbp:uses | data parallelism, model parallelism, pipeline parallelism |
| gptkbp:bfsParent | gptkb:Transformer_models, gptkb:Language_modeling, gptkb:Megatron-Turing_NLG |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Megatron-LM |
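The `gptkbp:uses` statements name tensor (model) parallelism, the core technique of the cited paper (https://arxiv.org/abs/1909.08053): a linear layer's weight matrix is split column-wise across GPUs so each rank computes a slice of the output. Below is a minimal single-process sketch of that idea; the names and the single-process simulation are illustrative assumptions, not Megatron-LM's actual API, which shards across real GPUs via torch.distributed.

```python
# Illustrative sketch of Megatron-style column-parallel linear layers
# (arXiv:1909.08053). Two "GPUs" are simulated in one process; a real
# deployment would hold one shard per rank and all-gather the outputs.
import torch

torch.manual_seed(0)
batch, d_in, d_out, world_size = 4, 8, 16, 2  # hypothetical sizes

x = torch.randn(batch, d_in)   # activations, replicated on every rank
w = torch.randn(d_in, d_out)   # full weight of the linear layer y = x @ w

# Tensor (column) parallelism: each rank owns a column slice of the weight
# and computes the matching slice of the output independently.
shards = torch.chunk(w, world_size, dim=1)   # per-rank weight shards
partial = [x @ shard for shard in shards]    # per-rank partial outputs
y_parallel = torch.cat(partial, dim=1)       # stand-in for an all-gather

# The sharded computation reproduces the unsharded result exactly.
assert torch.allclose(x @ w, y_parallel)
```

Column-parallel layers pair with row-parallel ones in the paper so that a full transformer MLP block needs only one all-reduce per forward pass; the sketch above shows just the column half of that pattern.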