NVIDIA NCCL

GPTKB entity

Statements (55)
Predicate Object
gptkbp:instance_of gptkb:Library
gptkbp:available_at gptkb:intellectual_property
gptkbp:available_on gptkb:Linux
gptkb:Windows
gptkbp:designed_for high-performance computing
gptkbp:developed_by gptkb:NVIDIA
gptkbp:has tutorials
community support
user guides
performance benchmarks
https://www.w3.org/2000/01/rdf-schema#label NVIDIA NCCL
gptkbp:integrates_with gptkb:TensorFlow
gptkb:Caffe
gptkb:MXNet
gptkb:PyTorch
gptkbp:is_compatible_with gptkb:CUDA
gptkbp:is_optimized_for gptkb:NVIDIA_GPUs
gptkbp:provides gptkb:Documentation
API for developers
error handling
performance monitoring
scalability
asynchronous communication
point-to-point communication
reduce operations
synchronous communication
high bandwidth communication
broadcast operations
sample codes
gather operations
all-reduce operations
scatter operations
gptkbp:purpose collective communication
gptkbp:released_in gptkb:2016
gptkbp:supports gptkb:Ethernet
gptkb:NVIDIA_RTX
gptkb:NVIDIA_DGX_systems
gptkb:InfiniBand
gptkb:NVIDIA_NVLink
gptkb:NVIDIA_HPC_SDK
gptkb:NVIDIA_A100
gptkb:NVIDIA_T4
gptkb:NVIDIA_V100
data parallelism
model parallelism
mixed precision training
multi-node communication
multi-GPU communication
single-node communication
gptkbp:used_in deep learning frameworks
gptkbp:uses direct communication
ring algorithm
tree algorithm
gptkbp:bfsParent gptkb:NVIDIA_V100_Tensor_Core_GPU
gptkbp:bfsLayer 5
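
The collective operations listed under gptkbp:provides (all-reduce, broadcast, reduce, gather, scatter) are exposed through NCCL's C API and run on CUDA streams. Below is a minimal sketch of a single-process, multi-GPU all-reduce; the two-GPU device list and buffer size are assumptions for illustration only, and error checking is omitted.

    #include <cuda_runtime.h>
    #include <nccl.h>

    #define NGPUS 2          /* assumed device count for this sketch */
    #define COUNT (1 << 20)  /* elements per buffer */

    int main(void) {
      ncclComm_t comms[NGPUS];
      int devs[NGPUS] = {0, 1};
      float *sendbuff[NGPUS], *recvbuff[NGPUS];
      cudaStream_t streams[NGPUS];

      /* Allocate device buffers and a stream per GPU. */
      for (int i = 0; i < NGPUS; ++i) {
        cudaSetDevice(i);
        cudaMalloc((void**)&sendbuff[i], COUNT * sizeof(float));
        cudaMalloc((void**)&recvbuff[i], COUNT * sizeof(float));
        cudaStreamCreate(&streams[i]);
      }

      /* One communicator per GPU, all owned by this single process. */
      ncclCommInitAll(comms, NGPUS, devs);

      /* Sum sendbuff across all GPUs; every GPU receives the result. */
      ncclGroupStart();
      for (int i = 0; i < NGPUS; ++i)
        ncclAllReduce(sendbuff[i], recvbuff[i], COUNT, ncclFloat, ncclSum,
                      comms[i], streams[i]);
      ncclGroupEnd();

      /* The collectives are asynchronous; wait on each stream. */
      for (int i = 0; i < NGPUS; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
      }

      for (int i = 0; i < NGPUS; ++i) {
        ncclCommDestroy(comms[i]);
        cudaSetDevice(i);
        cudaFree(sendbuff[i]);
        cudaFree(recvbuff[i]);
        cudaStreamDestroy(streams[i]);
      }
      return 0;
    }

Swapping ncclAllReduce for ncclBroadcast, ncclReduce, ncclAllGather, or ncclReduceScatter gives the other collectives; all take a communicator and a CUDA stream, which is how NCCL provides the asynchronous, stream-ordered communication noted in the statements above.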
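
For the multi-node and multi-GPU communication listed under gptkbp:supports, the usual pattern is one rank per GPU: rank 0 creates an ncclUniqueId and distributes it out-of-band (MPI is assumed here purely for that exchange), then every rank joins the same communicator with ncclCommInitRank. A minimal sketch, with the rank-to-GPU mapping as an assumption:

    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <nccl.h>

    int main(int argc, char* argv[]) {
      int rank, nranks;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nranks);

      /* Rank 0 creates the NCCL id; everyone receives it over MPI. */
      ncclUniqueId id;
      if (rank == 0) ncclGetUniqueId(&id);
      MPI_Bcast(&id, sizeof(id), MPI_BYTE, 0, MPI_COMM_WORLD);

      /* Assumed mapping: one rank per local GPU, 8 GPUs per node. */
      int ngpus_per_node = 8;
      cudaSetDevice(rank % ngpus_per_node);

      /* All ranks, across all nodes, join a single communicator. */
      ncclComm_t comm;
      ncclCommInitRank(&comm, nranks, id, rank);

      /* ... collectives such as ncclAllReduce go here, one call per rank ... */

      ncclCommDestroy(comm);
      MPI_Finalize();
      return 0;
    }

Which transport carries the traffic (NVLink inside a node, InfiniBand or Ethernet between nodes) is chosen by NCCL when the communicator is created.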
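
The point-to-point and direct communication statements correspond to ncclSend and ncclRecv, typically wrapped in a group call so that matching sends and receives are issued together. A short sketch, assuming sendbuff, recvbuff, count, peer, comm, and stream are already set up as in the examples above:

    /* Exchange a buffer with one peer rank over an existing communicator.
       Grouping the send and receive lets NCCL progress both together and
       avoids a deadlock when both sides post in the same order. */
    ncclGroupStart();
    ncclSend(sendbuff, count, ncclFloat, peer, comm, stream);
    ncclRecv(recvbuff, count, ncclFloat, peer, comm, stream);
    ncclGroupEnd();
    cudaStreamSynchronize(stream);

The ring and tree algorithms listed under gptkbp:uses are selected by NCCL automatically based on message size and topology; setting the NCCL_ALGO environment variable (for example NCCL_ALGO=Ring or NCCL_ALGO=Tree) restricts collectives to one of them.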