Statements (60)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:Amazon_Web_Services |
gptkbp:can_be_extended_by | custom backends |
gptkbp:can_handle | batch requests |
gptkbp:deployment | cloud environments |
gptkbp:deployment | edge devices |
gptkbp:developed_by | gptkb:NVIDIA |
gptkbp:has | community support |
https://www.w3.org/2000/01/rdf-schema#label | Triton Inference Server |
gptkbp:is_available_on | gptkb:Git_Hub |
gptkbp:is_compatible_with | gptkb:Kubernetes |
gptkbp:is_compatible_with | gptkb:Docker |
gptkbp:is_compatible_with | various programming languages |
gptkbp:is_designed_for | low latency |
gptkbp:is_designed_for | high throughput |
gptkbp:is_designed_for | AI inference tasks |
gptkbp:is_designed_for | scalable inference |
gptkbp:is_designed_to | simplify inference workflows |
gptkbp:is_documented_in | NVIDIA documentation |
gptkbp:is_integrated_with | gptkb:NVIDIA_GPUs |
gptkbp:is_integrated_with | NVIDIA software stack |
gptkbp:is_open_source | gptkb:true |
gptkbp:is_optimized_for | NVIDIA hardware |
gptkbp:is_part_of | gptkb:NVIDIA_AI_platform |
gptkbp:is_part_of | AI model deployment solutions |
gptkbp:is_used_by | gptkb:researchers |
gptkbp:is_used_by | data scientists |
gptkbp:is_used_by | machine learning engineers |
gptkbp:is_used_for | real-time inference |
gptkbp:is_used_for | batch inference |
gptkbp:is_used_in | AI applications |
gptkbp:is_used_in | production environments |
gptkbp:offers | HTTP/gRPC APIs |
gptkbp:provides | API documentation |
gptkbp:provides | load balancing |
gptkbp:provides | metrics and logging |
gptkbp:provides | performance optimization |
gptkbp:provides | user-friendly interface |
gptkbp:provides | model versioning |
gptkbp:provides | model serving |
gptkbp:released_in | gptkb:2020 |
gptkbp:runs_through | gptkb:NVIDIA_GPUs |
gptkbp:runs_through | CPUs |
gptkbp:suitable_for | large scale deployments |
gptkbp:supports | gptkb:Tensor_Flow |
gptkbp:supports | gptkb:Tensor_RT |
gptkbp:supports | gptkb:Py_Torch |
gptkbp:supports | gptkb:Oni |
gptkbp:supports | multiple frameworks |
gptkbp:supports | model optimization |
gptkbp:supports | model monitoring |
gptkbp:supports | Python API |
gptkbp:supports | C++ API |
gptkbp:supports | model repository |
gptkbp:supports | ensemble models |
gptkbp:supports | model metadata |
gptkbp:supports | dynamic batching |
gptkbp:supports | multi-model serving |
gptkbp:supports | A/B testing |
gptkbp:bfsParent | gptkb:DGX_A100 |
gptkbp:bfsLayer | 6 |
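The table lists a model repository, dynamic batching, and model versioning among Triton's capabilities. As a concrete illustration, a minimal sketch of a `config.pbtxt` that a model in a Triton model repository might use is shown below; the model name, backend, and tensor names/shapes are illustrative assumptions, not taken from the statements above.

```protobuf
# Hypothetical config.pbtxt for a model served from a Triton model repository.
# Model name, backend choice, and tensor names/shapes are assumptions.
name: "example_model"
backend: "onnxruntime"
max_batch_size: 32

input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]

# Dynamic batching: Triton groups individual client requests into
# server-side batches to raise throughput at low latency cost.
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}

# Model versioning: serve only the newest version found in the repository.
version_policy: { latest: { num_versions: 1 } }
```

In a typical repository layout this file sits next to numbered version directories (e.g. `example_model/config.pbtxt` and `example_model/1/` holding the model file), and the server exposes the model over the HTTP/gRPC APIs listed in the table.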