AWS Inferentia2

GPTKB entity

Statements (31)
Predicate Object
gptkbp:instanceOf machine learning accelerator
gptkbp:announced 2022
gptkbp:architecture custom silicon
gptkbp:compatibleWith gptkb:AWS_Neuron_SDK
gptkbp:designedFor machine learning inference
gptkbp:feature energy efficiency
low latency
high throughput
on-chip memory
NeuronCore
NeuronLink high-speed chip interconnect
hardware-accelerated matrix multiplication
hardware-accelerated stochastic rounding
hardware-accelerated transposition
https://www.w3.org/2000/01/rdf-schema#label AWS Inferentia2
gptkbp:manufacturer gptkb:Amazon_Web_Services
gptkbp:predecessor gptkb:AWS_Inferentia
gptkbp:regionAvailability multiple AWS regions
gptkbp:supports large language models
BF16
FP16
INT8
FP32
transformer models
dynamic tensor shapes
gptkbp:targetUser enterprises
machine learning developers
gptkbp:targetWorkload deep learning inference
gptkbp:usedIn Amazon EC2 Inf2 instances
gptkbp:bfsParent gptkb:AWS_Inferentia
gptkbp:bfsLayer 7
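
Reading note: in the statement listing above, a row that carries only an object (for example "low latency" or "BF16") appears to continue the predicate of the nearest preceding row that names one. The Python sketch below shows one way to regroup such rows into a predicate-to-objects mapping. It is an illustration only, not part of GPTKB: the function name `group_statements`, the `ROWS` sample, and the rule "a row starting with `gptkbp:` or `https://` opens a new predicate" are all assumptions made for this example, and an object value that itself begins with `https://` would be misread by this heuristic.

```python
from collections import defaultdict

# A few rows copied from the listing above. Rows without a predicate
# prefix are treated as extra objects of the preceding predicate.
ROWS = """\
gptkbp:instanceOf machine learning accelerator
gptkbp:feature energy efficiency
low latency
high throughput
gptkbp:supports large language models
BF16
FP16
gptkbp:usedIn Amazon EC2 Inf2 instances
"""


def group_statements(text, prefixes=("gptkbp:", "https://")):
    """Group rows into {predicate: [objects]}, carrying the predicate forward.

    Hypothetical helper: the prefix test is an assumed heuristic for
    telling predicate rows apart from continuation rows.
    """
    grouped = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(prefixes):
            # The first whitespace-separated token is the predicate;
            # the rest of the row is its first object.
            current, _, obj = line.partition(" ")
        else:
            # Continuation row: the whole row is another object.
            obj = line
        if current is None:
            raise ValueError("object row found before any predicate row")
        if obj:
            grouped[current].append(obj)
    return dict(grouped)


statements = group_statements(ROWS)
for predicate, objects in statements.items():
    print(predicate, objects)
```

Keeping the objects in a list per predicate preserves the order in which they appear in the listing, which makes it easy to compare the regrouped data against the statement count shown at the top of the page.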