Statements (31)
Predicate | Object |
---|---|
gptkbp:instanceOf | machine learning accelerator |
gptkbp:announced | 2022 |
gptkbp:architecture | custom silicon |
gptkbp:compatibleWith | gptkb:AWS_Neuron_SDK |
gptkbp:designedFor | machine learning inference |
gptkbp:feature | energy efficiency, low latency, high throughput, on-chip memory, NeuronCore, NeuronLink, high-speed chip interconnect, hardware-accelerated matrix multiplication, hardware-accelerated stochastic rounding, hardware-accelerated transposition |
https://www.w3.org/2000/01/rdf-schema#label | AWS Inferentia2 |
gptkbp:manufacturer | gptkb:Amazon_Web_Services |
gptkbp:predecessor | gptkb:AWS_Inferentia |
gptkbp:regionAvailability | multiple AWS regions |
gptkbp:supports | large language models, BF16, FP16, INT8, FP32, transformer models, dynamic tensor shapes |
gptkbp:targetUser | enterprises, machine learning developers |
gptkbp:targetWorkload | deep learning inference |
gptkbp:usedIn | Amazon EC2 Inf2 instances |
gptkbp:bfsParent | gptkb:AWS_Inferentia |
gptkbp:bfsLayer | 7 |
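
The feature list names hardware-accelerated stochastic rounding alongside BF16 support. As a rough illustration of what stochastic rounding means in general, the sketch below rounds an FP32 value to BF16 in plain Python. It is a software model only and makes no claim about how Inferentia2 implements the operation in silicon; the function names (`stochastic_round_to_bf16` and the two bit helpers) are hypothetical and not part of the AWS Neuron SDK.

```python
import random
import struct


def f32_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def stochastic_round_to_bf16(x: float, rng: random.Random) -> float:
    """Round an FP32 value to BF16 precision stochastically.

    BF16 keeps the top 16 bits of the FP32 pattern. Adding a uniform
    16-bit random integer before truncating makes the value round away
    from zero with probability equal to the discarded fraction, so the
    result is unbiased on average.
    """
    bits = f32_to_bits(x)
    # Leave infinities and NaNs alone (exponent bits all ones).
    if (bits & 0x7F800000) == 0x7F800000:
        return bits_to_f32(bits & 0xFFFF0000)
    bits += rng.getrandbits(16)
    return bits_to_f32(bits & 0xFFFF0000)


if __name__ == "__main__":
    rng = random.Random(0)
    # One quarter of the way between the BF16 neighbours 1.0 and 1.0078125.
    x = 1.0 + 2.0 ** -9
    samples = [stochastic_round_to_bf16(x, rng) for _ in range(20000)]
    print("distinct results:", sorted(set(samples)))
    print("mean of samples: ", sum(samples) / len(samples))
    print("original value:  ", x)
```

Round-to-nearest would map every such `x` to 1.0 and lose the small increment each time, whereas the stochastic version preserves it in expectation. That property is the usual reason the technique is paired with low-precision formats such as BF16.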