Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:machine_learning_accelerator |
| gptkbp:announced | 2022 |
| gptkbp:architecture | custom silicon |
| gptkbp:compatibleWith | gptkb:AWS_Neuron_SDK |
| gptkbp:designedFor | machine learning inference |
| gptkbp:feature | energy efficiency, low latency, high throughput, on-chip memory, NeuronCore, NeuronLink, high-speed chip interconnect, hardware-accelerated matrix multiplication, hardware-accelerated stochastic rounding, hardware-accelerated transposition |
| gptkbp:manufacturer | gptkb:Amazon_Web_Services |
| gptkbp:predecessor | gptkb:AWS_Inferentia |
| gptkbp:regionAvailability | multiple AWS regions |
| gptkbp:supports | large language models, BF16, FP16, INT8, FP32, transformer models, dynamic tensor shapes |
| gptkbp:targetUser | enterprises, machine learning developers |
| gptkbp:targetWorkload | deep learning inference |
| gptkbp:usedIn | Amazon EC2 Inf2 instances |
| gptkbp:bfsParent | gptkb:AWS_Inferentia |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | AWS Inferentia2 |
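The gptkbp:compatibleWith and gptkbp:usedIn statements above link Inferentia2 to the AWS Neuron SDK and Amazon EC2 Inf2 instances. As a rough illustration only, the sketch below shows how a PyTorch model might be compiled for Inferentia2 with the torch-neuronx package; the specific model, input shape, and file name are placeholder assumptions, not part of the statements above.

```python
# Minimal sketch: compiling a PyTorch model for AWS Inferentia2 with torch-neuronx.
# Assumes an Amazon EC2 Inf2 instance with the AWS Neuron SDK installed;
# the model (torchvision ResNet-50) and input shape are illustrative placeholders.
import torch
import torch_neuronx
from torchvision import models

model = models.resnet50(weights=None).eval()   # placeholder model
example_input = torch.rand(1, 3, 224, 224)     # placeholder input shape

# Trace/compile the model for the NeuronCore accelerators on Inferentia2.
neuron_model = torch_neuronx.trace(model, example_input)

# Save the compiled artifact and reload it like a regular TorchScript module.
neuron_model.save("resnet50_neuron.pt")
loaded = torch.jit.load("resnet50_neuron.pt")
output = loaded(example_input)
print(output.shape)
```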