AWS Inferentia2

URI: https://gptkb.org/entity/AWS_Inferentia2

GPTKB entity

Predicate	Object
gptkbp:instanceOf	gptkb:machine_learning_accelerator
gptkbp:announced	2022
gptkbp:architecture	custom silicon
gptkbp:compatibleWith	gptkb:AWS_Neuron_SDK
gptkbp:designedFor	machine learning inference
gptkbp:feature	energy efficiency, low latency, high throughput, on-chip memory, NeuronCore, NeuronLink, high-speed chip interconnect, hardware-accelerated matrix multiplication, hardware-accelerated stochastic rounding, hardware-accelerated transposition
gptkbp:manufacturer	gptkb:Amazon_Web_Services
gptkbp:predecessor	gptkb:AWS_Inferentia
gptkbp:regionAvailability	multiple AWS regions
gptkbp:supports	large language models, BF16, FP16, INT8, FP32, transformer models, dynamic tensor shapes
gptkbp:targetUser	enterprises, machine learning developers
gptkbp:targetWorkload	deep learning inference
gptkbp:usedIn	Amazon EC2 Inf2 instances
gptkbp:bfsParent	gptkb:AWS_Inferentia
gptkbp:bfsLayer	7
http://www.w3.org/2000/01/rdf-schema#label	AWS Inferentia2
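
Illustrative note on the "hardware-accelerated stochastic rounding" feature listed above. The record gives only the feature name, so the following is a minimal software sketch of the general technique, not a description of how the chip implements it. It assumes FP32 values are being narrowed to BF16 (both appear under gptkbp:supports, but the pairing is an assumption), and the function names `f32_bits`, `bits_f32`, and `stochastic_round_bf16` are hypothetical. The idea is that a value is rounded up or down at random, with probability proportional to its distance from each neighbor, so the rounding error averages to zero over many operations.

```python
import random
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def stochastic_round_bf16(x: float, rng: random.Random) -> float:
    """Round an FP32 value to a BF16-representable value stochastically.

    BF16 keeps the top 16 bits of the FP32 pattern. Adding a uniform random
    16-bit integer before truncating makes the magnitude round up with
    probability equal to the discarded fraction, so the result is unbiased.
    """
    bits = f32_bits(x)
    if (bits & 0x7F800000) == 0x7F800000:
        return x  # Inf and NaN are passed through unchanged
    bits += rng.getrandbits(16)        # random offset in [0, 65535]
    return bits_f32(bits & 0xFFFF0000)  # drop the low 16 bits


if __name__ == "__main__":
    rng = random.Random(0)
    x = 1.0 + 2.0 ** -9  # a quarter of the way from 1.0 to the next BF16 value
    samples = [stochastic_round_bf16(x, rng) for _ in range(20000)]
    print(sorted(set(samples)))
    print(sum(samples) / len(samples))
```

Each individual result is one of the two neighboring BF16 values, while the average over many samples approaches the original FP32 value. Round-to-nearest would instead map every sample to the same neighbor and carry a fixed bias.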