Posted On: Aug 13, 2021

AWS Neuron, the SDK for running machine learning inference on AWS Inferentia-based Amazon EC2 Inf1 instances, now supports TensorFlow 2. Starting with Neuron 1.15.0, you can execute your TensorFlow 2 BERT-based models on Inf1 instances, with support for additional models coming soon. To learn more about Neuron TensorFlow 2 support, visit our TensorFlow 2 FAQ page.

We have also updated our resources with new documentation, including a tutorial that helps you get started with TensorFlow 2, a tutorial that guides you through deploying a HuggingFace BERT model container on Inferentia using Amazon SageMaker hosting, an updated inference performance page to help you compare and replicate our results, and a new application note to help you discover the types of deep learning architectures that will perform well out of the box on Inferentia.

AWS Neuron is natively integrated with popular ML frameworks such as TensorFlow, PyTorch, and Apache MXNet. It includes a deep learning compiler, runtime, and tools that help you extract the best performance for your applications. To learn more, visit the AWS Neuron page and the AWS Neuron documentation.

Amazon EC2 Inf1 instances deliver the lowest cost for deep learning inference in the cloud and are available in 23 regions, including US East (N. Virginia, Ohio), US West (Oregon, N. California), AWS GovCloud (US-East, US-West), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo), Middle East (Bahrain), South America (São Paulo), and China (Beijing, Ningxia). You can leverage Amazon EC2 Inf1 instances in the region that best meets your real-time latency requirements for machine learning inference. To learn more, visit the Amazon EC2 Inf1 instance page.