Repository with code for running deep learning inference benchmarks across different AWS instance and service types.
This example demonstrates how to deploy a deep learning model for image inference using ONNX on Amazon ECS/Fargate with AWS Copilot. This project provides an easy-to-follow example and a scalable solution for serving deep learning models in the cloud.
Prerequisites:

- Python 3.6 or later
- Docker
- AWS CLI
- AWS Copilot
Clone the repository and change into the example directory:

```bash
git clone https://github.com/ryfeus/aws-inference-benchmark.git
cd aws-inference-benchmark/copilot/cpu/aws-copilot-inference-service
```
Initialize the environment and deploy the application:

```bash
copilot env init
copilot deploy
```
Make a single prediction:

```bash
curl -X POST -H "Content-Type: image/jpeg" --data-binary "@flower.png" http://<prefix>.us-east-1.elb.amazonaws.com/predict
```
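For orientation, the `/predict` endpoint accepts raw image bytes in the request body and runs them through the ONNX model. A minimal sketch of what such a handler can look like, assuming Flask and onnxruntime; the model path, preprocessing, and response shape are illustrative, not necessarily the repository's actual code:

```python
# Minimal sketch of an image-inference endpoint (illustrative, not the repo's actual code).
import io

import numpy as np
import onnxruntime as ort
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)
# Assumes an ONNX image classifier packaged with the container as model.onnx.
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name

@app.route("/predict", methods=["POST"])
def predict():
    # Decode the raw image bytes from the request body.
    image = Image.open(io.BytesIO(request.get_data())).convert("RGB")
    # Example preprocessing: resize, scale to [0, 1], convert to NCHW layout.
    array = np.asarray(image.resize((224, 224)), dtype=np.float32) / 255.0
    array = array.transpose(2, 0, 1)[np.newaxis, :]
    logits = session.run(None, {input_name: array})[0]
    return jsonify({"class_id": int(np.argmax(logits))})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```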
Benchmark with ApacheBench (10 requests, 10 concurrent):

```bash
ab -n 10 -c 10 -p flower.png -T image/jpeg http://<prefix>.us-east-1.elb.amazonaws.com/predict
```
Build and test the container locally:

```bash
docker build -t image-inference .
docker run --rm -p 8080:8080 image-inference
curl -X POST -H "Content-Type: image/jpeg" --data-binary "@flower.png" http://localhost:8080/predict
```
Install the development dependencies and run the tests:

```bash
pip install -r dev-requirements.txt
pytest -v test_inference.py
```
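`test_inference.py` is the repository's own test file; as a rough, hypothetical sketch of what such an endpoint test can look like (assuming the `requests` package, the service running locally on port 8080, and `flower.png` in the working directory):

```python
# Hypothetical endpoint test (not the repository's actual test_inference.py).
import requests

def test_predict_returns_ok():
    with open("flower.png", "rb") as f:
        resp = requests.post(
            "http://localhost:8080/predict",
            data=f.read(),
            headers={"Content-Type": "image/jpeg"},
        )
    assert resp.status_code == 200
```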
This example demonstrates how to deploy a large language model for text generation using the Hugging Face Transformers library on Amazon ECS/Fargate with AWS Copilot.
Clone the repository and change into the example directory:

```bash
git clone https://github.com/ryfeus/aws-inference-benchmark.git
cd aws-inference-benchmark/copilot/transformers/aws-copilot-inference-service
```
Clone the model from its Hugging Face repository (for example, LaMini-T5-223M):

```bash
git lfs install
git clone https://huggingface.co/MBZUAI/LaMini-T5-223M.git
mv LaMini-T5-223M model
```
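Optionally, sanity-check the downloaded weights before deploying by loading them locally. A minimal sketch, assuming the `transformers` and `torch` packages are installed (LaMini-T5 is a seq2seq model, so the text2text-generation pipeline applies):

```python
# Quick local check that the cloned model loads and generates (illustrative).
from transformers import pipeline

generator = pipeline("text2text-generation", model="./model")
print(generator("Main tourist attractions in Rome?", max_length=256)[0]["generated_text"])
```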
Initialize the environment and deploy the application:

```bash
copilot env init
copilot deploy
```
Make a single prediction:

```bash
curl -X POST -H "Content-Type: application/json" -d '{"instruction":"Main tourist attractions in Rome?"}' http://<prefix>.us-east-1.elb.amazonaws.com/predict
```
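As in the ONNX example, the `/predict` endpoint wraps the model behind a small web app, reading the `instruction` field from the JSON payload. A sketch of such a handler, assuming Flask and the pipeline shown above; names and response shape are illustrative, not necessarily the repository's actual code:

```python
# Sketch of a text-generation endpoint (illustrative, not the repo's actual code).
from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)
# The model directory created by the `mv LaMini-T5-223M model` step above.
generator = pipeline("text2text-generation", model="./model")

@app.route("/predict", methods=["POST"])
def predict():
    instruction = request.get_json()["instruction"]
    result = generator(instruction, max_length=256)[0]["generated_text"]
    return jsonify({"result": result})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```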
Build and test the container locally:

```bash
docker build -t llm-inference .
docker run --rm -p 8080:8080 llm-inference
curl -X POST -H "Content-Type: application/json" -d '{"instruction":"Main tourist attractions in Rome?"}' http://localhost:8080/predict
```