How to host a Plotly Dash app on AWS ECS


Creating the infrastructure using AWS Cloud Development Kit (CDK) in Python

Published Jan 17, 2024
This article presents the steps to host a Plotly Dash app on AWS Elastic Container Service (ECS) with AWS Fargate, the serverless container engine from AWS. The steps include creating the Docker image of the application, since AWS ECS runs applications as containers.
The required infrastructure to host the Plotly Dash application will be built using the AWS Cloud Development Kit (CDK) tool, in Python.
The Plotly Dash app presented here is a very simple dashboard, based on this tutorial. The focus is on how to create the infrastructure required to host such an application on AWS.
In this way, this article can be a good starting point for building a data analytics application on AWS.

The architecture:

The idea is to use AWS ECS with AWS Fargate to host the Plotly Dash app with at least two instances. The data source of the dashboard will be stored in an S3 bucket as a CSV file.
For this case, the AWS Application Load Balancer is a good choice, but it must be configured properly so that the dashboard is correctly presented to the user in the web interface.
The following picture is a simplified diagram of the proposed solution:
[Diagram: Plotly Dash app hosted on AWS ECS]
The Docker image of the Plotly Dash application will be hosted by a private repository, created on the AWS ECR service.

Creating the CDK project:

To create the proposed infrastructure, the AWS Cloud Development Kit will be used.
To start, just create a new CDK project with the following command:
cdk init app --language python
This command will create the project using the Python language.
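After initialization, the project's virtual environment is typically activated and the dependencies installed before synthesizing. A sketch of the usual next steps (the bootstrap step is only needed once per account and region):

```shell
# Activate the virtual environment created by `cdk init` and install dependencies.
source .venv/bin/activate
pip install -r requirements.txt

# One-time per account/region: provision the resources CDK needs to deploy.
cdk bootstrap

# Verify the app synthesizes to CloudFormation templates.
cdk synth
```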
The next sections describe the required parts to be created in the CDK project, as stacks.

AWS ECR repository:

The ECR repository needs to be created to host the Plotly Dash Docker image that will be generated later in this article.
The following code creates a CloudFormation stack with an ECR repository. This repository will be referenced by the ECS task definition to define the location of the Plotly Dash Docker image.
from aws_cdk import (
    Stack,
    aws_ecr as _ecr,
    RemovalPolicy,
)
from constructs import Construct


class EcrStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        plotly_dash_repository = _ecr.Repository(self, "PlotlyDash",
            repository_name="plotly-dash",
            removal_policy=RemovalPolicy.DESTROY,
            image_tag_mutability=_ecr.TagMutability.IMMUTABLE,
        )
        self._output_props = {
            'plotly_dash_repository': plotly_dash_repository
        }

    @property
    def outputs(self):
        return self._output_props

AWS VPC:

As explained in the project architecture, the Plotly Dash application will be hosted in an AWS ECS service, so an AWS VPC is required for this infrastructure.
The following code creates a new VPC with the default configuration, which includes two availability zones, each with a NAT Gateway.
from aws_cdk import (
    Stack,
    aws_ec2 as _ec2
)
from constructs import Construct


class VpcStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        vpc = _ec2.Vpc(self, "VpcPlotlyDash",
            vpc_name="VpcPlotlyDash",
            max_azs=2
        )

        self._output_props = {
            'vpc': vpc
        }

    @property
    def outputs(self):
        return self._output_props

AWS ECS Cluster:

As this project uses AWS Fargate, creating the AWS ECS cluster is very simple, as described in the following code:
from aws_cdk import (
    Stack,
    aws_ecs as _ecs
)
from constructs import Construct


class ClusterStack(Stack):
    def __init__(self, scope: Construct, id: str, props, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        cluster = _ecs.Cluster(self, "ClusterPlotlyDash",
            cluster_name="ClusterPlotlyDash",
            container_insights=True,
            vpc=props['vpc']
        )

        self._output_props = {
            'cluster': cluster
        }

    @property
    def outputs(self):
        return self._output_props

AWS Application Load Balancer:

One of the key AWS resources for hosting this application is the Application Load Balancer, but for now, only its main part needs to be created.
from aws_cdk import (
    Stack,
    aws_elasticloadbalancingv2 as _elbv2
)
from constructs import Construct


class AlbStack(Stack):
    def __init__(self, scope: Construct, id: str, props, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        alb = _elbv2.ApplicationLoadBalancer(self, 'Alb',
            vpc=props['vpc'],
            internet_facing=True,
            load_balancer_name='AlbPlotlyDash',
        )

        self._output_props = {
            'alb': alb
        }

    @property
    def outputs(self):
        return self._output_props
The AWS ECS Service section of this article will create and configure the remaining pieces of the Application Load Balancer.

AWS S3 Bucket:

The AWS S3 bucket is a good option to store the dashboard data source of this project, as a CSV file.
The following code creates a new CloudFormation stack with a new S3 bucket.
import aws_cdk as _cdk

from aws_cdk import (
    Stack,
    aws_s3 as _s3,
)
from constructs import Construct


class S3Stack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        bucket = _s3.Bucket(self, "PlotlyDashBucket",
            removal_policy=_cdk.RemovalPolicy.DESTROY)

        self._output_props = {
            'bucket': bucket
        }

    @property
    def outputs(self):
        return self._output_props

AWS ECS Service:

Now it's time to create the AWS ECS service to effectively host the Plotly Dash application. This is the most complex stack in this project.
The following code has the whole implementation; each part is explained below.
import aws_cdk as _cdk

from aws_cdk import (
    Stack,
    aws_ecs as _ecs,
    aws_elasticloadbalancingv2 as _elbv2,
    aws_ec2 as _ec2,
    aws_logs as _logs,
)
from constructs import Construct


class DashAppService(Stack):
    def __init__(self, scope: Construct, id: str, props, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        task_definition = _ecs.FargateTaskDefinition(self, "TaskDefinition",
            cpu=512,
            memory_limit_mib=4096,
            family="plotly_dash_app"
        )
        props['bucket'].grant_read(task_definition.task_role)

        task_definition.add_container("PlotlyDashContainer",
            image=_ecs.ContainerImage.from_ecr_repository(props['plotly_dash_repository'],
                                                          "1.0.0"),
            container_name="plotlyDashApp",
            logging=_ecs.LogDrivers.aws_logs(
                stream_prefix="PlotlyDash",
                log_retention=_logs.RetentionDays.ONE_MONTH,
            ),
            port_mappings=[_ecs.PortMapping(container_port=8050,
                                            protocol=_ecs.Protocol.TCP)],
            environment={
                "BUCKET_NAME": props['bucket'].bucket_name
            }
        )

        service = _ecs.FargateService(self, "PlotlyDashService",
            service_name="PlotlyDashService",
            task_definition=task_definition,
            cluster=props['cluster'],
            desired_count=2,
            assign_public_ip=False,
        )
        # Image pulls happen with the task execution role, not the task role.
        props['plotly_dash_repository'].grant_pull(task_definition.obtain_execution_role())
        service.connections.security_groups[0].add_ingress_rule(
            _ec2.Peer.ipv4(props['vpc'].vpc_cidr_block), _ec2.Port.tcp(8050))

        alb_listener = props['alb'].add_listener("PlotlyDashListener",
            port=8050,
            protocol=_elbv2.ApplicationProtocol.HTTP,
            open=True,
        )
        alb_listener.add_targets("PlotlyDashServiceAlbTarget",
            target_group_name="PlotlyDashServiceAlbTarget",
            port=8050,
            protocol=_elbv2.ApplicationProtocol.HTTP,
            stickiness_cookie_duration=_cdk.Duration.minutes(10),
            targets=[service],
            deregistration_delay=_cdk.Duration.seconds(30),
            health_check={
                "path": "/health",
                "interval": _cdk.Duration.seconds(30),
                "timeout": _cdk.Duration.seconds(10),
                "port": "8050",
                "enabled": True,
            }
        )
First, the AWS ECS task definition is created, defining the amount of CPU and memory available to the application. This task definition has a task role, which needs permission to read the S3 bucket where the CSV file used by the Plotly Dash application is stored.
Then, the application container is added to this task definition. Note that the created ECR repository is the source of the Docker image. The port mappings attribute has the network configuration required by the Plotly Dash application. The environment attribute carries the bucket name, which will be used by the AWS SDK inside the Plotly Dash application.
Next, the Fargate service is created using that task definition. Note that this service will run two instances of the application, without public IP addresses.
The TCP port used by the application is added to the security group configuration, so that traffic can reach the application.
Now, a listener is added to the Application Load Balancer, which will expose the application.
Finally, a target group is added to the Application Load Balancer, letting the ALB forward the incoming traffic to the application port.
Besides the network configuration using the same TCP port, there is the stickiness configuration. It makes the ALB keep a client's session pinned to one instance, which is required by a Plotly Dash application: since there are two instances, once a connection is established with one of them, subsequent requests must keep going to the same instance.
Another important configuration here is the health check. This mechanism lets the target group of the ALB keep checking the health of each instance, helping to guarantee the availability of the application.
The HTTP endpoint configured in the health check mechanism needs to be created in the Plotly Dash app, as will be explained later.

Organizing the stacks:

These stacks need to be instantiated in the main file of the CDK project, as described in the following code.
#!/usr/bin/env python3
import os

import aws_cdk as cdk

from plotly_dash_ecs.alb_stack import AlbStack
from plotly_dash_ecs.cluster_stack import ClusterStack
from plotly_dash_ecs.dash_app_service import DashAppService
from plotly_dash_ecs.ecr_stack import EcrStack
from plotly_dash_ecs.s3_stack import S3Stack
from plotly_dash_ecs.vpc_stack import VpcStack

app = cdk.App()
environment = cdk.Environment(account='<your AWS account ID>', region='<the desired AWS region>')

tagsInfra = {
    "cost": "PlotlyDashInfra",
    "team": "SiecolaCode"
}

tagsPlotlyDashApp = {
    "cost": "PlotlyDashApp",
    "team": "SiecolaCode"
}

ecr_stack = EcrStack(app, "Ecr",
    env=environment,
    tags=tagsInfra
)

vpc_stack = VpcStack(app, "Vpc",
    env=environment,
    tags=tagsInfra
)

cluster_stack = ClusterStack(app, "Cluster",
    env=environment,
    tags=tagsInfra,
    props=vpc_stack.outputs
)
cluster_stack.add_dependency(vpc_stack)

alb_stack = AlbStack(app, "ALB",
    env=environment,
    tags=tagsInfra,
    props=vpc_stack.outputs
)
alb_stack.add_dependency(vpc_stack)

s3_stack = S3Stack(app, "Bucket",
    env=environment,
    tags=tagsInfra
)

dash_app_service_props = {}
dash_app_service_props.update(ecr_stack.outputs)
dash_app_service_props.update(vpc_stack.outputs)
dash_app_service_props.update(cluster_stack.outputs)
dash_app_service_props.update(alb_stack.outputs)
dash_app_service_props.update(s3_stack.outputs)
dash_app_service_stack = DashAppService(app, "DashAppService",
    env=environment,
    tags=tagsPlotlyDashApp,
    props=dash_app_service_props)
dash_app_service_stack.add_dependency(ecr_stack)
dash_app_service_stack.add_dependency(vpc_stack)
dash_app_service_stack.add_dependency(cluster_stack)
dash_app_service_stack.add_dependency(alb_stack)

app.synth()
Note that there are dependencies between some of the stacks, which is normal in a CDK project.

Creating the Plotly Dash app:

The Plotly Dash app presented here is a very simple dashboard, based on this tutorial.
The next sections will explain some important details, especially to host it on AWS.

The requirements.txt file:

These are the dependencies to be added to the project. The boto3 package contains the AWS SDK, used to access the AWS S3 bucket.
dash
pandas
numpy
plotly-express
flask
flask-restful
boto3

The Dockerfile:

This is a very simple Dockerfile to create the Docker image of this application.
FROM python:3.9
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
COPY . ./

EXPOSE 8050
CMD ["python", "app.py"]

The application code:

This is the application code to expose a very simple dashboard using the Plotly express library.
import io
import os

from dash import Dash, html, dcc, callback, Output, Input
import plotly.express as px
import pandas as pd
from flask import Flask
from flask_restful import Resource, Api
import boto3


class HealthCheck(Resource):
    def get(self):
        return {'up': 'OK'}


server = Flask('plotly_dash')
app = Dash(server=server)
api = Api(server)
api.add_resource(HealthCheck, '/health')

BUCKET_NAME = os.getenv("BUCKET_NAME")

s3_client = boto3.client('s3')
object_s3 = s3_client.get_object(Bucket=BUCKET_NAME, Key="gapminder_unfiltered.csv")
object_csv = object_s3['Body'].read().decode('utf-8')

csv_file = io.StringIO(object_csv)

df = pd.read_csv(csv_file)

app.layout = html.Div([
    html.H1(children='Title of Dash App', style={'textAlign': 'center'}),
    dcc.Dropdown(df.country.unique(), 'Canada', id='dropdown-selection'),
    dcc.Graph(id='graph-content')
])


@callback(
    Output('graph-content', 'figure'),
    Input('dropdown-selection', 'value')
)
def update_graph(value):
    dff = df[df.country == value]
    return px.line(dff, x='year', y='pop')


if __name__ == '__main__':
    app.run_server(debug=True, host='0.0.0.0', port=8050)
The HealthCheck class creates an endpoint to be used by the health check mechanism, as explained in the AWS ECS Service section of this article.
The Boto3 S3 client is initialized to fetch the CSV file, which will be used as the dashboard data source.
This CSV file is read using the Pandas library to create a data frame.
Finally, this data frame will compose the dashboard of this application.
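The read-and-parse step can be sketched in isolation, replacing the S3 object body with an in-memory CSV string (the sample rows below are illustrative, not the real gapminder data):

```python
import io

import pandas as pd

# Stand-in for object_s3['Body'].read().decode('utf-8').
object_csv = "country,year,pop\nCanada,2020,38010000\nCanada,2021,38250000\n"

# Wrap the decoded string in a file-like object so pandas can parse it.
df = pd.read_csv(io.StringIO(object_csv))

# Same filtering the callback applies for the selected country.
dff = df[df.country == "Canada"]
print(len(dff))  # → 2
```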

Deploying the application:

To deploy this application, first the ECR repository must be deployed using the following command:
cdk deploy Ecr --require-approval never
Then, the Docker image of the application must be generated and pushed to the created ECR repository.
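The build-and-push step can be done with the Docker CLI and the AWS CLI. A sketch, assuming the `plotly-dash` repository created earlier, the `1.0.0` tag referenced by the ECS task definition, and placeholder account ID and region:

```shell
# Authenticate Docker against the private ECR registry.
aws ecr get-login-password --region <region> | \
  docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

# Build the image from the application's Dockerfile.
docker build -t plotly-dash .

# Tag and push it with the tag expected by the task definition.
docker tag plotly-dash:latest <account-id>.dkr.ecr.<region>.amazonaws.com/plotly-dash:1.0.0
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/plotly-dash:1.0.0
```

Since the repository was created with immutable tags, pushing a new image version requires a new tag (and an update to the task definition).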
Next, deploy the S3 bucket, with the following command:
cdk deploy Bucket --require-approval never
Now, upload this CSV file to the bucket, using the AWS S3 console.
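Alternatively to the console, the AWS CLI can upload the file (bucket name as output by the Bucket stack; the object key must match the one the application reads):

```shell
aws s3 cp gapminder_unfiltered.csv s3://<bucket-name>/gapminder_unfiltered.csv
```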
Finally, the whole infrastructure can be created with the following command:
cdk deploy --all --require-approval never

Accessing the dashboard hosted on AWS:

After the deployment process, the ALB endpoint will be output by the CDK process. Use this endpoint with port 8050 to access the application hosted on AWS.
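A quick sanity check is to hit the health endpoint first, then open the dashboard in a browser (DNS name as printed by the ALB stack output):

```shell
# Health check endpoint used by the target group; should return {"up": "OK"}.
curl http://<ALB DNS name>:8050/health

# The dashboard itself is served at the root path:
# open http://<ALB DNS name>:8050/ in a browser.
```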

Conclusion:

This article explained the steps to host a very basic dashboard application built with Plotly Dash on AWS.
The AWS CDK helps a lot when building the required infrastructure, which is somewhat complex.
 
