Collecting data from edge devices using Kubernetes and AWS IoT Greengrass V2

Kubernetes is open-source software that allows you to deploy and manage containerized applications at scale. It manages clusters of Amazon Elastic Compute Cloud (Amazon EC2) compute instances and runs containers on those instances with processes for deployment, maintenance, and scaling. Using Kubernetes, you can run any type of containerized application using the same toolset on premises and in the cloud. Increasingly, companies also want to orchestrate containers with Kubernetes at the edge on resource-constrained IoT devices. For platform teams, it is important to use the same automation tools that they already use in their existing environments. With this approach, developers can deploy their containerized applications, whether in the cloud, on premises, or at the edge, using the exact same APIs, tools, and capabilities.

One possible application for this implementation is connected vehicles. AWS has implemented a connected vehicle solution that uses AWS IoT Greengrass Core and can be combined with the solution outlined in this blog post. The connected vehicle solution provides secure vehicle connectivity to the AWS Cloud and includes capabilities for local computing within vehicles, sophisticated event rules, and data processing and storage. The solution features fast and robust data ingestion; highly reliable and durable storage of vehicle telemetry data; simple, scalable big data services for analyzing the data; and global messaging and application services to connect with consumers.

This post shows how to set up a Kubernetes cluster using k3s within an edge node (in our case, a Raspberry Pi 4), install the AWS Systems Manager Agent (SSM Agent), and deploy AWS IoT Greengrass V2 using standard Kubernetes tools like kubectl.

A Raspberry Pi 4

Overview of solution

K3s is a lightweight, certified Kubernetes distribution by Rancher designed for resource-constrained Internet of Things (IoT) devices. It is packaged as a single <40 MB binary that reduces the dependencies and steps needed to install and maintain the Kubernetes cluster. You can also import a k3s cluster into Rancher, as indicated in the picture below. This AWS Quick Start shows how to set up Rancher on AWS.

Schematic architecture of the implementation

AWS IoT Greengrass is an open-source IoT edge runtime and cloud service that helps you build, deploy, and manage device software. Customers use AWS IoT Greengrass for their IoT applications on millions of devices in homes, factories, vehicles, and businesses.

AWS Systems Manager (formerly known as SSM) is an AWS service that you can use to view and control your infrastructure on AWS. Using the Systems Manager console, you can view operational data from multiple AWS services and automate operational tasks across your AWS resources. Systems Manager helps you maintain security and compliance by scanning your managed instances and reporting on (or taking corrective action on) any policy violations it detects.

A managed instance is a machine configured for use with Systems Manager. Systems Manager also helps you configure and maintain your managed instances. Supported machine types include Amazon EC2 instances, on-premises servers, and virtual machines (VMs), including VMs in other cloud environments. Supported operating system types include Windows Server, macOS, Raspbian, and multiple distributions of Linux.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • An AWS account
  • A properly installed and configured AWS CDK
  • An environment to deploy the AWS CDK application
  • A Raspberry Pi 4 with a correctly configured AWS CLI

Create the infrastructure

To create the necessary infrastructure, you need to complete several steps, both in AWS and on the Raspberry Pi.

Step 1: Set up the AWS CLI on Raspberry Pi

First, set up and configure the AWS CLI on your Raspberry Pi as described in the documentation.

Step 2: Use AWS Cloud9 to set up the necessary infrastructure in AWS

After successfully setting up the Raspberry Pi, you need to clone the demo application code in order to create all necessary components in the AWS Cloud using the CDK application. AWS Cloud9 makes the setup easy. AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code with just a browser. It comes with the AWS tools, Git, and Docker installed.

Once the repo is cloned, you need to execute the following commands to start the CDK application (follow this documentation for the necessary steps to deploy the CDK).

cd cdk_app/k3s-bootstrap
npm install
cdk deploy 

The application performs the following steps:

  1. Create an Amazon Simple Storage Service (Amazon S3) bucket to store the Kubernetes configuration, which is generated automatically during k3s cluster bootstrapping.
  2. Store the name of the S3 bucket in AWS Systems Manager Parameter Store under /k3s/kubernetes/s3-bucket.
  3. Create an IAM role to set up AWS Systems Manager for hybrid environments.
  4. Perform a hybrid activation and store the activation ID and activation code in AWS Secrets Manager under the secret name k3s-activation-secret.

Note that this is a proof-of-concept implementation that covers the activation of only one device. To activate multiple devices, an identifier such as the device ID has to be introduced so that each device gets its own hybrid activation.
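One way to introduce such an identifier is to derive the secret name and the Parameter Store paths from the device ID. The following Python sketch is only an illustration: the naming scheme, the function names, and the restriction on device IDs are assumptions and are not part of the sample repository.

```python
import re

# Name used by the CDK application in this post for the single activation.
BASE_SECRET_NAME = "k3s-activation-secret"

# Assumption: device IDs are limited to characters that are valid both in
# Secrets Manager secret names and in Parameter Store parameter names.
_VALID_DEVICE_ID = re.compile(r"^[A-Za-z0-9_.-]+$")


def _check(device_id):
    if not _VALID_DEVICE_ID.match(device_id):
        raise ValueError(f"unsupported characters in device ID: {device_id!r}")


def activation_secret_name(device_id):
    """Hypothetical per-device secret name, e.g. 'k3s-activation-secret/pi-0001'."""
    _check(device_id)
    return f"{BASE_SECRET_NAME}/{device_id}"


def ip_parameter_path(device_id):
    """Hypothetical per-device variant of the /k3s/kubernetes/ip parameter."""
    _check(device_id)
    return f"/k3s/{device_id}/kubernetes/ip"
```

With a scheme like this, the CDK application would create one activation and one secret per device ID, and each device would read only its own secret during bootstrapping.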

After the CDK application has run successfully and the necessary infrastructure is up and running, you have to set up the Raspberry Pi using bash scripts from the Git repository. Under aws-kubernetes-edge-greengrass/device/bootstrapping you can find two scripts: bootstrap.sh and install_k3s.sh. The bootstrap.sh script changes the cgroup settings so that Kubernetes can run on the device, and installs and configures the SSM Agent, using the following steps:

  1. Update Linux on Raspberry Pi.
  2. Install Python 3, Docker, and jq.
  3. Add cgroup-flags to the last line of /boot/cmdline.txt.
  4. Read the SSM Activation Code and SSM Activation Id from AWS Secrets Manager.
  5. Download and configure the SSM Agent using the activation data.
  6. Reboot the device so that the cgroups changes can take effect.
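Step 3 is the one that most often goes wrong when done by hand, because /boot/cmdline.txt must stay a single line. The sketch below shows the idea in Python. It assumes the flags cgroup_memory=1 and cgroup_enable=memory, which the k3s documentation recommends for Raspberry Pi OS; bootstrap.sh may set a different list, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: flags recommended by the k3s documentation for Raspberry Pi OS.
CGROUP_FLAGS = ("cgroup_memory=1", "cgroup_enable=memory")


def add_cgroup_flags(cmdline, flags=CGROUP_FLAGS):
    """Append missing flags to the last non-empty line of a cmdline.txt string."""
    lines = cmdline.splitlines()
    # Locate the last non-empty line; the kernel command line lives there.
    idx = next((i for i in range(len(lines) - 1, -1, -1) if lines[i].strip()), None)
    if idx is None:
        return " ".join(flags) + "\n"
    present = lines[idx].split()
    missing = [flag for flag in flags if flag not in present]
    if missing:
        lines[idx] = " ".join(present + missing)
    return "\n".join(lines) + "\n"


def update_cmdline_file(path="/boot/cmdline.txt"):
    """Rewrite the file in place; needs root privileges on the device."""
    target = Path(path)
    target.write_text(add_cgroup_flags(target.read_text()))
```

Because flags that are already present are skipped, running the function a second time leaves the file unchanged, which makes the bootstrap step safe to repeat.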

You can run the script using the following commands:

cd aws-kubernetes-edge-greengrass/device/bootstrapping
chmod a+x *.sh
./bootstrap.sh

Raspberry Pi in Fleet Manager after activation

After rebooting the Raspberry Pi, you can execute install_k3s.sh to install k3s. This script installs k3s using the following steps:

  1. If the k3s process isn’t running, k3s is installed.
  2. The IP address of the Raspberry Pi is determined and stored in Parameter Store under /k3s/kubernetes/ip.
  3. In the Kubernetes configuration, the local address is replaced with the external IP address of the device, and the resulting file is uploaded to the S3 bucket created by the AWS CDK application.
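The address replacement in step 3 matters because k3s writes a kubeconfig whose server entry points at the loopback address (https://127.0.0.1:6443), which is useless from any other machine. The following Python sketch shows the substitution on the file contents. It is not the code from install_k3s.sh, and the function name is hypothetical.

```python
import re

# Matches the host part of a 'server: https://127.0.0.1:6443' style entry.
_LOOPBACK_SERVER = re.compile(
    r"^([ \t]*server:[ \t]*https://)(?:127\.0\.0\.1|localhost)(?=[:/\s]|$)",
    re.MULTILINE,
)


def point_kubeconfig_at(kubeconfig, external_ip):
    """Return kubeconfig text with loopback server hosts replaced by external_ip."""
    # A function is used as the replacement so that digits in the IP address
    # are never misread as regex group references.
    return _LOOPBACK_SERVER.sub(lambda m: m.group(1) + external_ip, kubeconfig)
```

The port and every other field are left untouched, so the rewritten file can be uploaded as is and used with kubectl from a workstation that can reach the device.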

You can run the script using the following commands:

cd aws-kubernetes-edge-greengrass/device/bootstrapping
chmod a+x *.sh
./install_k3s.sh

Now the k3s cluster is up and running. After downloading the Kubernetes configuration from the S3 bucket created by the CDK application, you can deploy AWS IoT Greengrass to Kubernetes with aws-kubernetes-edge-greengrass/deployment/greengrass-v2-deployment.yaml using the following commands:

export KUBECONFIG=~/<download-folder>/kubeconfig-k3s
kubectl apply -f aws-kubernetes-edge-greengrass/deployment/greengrass-v2-deployment.yaml

The export statement points kubectl at the Kubernetes configuration file that was created during k3s bootstrapping and uploaded to the S3 bucket. Besides the deployment file, an additional template file called greengrass-v2-deployment-template.yaml can be used for a custom configuration. The deployment is configured to provision an AWS IoT thing, an AWS IoT thing group, an IAM role, and an AWS IoT role alias. All logs are written to /home/pi/greengrass/v2/logs, because this path is mounted into the container as /greengrass/v2/logs.
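How the template file is filled in is not described here. As a rough sketch, assuming the template uses ${NAME}-style placeholders (the real greengrass-v2-deployment-template.yaml may use a different syntax), a custom deployment file could be rendered with Python's standard library. The excerpt and the function name below are hypothetical.

```python
from string import Template

# Hypothetical excerpt; the placeholder syntax of the real template is an assumption.
TEMPLATE_EXCERPT = """\
        - name: AWS_REGION
          value: "${AWS_REGION}"
        - name: THING_NAME
          value: "${THING_NAME}"
        - name: THING_GROUP_NAME
          value: "${THING_GROUP_NAME}"
"""


def render_deployment(template_text, values):
    """Fill every placeholder; raises KeyError if a value is missing."""
    # substitute() is strict, so a forgotten value fails here
    # instead of producing a broken manifest for kubectl apply.
    return Template(template_text).substitute(values)
```

Calling render_deployment(TEMPLATE_EXCERPT, {"AWS_REGION": "eu-west-1", "THING_NAME": "k3s_gg_core", "THING_GROUP_NAME": "k3s_gg_core_group"}) yields the same values as the sample deployment in this post.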

Because the Raspberry Pi uses an ARM CPU rather than an x86 CPU, we also need to build an ARM-based container image. Below is a Dockerfile that uses arm32v7/openjdk:11-slim as the base image. We are using the OpenJDK base image here because AWS IoT Greengrass version 2 is based on Java. During the build, the AWS IoT Greengrass nucleus is downloaded and unzipped. Several environment variables are set that can be overridden during deployment. After creating a user and group to run the Greengrass process, a bash script (greengrass-entrypoint.sh) that bootstraps the Java process is started.

FROM arm32v7/openjdk:11-slim

RUN apt-get update \
&& apt-get -y upgrade

ENV GREENGRASS_RELEASE_VERSION=2.1.0
ENV GREENGRASS_ZIP_FILE=greengrass-${GREENGRASS_RELEASE_VERSION}.zip
ENV GREENGRASS_RELEASE_URI=https://d2s8p88vqu9w66.cloudfront.net/releases/${GREENGRASS_ZIP_FILE}
ENV GREENGRASS_ZIP_SHA256=${GREENGRASS_ZIP_FILE}.sha256

RUN apt-get install -y python3-pip tar unzip wget sudo procps \
&& wget $GREENGRASS_RELEASE_URI \
&& mkdir -p /opt/greengrassv2 /greengrass/v2 && unzip $GREENGRASS_ZIP_FILE -d /opt/greengrassv2 && rm $GREENGRASS_ZIP_FILE \
&& rm -rf /var/lib/apt/lists/*

# Set up Greengrass v2 execution parameters
ENV GGC_ROOT_PATH=/greengrass/v2 \
TINI_KILL_PROCESS_GROUP=1 \
PROVISION=false \
TES_ROLE_NAME=default_tes_role_name \
TES_ROLE_ALIAS_NAME=default_tes_role_alias_name \
COMPONENT_DEFAULT_USER=default_component_user \
DEPLOY_DEV_TOOLS=false \
LOG_LEVEL=$LOG_LEVEL \
INIT_CONFIG=$INIT_CONFIG

RUN groupadd --gid 998 ggc_group && useradd --uid 999 --gid ggc_group --shell /bin/bash --create-home ggc_user

# Entrypoint script to install and run Greengrass
COPY "greengrass-entrypoint.sh" /

RUN env && chmod +x /greengrass-entrypoint.sh

CMD ["sh", "/greengrass-entrypoint.sh" ]

# Expose port to subscribe to MQTT messages, network port
EXPOSE 8883
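
The greengrass-entrypoint.sh script itself is not reproduced in this post. To illustrate what such an entrypoint has to do, the following Python sketch maps the environment variables from the Dockerfile and the deployment onto arguments of the AWS IoT Greengrass V2 installer. The function name and the exact selection of arguments are assumptions; the real script is written in bash and may differ.

```python
# Location of the installer JAR after the Dockerfile unzips the release.
GREENGRASS_JAR = "/opt/greengrassv2/lib/Greengrass.jar"

# Environment variable -> installer argument, added only when the variable is set.
OPTIONAL_ARGS = {
    "AWS_REGION": "--aws-region",
    "THING_NAME": "--thing-name",
    "THING_GROUP_NAME": "--thing-group-name",
    "TES_ROLE_NAME": "--tes-role-name",
    "TES_ROLE_ALIAS_NAME": "--tes-role-alias-name",
    "COMPONENT_DEFAULT_USER": "--component-default-user",
}


def build_installer_command(env):
    """Build the java command line for the Greengrass installer from env."""
    root = env.get("GGC_ROOT_PATH", "/greengrass/v2")
    cmd = [
        "java", f"-Droot={root}", "-Dlog.store=FILE", "-jar", GREENGRASS_JAR,
        "--provision", env.get("PROVISION", "false"),
        "--deploy-dev-tools", env.get("DEPLOY_DEV_TOOLS", "false"),
        # Assumption: no systemd inside the container, so no system service.
        "--setup-system-service", "false",
    ]
    for variable, argument in OPTIONAL_ARGS.items():
        value = env.get(variable)
        if value:
            cmd += [argument, value]
    return cmd
```

Passing os.environ to build_installer_command inside the container would pick up whatever the Kubernetes deployment overrides, which is why the Dockerfile only needs to provide defaults.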

The following code shows an example deployment of the AWS IoT Greengrass container image we’ve created using the Dockerfile above. Initially, we create a namespace called greengrass and mount two important volumes: /home/pi/.aws and /home/pi/greengrass/v2/logs. The first directory is used to read the local AWS credentials for communication with AWS IoT. To write log files directly to the local storage of the IoT device, we mount the log directory into the container. The most important environment variable that is overridden during deployment is PROVISION. If the value is set to true, AWS IoT can generate and securely deliver device certificates and private keys to your devices when they connect to AWS IoT for the first time. AWS IoT provides client certificates that are signed by the Amazon root certificate authority (CA).

apiVersion: v1
kind: Namespace
metadata:
  name: greengrass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: greengrass
  name: greengrass-deployment
spec:
  selector:
    matchLabels:
      app: greengrass
  replicas: 1
  template:
    metadata:
      labels:
        app: greengrass
    spec:
      containers:
        - name: greengrass
          image: public.ecr.aws/f3r7z4u4/ggv2-arm:1.0
          env:
            - name: LOG_LEVEL
              value: "INFO"
            - name: GGC_ROOT_PATH
              value: "/greengrass/v2"
            - name: AWS_REGION
              value: "eu-west-1"
            - name: PROVISION
              value: "true"
            - name: THING_NAME
              value: "k3s_gg_core"
            - name: THING_GROUP_NAME
              value: "k3s_gg_core_group"
            - name: TES_ROLE_NAME
              value: "k3s_TokenExchangeRole"
            - name: TES_ROLE_ALIAS_NAME
              value: "k3s_TokenExchangeRoleAlias"
            - name: COMPONENT_DEFAULT_USER
              value: "ggc_user:ggc_group"
            - name: DEPLOY_DEV_TOOLS
              value: "false"
          ports:
            - containerPort: 8883
          volumeMounts:
            - mountPath: /root/.aws
              name: credentials
            - mountPath: /greengrass/v2/logs
              name: logs
      volumes:
        - name: credentials
          hostPath:
            path: /home/pi/.aws
            type: Directory
        - name: logs
          hostPath:
            path: /home/pi/greengrass/v2/logs
            type: Directory
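
Both volumes use hostPath with type Directory, which means Kubernetes expects the directories to exist on the node already; if one is missing, the pod does not start. A small pre-flight check can catch this before running kubectl apply. The sketch below is an assumption about how such a check could look and is not part of the sample repository.

```python
from pathlib import Path

# Host paths referenced by the hostPath volumes in the deployment.
HOST_DIRS = ("/home/pi/.aws", "/home/pi/greengrass/v2/logs")


def missing_host_dirs(paths=HOST_DIRS):
    """Return the paths that are not existing directories."""
    return [path for path in paths if not Path(path).is_dir()]


def ensure_dir(path):
    """Create a directory and its parents; does nothing if it already exists."""
    Path(path).mkdir(parents=True, exist_ok=True)
```

On the device, the log directory can simply be created with ensure_dir("/home/pi/greengrass/v2/logs"). The credentials directory should instead come from configuring the AWS CLI in Step 1, because an empty /home/pi/.aws would let the pod start without usable credentials.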

Cleaning up

To avoid incurring future charges, delete the resources using the following command:

cdk destroy

Conclusion

In this post, we’ve described how to set up an edge device such as a Raspberry Pi 4 to run k3s and deploy AWS IoT Greengrass V2 into Kubernetes. The device is managed by AWS Systems Manager using a hybrid activation and is visible in Fleet Manager. This setup can be combined with other AWS solutions to collect and analyze metrics data.

We hope we’ve given you some ideas on how you can run your workloads, including AWS IoT Greengrass at the edge, in a standardized way using Kubernetes APIs. Feel free to submit enhancements to the sample application in the source repository.

Sascha Moellering

Sascha Möllering has been working for more than six years as a Solutions Architect and Solutions Architect Manager at Amazon Web Services EMEA in the German branch. He shares his expertise with a focus on Automation, Infrastructure as Code, Distributed Computing, Containers, and JVM in regular contributions to various IT magazines and blogs. He can be reached at smoell@amazon.de.