Streaming Kubernetes Events in Slack

IT operations teams know that detecting an issue early on can help them avert downtime and cascading failures. Many teams stay on top of infrastructure events by using built-in alert management capabilities in monitoring tools such as Prometheus and Amazon CloudWatch. However, these alert rules are configured centrally in monitoring tools, and engineers often receive too many event notifications without an easy way to unsubscribe. This deluge of events can overwhelm engineers while critical events go unnoticed.

Many teams adopt messaging tools such as Slack to keep track of events in their environments. A real-time messaging tool like Slack allows engineers to receive system events as notifications in a channel. Channel members can then reduce the noise by setting alerts for the relevant event types. Developers can also create elaborate event-based workflows using bots. Tools like Slack make collaborative troubleshooting easier. Engineers can keep tabs on recent infrastructure events and participate in troubleshooting without switching windows.

Teams can also use Slack to see what’s happening in their Kubernetes clusters. This post describes how you can send events from your Kubernetes cluster to a Slack channel using BotKube, a messaging bot for monitoring and debugging Kubernetes clusters. BotKube also allows you to run kubectl commands like kubectl get pods right from Slack.

Solution

Kubernetes events provide a running log of recent object state changes, which makes them extremely helpful when troubleshooting. In fact, events are the first thing most people look at. You can use kubectl get events to see recent events in your cluster. Tracking events over time helps you better understand your Kubernetes cluster’s behavior and prevent potential service disruptions.
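
As a minimal sketch of how kubectl get events can be narrowed down, the two helpers below sort events by creation time and filter for warnings. The function names recent_events and warning_events are hypothetical; only the kubectl flags come from kubectl itself.

```shell
# Sketch: two common ways to narrow down `kubectl get events` output.
# The function names are hypothetical helpers, not part of kubectl.

# List events from every namespace, oldest first.
recent_events() {
  kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp
}

# Show only Warning events in one namespace (falls back to "default").
warning_events() {
  kubectl get events --namespace "${1:-default}" --field-selector type=Warning
}
```

For example, warning_events eks-sample-app would show only the warnings raised in the namespace used later in this post.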

By default, Kubernetes events expire after 60 minutes. To retain events for a longer period of time, you’ll need a tool like k8s-event-logger that forwards events to another service for retention. BotKube works in a similar fashion: it watches Kubernetes events and forwards them to Slack, Microsoft Teams, and Mattermost.

Diagram showing the BotKube Architecture

BotKube Architecture

We’ll start by creating an Amazon EKS cluster and deploying a sample application. Next, we’ll install BotKube and configure it to forward events to a Slack channel. Finally, we’ll interact with the EKS cluster from the Slack channel.

Diagram of the BotKube - Slack Monitoring Workspace

Prerequisites

You will need the following to complete the walkthrough:

  • AWS CLI
  • eksctl
  • kubectl
  • Helm
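
The commands in this walkthrough call aws, eksctl, kubectl, and helm. A minimal sketch for checking that these tools are on your PATH before you begin is shown here; missing_tools is a hypothetical helper name.

```shell
# Sketch: report which of the given command-line tools are not on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (prints nothing when every tool is installed):
#   missing_tools aws eksctl kubectl helm
```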

Create an EKS cluster

Let’s start by setting a few environment variables:

export MONBOT_AWS_REGION=us-west-2 #<-- Change this to match your region
export MONBOT_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
export MONBOT_EKS_CLUSTER_NAME=eks-monitoring-bot-cluster
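
MONBOT_ACCOUNT_ID will be empty if the aws sts call fails, for example when no credentials are configured. The sketch below, using a hypothetical helper named require_vars, stops early when any of the named variables is empty or unset.

```shell
# Sketch: fail early if any named environment variable is empty or unset.
# require_vars is a hypothetical helper name.
require_vars() {
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Environment variable $name is not set" >&2
      return 1
    fi
  done
}

# Usage:
#   require_vars MONBOT_AWS_REGION MONBOT_ACCOUNT_ID MONBOT_EKS_CLUSTER_NAME
```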

Create a cluster using eksctl:

eksctl create cluster \
  --name $MONBOT_EKS_CLUSTER_NAME \
  --region $MONBOT_AWS_REGION \
  --managed

Creating a cluster can take up to 10 minutes. When the cluster is ready, proceed to the next steps.
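
One way to confirm that the cluster is ready is sketched below. It assumes the MONBOT_* variables set earlier and that eksctl has already written your kubeconfig; verify_cluster is a hypothetical helper name.

```shell
# Sketch: confirm the cluster is active and that kubectl can reach it.
# verify_cluster is a hypothetical helper; it relies on the MONBOT_* variables.
verify_cluster() {
  # Block until EKS reports the cluster as ACTIVE.
  aws eks wait cluster-active \
    --name "$MONBOT_EKS_CLUSTER_NAME" \
    --region "$MONBOT_AWS_REGION" || return 1
  # List the worker nodes; each should report a Ready status.
  kubectl get nodes
}
```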

Deploying a sample app

Create a sample app:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/containers-blog-maelstrom/main/sample-app.yaml

This manifest will create a Kubernetes deployment with three replicas.
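
To wait until the replicas are available, a sketch such as the following may help. The deployment name eks-sample-linux-deployment and the namespace eks-sample-app are taken from commands later in this post; wait_for_sample_app is a hypothetical helper name, and the 120-second timeout is an arbitrary choice.

```shell
# Sketch: wait for the sample deployment to finish rolling out.
# wait_for_sample_app is a hypothetical helper; the timeout is arbitrary.
wait_for_sample_app() {
  kubectl rollout status deployment/eks-sample-linux-deployment \
    --namespace eks-sample-app --timeout=120s
}
```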

Configure Slack

We recommend creating a test Slack workspace for BotKube for this tutorial. Follow the steps below to create a new Slack workspace:

  1. Go to your Slack application and click on the + button located at the bottom left side and select Create a new workspace. Alternatively, you can create a new workspace using Slack web client.
  2. This will open a page in your browser. Enter your email ID and click on Continue.
  3. On the next page, enter the passcode you received in your email. You will then see a pop-up on your browser to allow the site to open the Slack link with Slack. Select the checkbox and choose Open Link.
  4. You will now get a series of dialog boxes to enter a few details about your new Slack workspace.
    • In the Step 1 dialog box, enter BotKubeonEKS for the name of the company or team.
    • In the Step 2 dialog box, enter monitoringoneks for the project name.
    • In Step 3, choose Skip this step.

You will now see your new workspace BotKubeonEKS with a channel monitoringoneks opened in your Slack application.

If you are using an existing Slack workspace, create a new channel named monitoringoneks.

Install BotKube Slack app to your Slack workspace

BotKube needs permissions to send messages to the Slack channel in your workspace. You can add BotKube to your workspace by navigating to this link.

Select the newly created Slack workspace BotKubeonEKS on the top right corner of the webpage and choose Allow. Copy the BOT Access Token you see on the next page.

Image of the BOT Access token screen

Let’s store the access token in an environment variable:

# Replace the placeholder below with the Bot Access Token you copied earlier
export SLACK_API_TOKEN_FOR_THE_BOT="xoxb-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export SLACK_CHANNEL_NAME="monitoringoneks"
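
Slack bot tokens start with xoxb-. As a rough sanity check before installing BotKube, the sketch below rejects an empty value, a token with a different prefix, or the unchanged placeholder from the example above. looks_like_bot_token is a hypothetical helper, and the check does not contact Slack to validate the token.

```shell
# Sketch: check that a value looks like a Slack bot token.
# looks_like_bot_token is a hypothetical helper; it does not call Slack.
looks_like_bot_token() {
  case "$1" in
    *xxxxxxxx*) return 1 ;;  # still the placeholder from the example
    xoxb-?*)    return 0 ;;  # bot tokens start with "xoxb-"
    *)          return 1 ;;  # empty or a different token type
  esac
}

# Usage:
#   looks_like_bot_token "$SLACK_API_TOKEN_FOR_THE_BOT" || echo "Check the token"
```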

Add BotKube user to a Slack channel

You will now see a new bot user called BotKube in your Slack workspace. Select the channel monitoringoneks and enter @BotKube to invite this bot to receive notifications from the EKS cluster.

Installing BotKube backend in Kubernetes cluster

We’ll now use Helm to install BotKube in the EKS cluster.

Add infracloudio helm chart repository:

helm repo add infracloudio https://infracloudio.github.io/charts
helm repo update

Install BotKube Helm chart:

helm install --version v0.12.4 botkube --namespace eks-sample-app \
  --set communications.slack.enabled=true \
  --set communications.slack.channel=$SLACK_CHANNEL_NAME  \
  --set communications.slack.token=$SLACK_API_TOKEN_FOR_THE_BOT \
  --set config.settings.clustername=$MONBOT_EKS_CLUSTER_NAME \
  --set config.settings.kubectl.defaultNamespace=eks-sample-app \
  --set config.settings.kubectl.enabled=true \
  --set config.settings.kubectl.restrictAccess=true \
  --set image.repository=infracloudio/botkube \
  --set image.tag=v0.12.4 \
  infracloudio/botkube

Setting config.settings.kubectl.restrictAccess to true, as in the command above, limits command execution to the configured Slack channel.

Verify that the BotKube deployment is ready:

kubectl get deployments -n eks-sample-app

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/botkube                       1/1     1            1           29s
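
If the deployment does not become ready, or no messages arrive in Slack, the pod status and recent logs are a reasonable place to look. The sketch below assumes the botkube deployment name and eks-sample-app namespace shown above; inspect_botkube is a hypothetical helper name.

```shell
# Sketch: show the BotKube pod status and its most recent log lines.
# inspect_botkube is a hypothetical helper name.
inspect_botkube() {
  kubectl get pods --namespace eks-sample-app
  kubectl logs deployment/botkube --namespace eks-sample-app --tail=20
}
```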

Interact with Kubernetes through Slack

Now you can switch to Slack and execute kubectl commands from the monitoringoneks Slack channel. Send @BotKube ping, and BotKube will respond with a pong message from each configured Kubernetes cluster:

Image of the BotKube pong response

Running @BotKube commands list will give you the list of commands BotKube can execute:

Image shows BotKube commands list

Run @BotKube < kubectl command without kubectl prefix > on the Slack channel to get kubectl response from the configured Amazon EKS cluster. Run @BotKube get namespaces to list all namespaces in the Amazon EKS cluster:

Image of botkube get namespaces list

Run @BotKube get pods to list all the pods in the eks-sample-app namespace, which the Helm command above set as the default namespace for BotKube:

Image showing BotKube get pods listing the pods in the eks-sample-app namespace

Filters

BotKube can also help you identify common Kubernetes misconfigurations, and filters enable you to customize messages when a condition is met. For example, BotKube’s ImageTagChecker filter can check if a pod in your cluster uses the :latest image tag and send a customized message to Slack.

Run @BotKube filters list to get a list of available filters:

Image shows list of BotKube filters

Filters are enabled by default. You can disable them by running @BotKube filters disable <FILTER NAME>:

List of bot kube filters

To re-enable a filter, run @BotKube filters enable <FILTER NAME>. For example, @BotKube filters enable PodLabelChecker re-enables the PodLabelChecker filter:

Image showing running the command BotKube filters enable podlabel checker

You can test the ImageTagChecker filter by setting the image tag in the sample deployment to :latest. Run kubectl edit deployment eks-sample-linux-deployment -n eks-sample-app, change the container image tag to latest, and save the file. Switch back to Slack, and you will see the following alert in your channel:

Image shows a BotKube Alert
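
The kubectl edit step above can also be done non-interactively. The following is a sketch only: it assumes the image to change belongs to the first container in the deployment, and with_latest_tag and use_latest_tag are hypothetical helper names.

```shell
# Sketch: switch the sample deployment to the :latest tag without an editor.
# Both function names are hypothetical helpers.

# Replace (or add) the tag on an image reference with "latest".
with_latest_tag() {
  image=${1%%@*}             # drop any @sha256:... digest
  last_segment=${image##*/}  # text after the final "/"
  case "$last_segment" in
    *:*) image=${image%:*} ;;  # strip the existing tag
  esac
  printf '%s:latest\n' "$image"
}

# Point the first container of the sample deployment at the :latest tag.
use_latest_tag() {
  deploy=deployment/eks-sample-linux-deployment
  ns=eks-sample-app
  container=$(kubectl get "$deploy" -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[0].name}') || return 1
  current=$(kubectl get "$deploy" -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[0].image}') || return 1
  kubectl set image "$deploy" "$container=$(with_latest_tag "$current")" -n "$ns"
}
```

The tag is only stripped when the colon appears in the last path segment, so a registry address that includes a port is left intact.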

Retaining Kubernetes events

Customers that want to retain Kubernetes events can export Kubernetes events to a centralized log retention service such as Amazon CloudWatch Container Insights or Amazon OpenSearch Service.

There are several open-source utilities that read Kubernetes events and emit them as logs. A cluster-wide log aggregation tool like Fluent Bit or Fluentd can then ship those events to a centralized service for retention. Two popular implementations are:

If you use an AWS Partner solution for log aggregation, check whether it offers a native option. For example, Datadog, Instana, and Dynatrace provide a built-in way to collect Kubernetes events.
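
To illustrate the idea behind utilities that emit events as logs, the sketch below streams events from all namespaces to standard output as JSON, where a log aggregator could pick them up. This is an illustration only, not how any particular exporter is implemented; a real exporter would typically also handle reconnects and deduplication. stream_events is a hypothetical helper name.

```shell
# Sketch: stream cluster events to stdout as JSON so a log agent can collect them.
# stream_events is a hypothetical helper; it runs until interrupted.
stream_events() {
  kubectl get events --all-namespaces --watch --output json
}
```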

Cleanup

Use the following commands to delete resources created during this post:

helm delete botkube -n eks-sample-app
kubectl delete ns eks-sample-app
eksctl delete cluster --name $MONBOT_EKS_CLUSTER_NAME --region $MONBOT_AWS_REGION
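
Once the commands above have completed, you may also want to clear the environment variables set during this post, in particular the one holding the Slack token. A minimal sketch:

```shell
# Sketch: clear the variables exported earlier in this walkthrough.
# Run this only after the delete commands above have finished.
unset MONBOT_AWS_REGION MONBOT_ACCOUNT_ID MONBOT_EKS_CLUSTER_NAME
unset SLACK_API_TOKEN_FOR_THE_BOT SLACK_CHANNEL_NAME
```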

Use the following steps to clean up your Slack workspace:

  • Go to the Slack manage apps page.
  • Select BotKube and then select the Remove App button.

Conclusion

Understanding the events in your cluster can give you insights into misconfigurations, bottlenecks, and errors in your Kubernetes cluster. This post describes how you can monitor your Kubernetes cluster events using Slack. We hope this post helps you detect problems early and improve the reliability and availability of your cluster and its applications.

Elamaran Shanmugam

Elamaran (Ela) Shanmugam is a Sr. Container Specialist Solutions Architect with AWS. Ela is a Container, Observability, and Multi-Account Architecture SME and helps customers design and build scalable, secure, and optimized container workloads on AWS. His passion is building and automating infrastructure to allow customers to focus more on their business. He is based out of Tampa, Florida, and you can reach him on Twitter @IamElaShan.

Jayaprakash Alawala

Jayaprakash Alawala is a Sr. Container Specialist Solutions Architect at AWS. He helps customers modernize applications and build large-scale applications using various AWS services. He has expertise in containers, microservices, DevOps, security, cost optimization (including EC2 Spot), and technical training. Outside of work, he loves spending time reading and traveling. You can reach him on Twitter @JP_Alawala.

Re Alvarez-Parmar

In his role as Containers Specialist Solutions Architect at Amazon Web Services, Re advises engineering teams on modernizing and building distributed services in the cloud. Prior to joining AWS, he spent more than 15 years as an Enterprise and Software Architect. He is based out of Seattle. Connect on LinkedIn at: linkedin.com/in/realvarez/