The Internet of Things on AWS – Official Blog

Monitor AWS IoT connections in near-real time using MQTT LWT

In a connected-device deployment, you may need to monitor devices in near-real time to detect errors and take mitigating actions. The Last Will and Testament (LWT) feature of MQTT addresses this challenge. LWT is a standard part of the MQTT protocol specification that makes it possible to detect abrupt device disconnects and to notify other clients about them.

IoT devices are often used in environments with unreliable network connectivity, and they might disconnect because of a lost power supply, a low battery, a dropped connection, or any other reason. Such a device drops off the broker abruptly, and other clients cannot tell whether the client chose to disconnect or the connection was lost unexpectedly. This is where LWT helps: it lets a client provide a testament along with its credentials when connecting to AWS IoT Core. If the client later disconnects abruptly (e.g. because of power loss), AWS IoT Core delivers the LWT message to other clients to inform them of the abrupt disconnect.

MQTT Version 3.1.1 provides the LWT feature as part of the connection request, and AWS IoT Core supports it, so any client can specify its LWT message along with an MQTT topic when it connects to the broker. When the client disconnects abruptly, the broker (AWS IoT Core) publishes the LWT message provided by that client at connection time to all the clients subscribed to this LWT topic.
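To make the behavior described above concrete, here is a minimal in-memory sketch. It is not AWS IoT Core and not an MQTT implementation; `ToyBroker` and its method names are hypothetical and exist only to illustrate when a stored will is published and when it is discarded.

```python
class ToyBroker:
    """In-memory stand-in for an MQTT broker; illustrates LWT handling only."""

    def __init__(self):
        self.wills = {}      # client_id -> (topic, payload) registered at connect time
        self.published = []  # (topic, payload) pairs the broker has published

    def connect(self, client_id, will_topic=None, will_payload=None):
        # The will travels with the connection request and is stored by the broker.
        if will_topic is not None:
            self.wills[client_id] = (will_topic, will_payload)

    def disconnect(self, client_id):
        # Graceful disconnect: the stored will is discarded and never published.
        self.wills.pop(client_id, None)

    def connection_lost(self, client_id):
        # Abrupt disconnect: the broker publishes the stored will for the client.
        will = self.wills.pop(client_id, None)
        if will is not None:
            self.published.append(will)


broker = ToyBroker()
broker.connect("lwtThing", "/last/will/topic", '{"last_will": "yes"}')
broker.connection_lost("lwtThing")  # e.g. power loss: the will is published
broker.connect("lwtThing", "/last/will/topic", '{"last_will": "yes"}')
broker.disconnect("lwtThing")       # clean shutdown: the will is dropped
print(broker.published)
```

Only the first session ends abruptly, so only one will message ends up in `broker.published`.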

The MQTT LWT feature enables you to monitor AWS IoT connections in near-real time to help you to take corrective actions. You can react to abrupt disconnection events by verifying status, restoring connections, and carrying out either edge-based (device side) actions or cloud-based actions to investigate and mitigate this abrupt disconnect of the device.

In this blog we will go through the following steps:

  1. A simulated ‘lwtThing’ device connects to AWS IoT Core and specifies a keep-alive time
  2. On connecting to AWS IoT Core, the ‘lwtThing’ device provides the following:
    1. Topic for LWT (e.g. /last/will/topic)
    2. LWT message
    3. QoS level, either 0 or 1
  3. The ‘lwtThing’ device disconnects abruptly from AWS IoT Core
  4. AWS IoT Core detects this and publishes the LWT message to all the subscribers of the topic (e.g. /last/will/topic)
  5. Rules for AWS IoT (the rules engine) picks up the message on the topic and invokes Amazon Simple Notification Service (Amazon SNS)
  6. Amazon SNS sends a notification email

We will set up a virtual environment using a CloudFormation template (following the AWS IoT workshop setup instructions) and launch a virtual IoT thing (named ‘lwtThing’) to simulate a real-life physical device.

Architecture

We will simulate the edge device using the script provided below, which registers the LWT message and then disconnects abruptly. The disconnect triggers an AWS IoT rule, which in turn invokes Amazon SNS to send an email.

Setup

We will use the following workshop setup to get bootstrapped quickly and test LWT. You can use the following link to set up an AWS Cloud9 environment (pick the region closest to your location).

Once the environment is set up using the workshop’s pre-provided AWS CloudFormation template, let’s begin testing ungraceful disconnects with AWS IoT Core (the AWS MQTT broker in the cloud).

Now open the Cloud9 terminal (see here) and let’s set up the Python SDK.

Using the Cloud9 terminal window, create a folder that we will use to connect our IoT thing.

mkdir -p /home/ubuntu/environment/lwt/certs
cd /home/ubuntu/environment/lwt/

Set up the AWS IoT Device SDK for Python using the full instructions here.

Quick instructions:

git clone https://github.com/aws/aws-iot-device-sdk-python.git
cd aws-iot-device-sdk-python
python setup.py install

Now, to set up your AWS IoT thing, follow the steps outlined here.

Once we have created the thing, let’s upload its certificates to our Cloud9 instance so we can connect from there.

Upload the newly created certificates and the root CA into the following folder (created earlier):

/home/ubuntu/environment/lwt/certs

LWT thing messages

Let’s copy the Python code to Cloud9 and execute it as the simulated AWS IoT thing.

Create the file with the following command:

touch lwtTest.py

Open the file and copy the following code into it.

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient
import logging
import time
import argparse
import json

AllowedActions = ['both', 'publish', 'subscribe']

# Custom MQTT message callback
def customCallback(client, userdata, message):
    print("Received a new message: ")
    print(message.payload)
    print("from topic: ")
    print(message.topic)
    print("--------------\n\n")

# LWT JSON payload
payload = {
    "state": {
        "reported": {
            "last_will": "yes",
            "trigger_action": "on",
            "client_id": "lwtThing"
        }
    }
}

# Serialize the payload to a JSON string with json.dumps()
jsonPayload = json.dumps(payload)

# Uncomment to print the serialized payload
# print(jsonPayload)


# Read in command-line parameters
parser = argparse.ArgumentParser()
parser.add_argument("-e", "--endpoint", action="store", required=True, dest="host", help="Your AWS IoT custom endpoint")
parser.add_argument("-r", "--rootCA", action="store", required=True, dest="rootCAPath", help="Root CA file path")
parser.add_argument("-c", "--cert", action="store", dest="certificatePath", help="Certificate file path")
parser.add_argument("-k", "--key", action="store", dest="privateKeyPath", help="Private key file path")
parser.add_argument("-p", "--port", action="store", dest="port", type=int, help="Port number override")
parser.add_argument("-w", "--websocket", action="store_true", dest="useWebsocket", default=False,
                    help="Use MQTT over WebSocket")
parser.add_argument("-id", "--clientId", action="store", dest="clientId", default="basicPubSub",
                    help="Targeted client id")
parser.add_argument("-t", "--topic", action="store", dest="topic", default="sdk/test/Python", help="Targeted topic")
parser.add_argument("-m", "--mode", action="store", dest="mode", default="both",
                    help="Operation modes: %s"%str(AllowedActions))
parser.add_argument("-M", "--message", action="store", dest="message", default="AWS IoT Thing connected message to IoT Core",
                    help="Message to publish")

args = parser.parse_args()
host = args.host
rootCAPath = args.rootCAPath
certificatePath = args.certificatePath
privateKeyPath = args.privateKeyPath
port = args.port
useWebsocket = args.useWebsocket
clientId = args.clientId
topic = args.topic

if args.mode not in AllowedActions:
    parser.error("Unknown --mode option %s. Must be one of %s" % (args.mode, str(AllowedActions)))
    exit(2)

if args.useWebsocket and args.certificatePath and args.privateKeyPath:
    parser.error("X.509 cert authentication and WebSocket are mutual exclusive. Please pick one.")
    exit(2)

if not args.useWebsocket and (not args.certificatePath or not args.privateKeyPath):
    parser.error("Missing credentials for authentication.")
    exit(2)

# Port defaults
if args.useWebsocket and not args.port:  # When no port override for WebSocket, default to 443
    port = 443
if not args.useWebsocket and not args.port:  # When no port override for non-WebSocket, default to 8883
    port = 8883

# Configure logging - we will see messages on STDOUT
logger = logging.getLogger("AWSIoTPythonSDK.core")
logger.setLevel(logging.DEBUG)
streamHandler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
streamHandler.setFormatter(formatter)
logger.addHandler(streamHandler)

# Init AWSIoTMQTTClient
myAWSIoTMQTTClient = None
if useWebsocket:
    myAWSIoTMQTTClient = AWSIoTMQTTClient(clientId, useWebsocket=True)
    myAWSIoTMQTTClient.configureEndpoint(host, port)
    myAWSIoTMQTTClient.configureCredentials(rootCAPath)
else:
    myAWSIoTMQTTClient = AWSIoTMQTTClient(clientId)
    myAWSIoTMQTTClient.configureEndpoint(host, port)
    myAWSIoTMQTTClient.configureCredentials(rootCAPath, privateKeyPath, certificatePath)

#########
# Will Topic
# Input parameters are: Topic, Last will message and finally QoS
myAWSIoTMQTTClient.configureLastWill('/last/will/topic', jsonPayload, 0)
#########


# Connect to AWS IoT Core
# Keep-alive connect parameter: 30 seconds
myAWSIoTMQTTClient.connect(30)
print("Connected!")
loopCount = 1
while loopCount < 2:
    if args.mode == 'both' or args.mode == 'publish':
        message = {}
        message['message'] = args.message
        messageJson = json.dumps(message)
        myAWSIoTMQTTClient.publish(topic, messageJson, 1)
        if args.mode == 'publish':
            print('Published topic %s: %s\n' % (topic, messageJson))
    # Always advance the counter so the loop ends in every mode
    loopCount += 1
# Sleep for 60 seconds, then let the script exit without calling disconnect(),
# so AWS IoT Core sees an abrupt disconnect and publishes the LWT message
print("--- Putting device to sleep now, so IoT core keep-alive time expires. ---")
print("--- We will abruptly disconnect the device after 60 seconds. ---")
time.sleep(60)


Let’s look at the following line, which does all the work of setting the LWT topic, the JSON payload, and the QoS level we are using.

myAWSIoTMQTTClient.configureLastWill('/last/will/topic', jsonPayload, 0)
  • Topic used is: /last/will/topic
  • QoS (Quality of Service) is: 0
  • The jsonPayload variable contains the following payload:
{
  "state": {
    "reported": {
      "last_will": "yes",
      "trigger_action": "on",
      "client_id": "lwtThing"
    }
  }
}

The above setup defines the LWT topic as well as the message to post to it. The message will be picked up and acted on by AWS IoT rules once the device disconnects abruptly (the “Last Will” is published by the server when its connection to the client is unexpectedly lost). An AWS IoT rule will trigger the Amazon SNS action to send an email upon its execution. You can read more on the other options in the SDK documentation.
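The script hardcodes "lwtThing" in the payload, while the client ID comes from the command line. As a small sketch of how the two could be kept in sync, the helper below builds the same payload for any client ID and rejects QoS values other than 0 or 1. `build_last_will` and `LWT_TOPIC` are hypothetical names introduced here for illustration; they are not part of the SDK.

```python
import json

LWT_TOPIC = "/last/will/topic"  # topic used throughout this post


def build_last_will(client_id, qos=0):
    """Return (topic, payload, qos) in the shape configureLastWill expects."""
    if qos not in (0, 1):
        raise ValueError("LWT QoS must be 0 or 1")
    payload = {
        "state": {
            "reported": {
                "last_will": "yes",
                "trigger_action": "on",
                "client_id": client_id,
            }
        }
    }
    return LWT_TOPIC, json.dumps(payload), qos


topic, will_payload, qos = build_last_will("lwtThing")
print(topic, will_payload, qos)
# In lwtTest.py the result could then be passed on as:
# myAWSIoTMQTTClient.configureLastWill(topic, will_payload, qos)
```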

We set the keep-alive to 30 seconds when connecting to AWS IoT Core. The keep-alive is the maximum interval the broker expects between packets from the client; if the broker hears nothing for longer than that (the MQTT specification allows a grace period of one and a half times the keep-alive), it treats the connection as lost.

After publishing, the script puts the device to sleep for 60 seconds and then exits without a clean MQTT disconnect. AWS IoT Core treats this as an abrupt disconnect, generates the Last Will and Testament (LWT) trigger, and publishes the message to all subscribers listening on the LWT topic.
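As a rough illustration of the keep-alive arithmetic, the sketch below applies the rule from the MQTT 3.1.1 specification, under which a server drops a client it has not heard from within one and a half times the keep-alive. The function names are hypothetical, and the exact moment at which AWS IoT Core publishes the LWT message may differ from this simple model.

```python
def broker_grace_period(keep_alive_seconds, factor=1.5):
    """Seconds of silence after which an MQTT 3.1.1 broker drops the client."""
    if keep_alive_seconds <= 0:
        raise ValueError("keep-alive must be positive in this sketch")
    return keep_alive_seconds * factor


def connection_considered_lost(seconds_since_last_packet, keep_alive_seconds):
    """True once the silence exceeds the grace period."""
    return seconds_since_last_packet > broker_grace_period(keep_alive_seconds)


print(broker_grace_period(30))             # 45.0
print(connection_considered_lost(40, 30))  # False
print(connection_considered_lost(50, 30))  # True
```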

Set up Amazon SNS

Let’s set up Amazon SNS and configure it to send email notifications. From the Amazon SNS console, do the following:

  • Select Topics
    • Select Create topic
      • Select Standard
      • Enter a Name (e.g. lwtSNSTopic)
      • Enter a Display name (e.g. lwtSNSTopic)
      • Select Create topic
    • Once the topic is created
      • Select Create subscription
      • Select Email from the Protocol dropdown
      • For Endpoint, enter the email address you would like to use
      • Select Create subscription

You should receive a confirmation email. Confirm the subscription; until you do, you will not receive any notification emails.
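If you prefer scripting over the console, the same two steps can be sketched with the SNS client from boto3. This is an alternative to the console walkthrough, not something the original setup requires. The helper name `create_email_topic` is hypothetical, the email address is a placeholder, and the code assumes your credentials are allowed to create topics and subscriptions.

```python
def create_email_topic(sns_client, topic_name, email_address):
    """Create a standard SNS topic plus an email subscription; return the topic ARN."""
    response = sns_client.create_topic(
        Name=topic_name,
        Attributes={"DisplayName": topic_name},
    )
    topic_arn = response["TopicArn"]
    # The email subscription stays pending until the recipient confirms it.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email_address)
    return topic_arn


# Example usage (requires boto3 and AWS credentials; the address is a placeholder):
# import boto3
# arn = create_email_topic(boto3.client("sns"), "lwtSNSTopic", "you@example.com")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.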

Set up Rules for AWS IoT Core

From the AWS IoT Core console do the following:

  • Select Act
    • Select Rules
    • Select Create
    • Give a name (e.g. lastWillRule) and description (e.g. My first LWT rule)
    • In Rule query statement, enter the following:
      • SELECT * FROM '/last/will/topic' WHERE state.reported.last_will = 'yes' AND state.reported.trigger_action = 'on'
    • In the Actions section
      • Select Add Action
      • Select Send a message to an SNS push notification
      • Select Configure action
      • In SNS target, select the SNS topic you created earlier (e.g. lwtSNSTopic)
      • In Message format, select JSON
      • Select Create Role
      • Give it a name (e.g. lwtRuleRole)
      • Select Add action
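To check a payload against the rule's condition before sending anything to AWS IoT Core, the WHERE clause can be mimicked locally. This is only an approximation in plain Python, not the AWS IoT SQL engine, and `matches_lwt_rule` is a hypothetical helper name.

```python
import json


def matches_lwt_rule(raw_payload):
    """Mimic the rule's WHERE clause for a message arriving on /last/will/topic."""
    try:
        reported = json.loads(raw_payload)["state"]["reported"]
    except (ValueError, KeyError, TypeError):
        return False  # not JSON, or the expected fields are missing
    if not isinstance(reported, dict):
        return False
    return reported.get("last_will") == "yes" and reported.get("trigger_action") == "on"


sample = '{"state": {"reported": {"last_will": "yes", "trigger_action": "on", "client_id": "lwtThing"}}}'
print(matches_lwt_rule(sample))          # True
print(matches_lwt_rule('{"state": {}}'))  # False
```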

Let’s add another action here: we will republish the incoming LWT message to another topic to verify that it arrives.

    • In Actions section
      • Select Add Action
      • Select Republish a message to an AWS IoT topic
      • Select Configure action
      • Under Topic
        • Enter /lwt/executed
        • We can leave the Quality of Service at its default
        • For ‘Choose or create a role to grant AWS IoT access to perform this action’
          • Select lwtRuleRole
          • Select Update role
        • Select Add action

This concludes our rules setup. Let’s proceed to send LWT messages and exercise our setup.
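For reference, the rule configured above can also be expressed as the payload accepted by the `create_topic_rule` call of the boto3 IoT client. The sketch below only builds that payload; `build_lwt_rule_payload` is a hypothetical helper, and it assumes that an IAM role allowing AWS IoT to publish to the SNS topic and to republish to /lwt/executed already exists (the console creates and updates that role for you).

```python
RULE_SQL = (
    "SELECT * FROM '/last/will/topic' "
    "WHERE state.reported.last_will = 'yes' "
    "AND state.reported.trigger_action = 'on'"
)


def build_lwt_rule_payload(sns_topic_arn, role_arn):
    """Build a topicRulePayload with the SNS and republish actions from this post."""
    return {
        "sql": RULE_SQL,
        "description": "My first LWT rule",
        "ruleDisabled": False,
        "actions": [
            {"sns": {"targetArn": sns_topic_arn, "roleArn": role_arn, "messageFormat": "JSON"}},
            {"republish": {"topic": "/lwt/executed", "roleArn": role_arn}},
        ],
    }


# Example usage (requires boto3 and AWS credentials; the ARNs are placeholders):
# import boto3
# boto3.client("iot").create_topic_rule(
#     ruleName="lastWillRule",
#     topicRulePayload=build_lwt_rule_payload(sns_topic_arn, role_arn),
# )
```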

Sending LWT messages

Before we execute the simulated device (using Python code), let’s subscribe to the topic in the AWS IoT Core console.

Figure 2

Now that we have everything in place, let’s execute the IoT thing (simulated using Python code). You can use the sample execution command below; yours may differ, because your thing ID or the location of your certificates might be different.

Sample command (replace xxxx with relevant values for your setup):

python lwtTest.py -e xxxxxxxxxxxxxx-ats.iot.us-east-1.amazonaws.com -r /home/ubuntu/environment/lwt/certs/AmazonRootCA1.pem -c /home/ubuntu/environment/lwt/certs/xxxxxxxxxxxxxxxxxxxxxxxxxxxx-certificate.pem.crt -k /home/ubuntu/environment/lwt/certs/xxxxxxxxxxxxxxxxxxxxxxxxxxxx-private.pem.key -id lwtThing -t /lwt/connected/topic -m publish

The input parameters we pass to the code are as follows:

  • -e is the AWS IoT Core endpoint
  • -r is the full file path of the Amazon root CA
  • -c is the full file path of our certificate
  • -k is the full file path of our private key
  • -id is the client ID we send to AWS IoT Core (it should match the name of the thing you created in AWS IoT Core)
  • -t is the topic the device publishes to when it first connects to AWS IoT Core
  • -m is the mode defined in the code; we will use publish for this test (available modes are publish, subscribe, or both)

Let’s look at the execution of the command. We should see that the LWT is configured and which message we published to AWS IoT Core. You will also see the abrupt disconnect after 60 seconds.

Figure 3

Switch over to the AWS IoT Core console to see incoming messages, and subscribe to the following topics:

  • Topic used for republishing the message when the rule is executed (used for debugging): /lwt/executed
  • Topic on which the LWT message is published upon ungraceful disconnect of a client: /last/will/topic
  • Topic on which you can see messages posted by the thing: /lwt/connected/topic. The thing posts here when it connects to AWS IoT Core, to inform the broker that it is connected.
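The same three subscriptions could also be made from a second script instead of the console. The sketch below assumes an already connected `AWSIoTMQTTClient` from the SDK used earlier, a client ID different from lwtThing (a second connection with the same client ID would displace the first), and a policy that allows subscribing to these topics. `subscribe_all` and `print_message` are hypothetical helper names.

```python
MONITORED_TOPICS = ["/lwt/executed", "/last/will/topic", "/lwt/connected/topic"]


def print_message(client, userdata, message):
    # Same callback signature as customCallback in lwtTest.py
    print("%s: %s" % (message.topic, message.payload))


def subscribe_all(mqtt_client, topics=MONITORED_TOPICS, qos=1, callback=print_message):
    """Subscribe the given client to every topic; return the list of topics."""
    for topic in topics:
        mqtt_client.subscribe(topic, qos, callback)
    return list(topics)


# Example usage with a connected AWSIoTMQTTClient named monitorClient:
# subscribe_all(monitorClient)
```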

Figure 4

Under topic /last/will/topic we can see the message published by AWS IoT Core once the device ungracefully disconnects.

Figure 5

When the AWS IoT rule is executed for the LWT, we can see that the payload is also published to the topic /lwt/executed. We configured this topic earlier as the republish target for when the AWS IoT rule is executed upon an abrupt device disconnection.

Figure 6

Upon successful execution of the AWS IoT rule, we also triggered an Amazon SNS email notification. If you configured this correctly earlier, you will see a similar email in your inbox.

Figure 7

Conclusion

In this blog we looked at how you can use AWS IoT Core to detect device errors, failures, and abrupt disconnections, and how an abrupt disconnection can trigger an Amazon SNS email notification to a support team that can quickly investigate, mitigate the failure, and resolve issues at scale. If the thing closes the connection properly, in the recommended manner, AWS IoT Core disregards the LWT that was set at connection time. By using LWT, we can implement many error-handling scenarios in which a client’s connection drops and other clients depend on that connection. For example, when an industrial gateway responsible for gathering sensor data across the factory floor experiences an abrupt disconnection from AWS IoT Core, you can monitor those disconnections and take corrective measures to reduce second-degree impact downstream. You can read more here about MQTT and SNS.

About the author

Syed Rehan is a Sr. Global Specialist Solutions Architect at Amazon Web Services (AWS) and is based in London. He covers a global span of customers and supports them as a lead IoT Solutions Architect. Syed has in-depth knowledge of IoT and cloud and works in this role with global customers, ranging from start-ups to enterprises, to enable them to build IoT solutions with the AWS ecosystem.