AWS Open Source Blog

Easily Running Open Policy Agent Serverless with AWS Lambda and Amazon API Gateway

Open Policy Agent (OPA) is an open source general-purpose policy engine, licensed under the Apache License 2.0, that allows you to decouple policy decision-making from application code. OPA assists organizations in effectively implementing policy as code. It allows policy to be expressed through a high-level declarative language (Rego), and it also allows policy authoring to be decentralized and distributed to policy owners. By codifying policy, organizations can create context-aware policies that adapt to changes in the environment or in data, allowing for advanced automation.

OPA is commonly used in cloud-native environments and run as a service or in a container. Because OPA decisions are stateless, OPA is a great candidate to run in a serverless architecture for cost savings, simplicity, and performance. AWS Lambda is a serverless, event-driven compute service and the platform we will use to run OPA. We will couple Lambda with Amazon API Gateway to create a seamless experience that mirrors running OPA as a service.

Lambda and OPA are both versatile, and there are various ways you can run OPA in Lambda. We previously published blogs demonstrating how to run OPA as an executable called within Lambda, and how to import OPA libraries into Lambda code. OPA also has a blog that discusses how to run OPA as a Lambda extension.

While these methods work, they can be challenging to implement at scale as they require you to compile custom versions of OPA code for a specific purpose. They also do not take advantage of OPA server mode, requiring you to develop your own handler for decision-making. In this blog post, we will demonstrate how to run OPA as a service within a container in Lambda using just the standard precompiled OPA binary. This will allow you to take advantage of OPA’s built-in REST API while still getting the performance and cost savings of Lambda, with no code to customize or manage besides your actual Rego policy code.

Let’s get started!

Solution

During AWS re:Invent 2020, AWS announced the ability to run containers within Lambda. This allows users to easily move almost any system to Lambda. Containers within Lambda use the Lambda runtime interface to retrieve a Lambda invocation and provide a response back to the Lambda service. AWS publishes base images with Lambda runtimes for several popular programming languages, and a runtime API is also available to interact with the Lambda service through HTTP calls.

Our solution will use Lambda’s container support. We will create a custom Docker image that contains our OPA executable, Rego policy bundle, and two shell scripts that serve as our Lambda handler. Lambda will pass invocations to our handler, which will proxy requests and responses between Lambda and the OPA instance running in our container. We will add API Gateway on top to receive OPA requests from clients and pass them to the Lambda function. Lambda will process the requests in our container and send the responses back to API Gateway to pass on to the requesting client.

Architecture diagram showing interaction between components.

Rego policy

First, let’s create our Rego policy bundle. We will have a simple hello world policy that will give a response in various languages. The phrases will be stored in a data.json file. Here is what our files look like:

hello.rego:

package hello

hello = m {
    m := data.greetings[input.lang]
}

data.json:

{
    "greetings": {
        "en": "hello world",
        "es": "hola mundo",
        "fr": "bonjour le monde"
    }
}
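
The Dockerfile later in this post copies a file named bundle.tar.gz into the image. One way to produce it, sketched here, is to archive the two files with tar, since an OPA bundle is a gzipped tarball of policy and data files. The scratch directory is an assumption for illustration; the opa build command from the OPA CLI is another option if you have it installed.

```shell
set -eu

# Work in a scratch directory (hypothetical location, for illustration only).
BUNDLE_DIR="$(mktemp -d)"

# Write the policy shown above.
cat > "$BUNDLE_DIR/hello.rego" <<'EOF'
package hello

hello = m {
    m := data.greetings[input.lang]
}
EOF

# Write the data document shown above.
cat > "$BUNDLE_DIR/data.json" <<'EOF'
{
    "greetings": {
        "en": "hello world",
        "es": "hola mundo",
        "fr": "bonjour le monde"
    }
}
EOF

# Package both files at the root of a gzipped tarball.
tar -czf "$BUNDLE_DIR/bundle.tar.gz" -C "$BUNDLE_DIR" hello.rego data.json

# List the archive contents to confirm what went in.
tar -tzf "$BUNDLE_DIR/bundle.tar.gz"
```

Copy the resulting bundle.tar.gz next to your Dockerfile and scripts before building the image.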

Lambda handler

Next, we have our shell scripts, which are also straightforward. The first script starts OPA in server mode on our container. It then launches the second script, either directly if running on Lambda or through the runtime interface emulator if running locally.

start.sh:

#!/bin/sh
#Gracefully exit if killed
exit_script() {
    echo "Shutting down..."
    trap - SIGINT SIGTERM # clear the trap
}
trap exit_script SIGINT SIGTERM
#Run OPA in server mode and load bundle
echo "Starting Open Policy Agent"
exec /opa/opa run -s /opa/ &
#If running locally load Runtime Interface Emulator and handler, otherwise just handler
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
    echo "Running Locally - Starting RIE and Handler"
    exec /usr/local/bin/aws-lambda-rie /var/runtime/bootstrap.sh
else
    echo "Running on Lambda - Starting Handler..."
    exec /var/runtime/bootstrap.sh
fi

The second shell script is our actual Lambda handler where our execution happens. The handler waits for a message from Lambda and then extracts the Lambda request ID and the event contents. These contents contain the OPA document path, HTTP method, and payload. The payload is then sent to the document path on the OPA instance on our container using the method specified. The response from OPA is then sent back to the Lambda service interface, along with the Lambda request ID. The script runs in a loop, as Lambda expects the container to run continuously.

bootstrap.sh:

#!/bin/bash

set -euo pipefail

#The handler needs to be running continuously to receive events from Lambda so we put it in a loop
while true
do
  HEADERS="$(mktemp)"
  # Grab an invocation event and write to temp file, this step will be blocked by Lambda until an event is received
  curl -sS -LD "$HEADERS" -X GET "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/next" -o /tmp/event.data

  # Extract request ID by scraping response headers received above
  REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)

  # Extract OPA variables from temp file created event and delete temp file
  OPA_PATH=$(jq -r ".x_opa_path" </tmp/event.data)
  OPA_METHOD=$(jq -r ".x_opa_method" </tmp/event.data)
  OPA_PAYLOAD=$(jq -r  ".x_opa_payload" </tmp/event.data)
  rm /tmp/event.data

  # Remove leading / in OPA path if included in request
  OPA_PATH=${OPA_PATH#/}
  echo "$OPA_PATH"

  # Pass payload to OPA using the requested HTTP method and get the response
  RESPONSE=$(curl -s -X "$OPA_METHOD" "http://localhost:8181/${OPA_PATH}" -d "$OPA_PAYLOAD" -H "Content-Type: application/json")

  # Send Response to Lambda
  curl -s -X POST "http://${AWS_LAMBDA_RUNTIME_API}/2018-06-01/runtime/invocation/$REQUEST_ID/response"  -d "$RESPONSE" -H "Content-Type: application/json"
done
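
Two of the handler's steps, extracting the request ID from the response headers and trimming the leading slash from the path, can be tried in isolation without Lambda. The sketch below feeds the same grep, tr, and cut pipeline a sample header file; the request ID value and the normalize_path helper are made up for illustration.

```shell
set -eu

# Sample response headers, standing in for what curl -D writes to "$HEADERS".
# The request ID value here is invented for this example.
HEADERS="$(mktemp)"
printf 'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nLambda-Runtime-Aws-Request-Id: 8476a536-e9f4-11e8-9739-2dfe598c3fcd\r\n\r\n' > "$HEADERS"

# Same pipeline as the handler: find the header, strip all whitespace
# (including the trailing carriage return), and keep the value after the colon.
REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '[:space:]' | cut -d: -f2)
echo "$REQUEST_ID"

# Strip one leading slash from a path, if present (hypothetical helper).
normalize_path() {
  printf '%s\n' "${1#/}"
}
normalize_path "/v0/data/hello"
normalize_path "v0/data/hello"

rm -f "$HEADERS"
```

Both calls to normalize_path print the same path, which is why the handler accepts the document path with or without a leading slash.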

Now that we have our code set, we will put together our Dockerfile, as shown below. We will start with the Lambda-provided Amazon Linux 2 (AL2) base image, which already has the necessary configurations and tools to interact with the Lambda runtime API and emulator. On top of the base image, we will install jq to help parse JSON, add the OPA binary, copy over our Rego bundle, and finally copy over our shell scripts.

Dockerfile:

FROM amazon/aws-lambda-provided:al2
# Update image and install JQ to parse json
RUN yum -y update
RUN yum -y install jq

# Create OPA directory, download OPA, and copy bundle
RUN mkdir /opa
WORKDIR /opa
RUN curl -L -o opa https://openpolicyagent.org/downloads/v0.37.2/opa_linux_amd64_static
RUN chmod +x opa
COPY bundle.tar.gz .

# Create runtime directory and copy shell scripts
WORKDIR /var/runtime
COPY start.sh .
COPY bootstrap.sh .
RUN chmod +x bootstrap.sh
RUN chmod +x start.sh

# Start Handler
ENTRYPOINT ["/var/runtime/start.sh"]

Build and deploy the Docker image

We are ready to build and deploy our Docker image. First, we must create a repository (repo) in Amazon Elastic Container Registry (ECR) to store our image. Follow steps 1-11 on the Creating a private repository page of the Amazon ECR documentation. Be sure to note your repo name.

After creating our repo, we need to configure the Docker client to use our Amazon ECR repo. Run the following command, which will use AWS Command Line Interface (CLI) to retrieve Amazon ECR credentials and then authenticate the Docker client to Amazon ECR. Be sure to replace region and aws_account_id with the appropriate values. The Docker client will remain logged in for 12 hours.

aws ecr get-login-password --region region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com

Now let’s build our actual image and tag it so that it can be pushed to our repo. Run the following commands in the directory where you have your Dockerfile, policy, and scripts.


docker build -t repo_name .
docker tag repo_name:latest aws_account_id.dkr.ecr.region.amazonaws.com/repo_name:latest

Finally, let’s push our image to our repo with the following command.

docker push aws_account_id.dkr.ecr.region.amazonaws.com/repo_name:latest 
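
Because start.sh falls back to the runtime interface emulator when it is not running on Lambda, the image can also be exercised locally. The following is a minimal sketch, assuming the image was tagged repo_name:latest as above; host port 9000 is an arbitrary choice, while container port 8080 and the invocation URL follow the emulator's documented defaults. The functions are only defined here, not called.

```shell
set -eu

# Image tag from the build step; replace with your own repo name.
IMAGE="repo_name:latest"
# Host port 9000 is arbitrary; the emulator listens on 8080 inside the container.
RIE_URL="http://localhost:9000/2015-03-31/functions/function/invocations"
# Event in the format that bootstrap.sh expects.
EVENT='{"x_opa_path": "/v0/data/hello", "x_opa_method": "POST", "x_opa_payload": {"lang": "en"}}'

# Start the container in the background with the emulator port published.
start_local() {
  docker run -d --rm -p 9000:8080 "$IMAGE"
}

# Send the event to the emulator, which hands it to bootstrap.sh.
invoke_local() {
  curl -s -X POST "$RIE_URL" -d "$EVENT"
}
```

Run start_local, give OPA a moment to start, and then run invoke_local to see the response from OPA.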

Create the Lambda function

After our image is in Amazon ECR, we can easily create our Lambda function and configure it to use the image by following these steps:

Create the function with the console:

  • Open the Functions page of the Lambda console.
  • Choose Create function.
  • Choose the Container image option.
  • Under Basic information, perform the following tasks:
    • For Function name, enter opa-serverless
    • For Container image URI, enter the URI of the Amazon ECR image that you created previously. You can also click Browse images to look for it and retrieve the URI.
  • Choose Create function.

Our Lambda function is now created, but there are a few more tweaks we need to make. Under Configuration for the function, we must set our max memory and timeout to appropriate values. OPA can use a lot of memory, and complex policies can take some time for a response. Let’s set our memory to 2048 MB and Timeout to 5 min 0 sec as shown in the following graphic.

Screenshot showing memory and timeout settings in Lambda console.
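
For pipelines, the console steps and settings above can also be expressed with the AWS CLI. This is a sketch rather than a required step: the account ID, region, and ROLE_ARN are placeholders, and the role name is hypothetical and stands for an execution role that already exists in your account. The create_opa_function function is defined but not called.

```shell
set -eu

# Placeholder values; replace with your own.
AWS_ACCOUNT_ID="123456789012"
AWS_REGION="us-east-1"
REPO_NAME="repo_name"
# Hypothetical execution role; it must already exist.
ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/opa-serverless-role"

# Build the ECR image URI in the same form used when pushing the image.
image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:latest\n' "$AWS_ACCOUNT_ID" "$AWS_REGION" "$REPO_NAME"
}

# Create the function from the container image with 2048 MB of memory
# and a 300-second (5 minute) timeout.
create_opa_function() {
  aws lambda create-function \
    --function-name opa-serverless \
    --package-type Image \
    --code "ImageUri=$(image_uri)" \
    --role "$ROLE_ARN" \
    --memory-size 2048 \
    --timeout 300
}

# Print the URI so it can be checked before creating anything.
image_uri
```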

Our function is now ready for testing.

Under the Test tab, let’s create and send the following test events:

Test event 1:


{
  "x_opa_path": "/v0/data/hello",
  "x_opa_method": "POST",
  "x_opa_payload": {
    "lang": "en"
  }
}

This should result in an OPA response of {"hello": "hello world"}.

Test event 2:

{
  "x_opa_path": "/v0/data/hello",
  "x_opa_method": "POST",
  "x_opa_payload": {
    "lang": "es"
  }
}

This should result in an OPA response of {"hello": "hola mundo"}.

If the test events work as intended, then our serverless OPA function is working!

Completing with API Gateway

When running OPA in server mode, the document path and HTTP method determine which policy to apply and which action to take against the input (request body). As you can see in the test events and our scripts, we specify the document path as x_opa_path, the HTTP method as x_opa_method, and the input as x_opa_payload, and our shell script uses these attributes to make the appropriate call to the OPA service. While this mechanism works, to create a seamless experience we need to expose a REST API endpoint that applications can call as they would a native OPA service.

We will use API Gateway’s {proxy+} feature and mapping templates to accept OPA API requests, translate them into our Lambda event format, and invoke our OPA function. API Gateway will then relay the response from Lambda back to the requestor.

To do this, we need to create a new REST API in API Gateway with ‘{proxy+}’ as our resource path and ANY as the HTTP verb. The ‘{proxy+}’ path will capture the full path and treat it as a single parameter. You can follow the tutorial in the API Gateway documentation for the steps to do this. When you get to step 7, be sure to select Lambda as the Integration type and point to the Lambda function we created.

After you create the API resource, we have two steps left. First, in the Method Execution settings for the resource, a parameter named "proxy" should have been created under Request Paths with caching enabled. We need to turn this off, as the OPA responses should not be cached. Uncheck Caching, as shown in the following graphic.

Screenshot showing Method Execution properties in API Gateway console.

Now we need to configure our Integration Request settings. By default, API Gateway sets up the integration request to use the Lambda proxy, which prevents us from using mapping templates. We need to uncheck the Use Lambda Proxy Integration box. After that is done, an option for Mapping Templates should appear below. Set Request body passthrough to Never and then add a mapping template for the "application/json" content type. Paste in the following as the mapping template:

{
  "x_opa_path": "$input.params('proxy')",
  "x_opa_method": "$context.httpMethod",
  "x_opa_payload": $input.body
}
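
To make the translation concrete, the sketch below mimics what the mapping template produces for a single request. The build_event function is a hypothetical helper written for this illustration and is not part of API Gateway; it assumes the proxy parameter carries the request path without a leading slash.

```shell
set -eu

# Mimic the mapping template: wrap the proxy path, HTTP method, and raw
# request body into the event format that bootstrap.sh reads.
build_event() {
  proxy_path=$1
  http_method=$2
  body=$3
  printf '{"x_opa_path": "%s", "x_opa_method": "%s", "x_opa_payload": %s}\n' \
    "$proxy_path" "$http_method" "$body"
}

# A POST to /v0/data/hello with a JSON body becomes this Lambda event.
build_event "v0/data/hello" "POST" '{"lang":"en"}'
```

Note that the request body is inserted unquoted, so it arrives in the event as a JSON object rather than as a string.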

We are now set and can deploy our API. Our API Gateway endpoint will now receive native OPA requests and send them to our Lambda OPA function in the right format and proxy the responses back.

We can test this out using curl as follows:


curl -X POST https://{API_GATEWAY_INVOKE_URL}/v0/data/hello \
 -H 'Content-Type: application/json' \
 -d '{"lang":"en"}'

Depending on the language we set in the request body, we should receive the matching greeting in the response.

Clean up

Follow the steps below to remove the resources we created on AWS as part of this blog post.

  • Delete our API in API Gateway by running the following AWS CLI Command. The API ID can be retrieved from the API Gateway console page if you do not have it.
    aws apigateway delete-rest-api --rest-api-id API_ID
  • Delete our Lambda function by running the following AWS CLI command.
    aws lambda delete-function --function-name opa-serverless
  • Delete the private repo we created in ECR by running the following AWS CLI command. The --force option deletes the repo and the images contained within it.
    aws ecr delete-repository --repository-name repo_name --force

Conclusion

We’ve reviewed how to deploy OPA and Rego policies as serverless Lambda functions with minimal effort. The scripts and Dockerfile created in this blog post can be reused and added to deployment pipelines to automate deployments of new policy. This should make it easier to deploy OPA policy that can be used by various services and functions.

Ajish Abraham

Ajish Abraham is a Senior Product Manager for AWS Config. Ajish is passionate about security and helping customers automate and modernize controls to reduce risk without increasing user friction.