AWS Open Source Blog

Set up cross-region metrics collection for Amazon Managed Service for Prometheus workspaces

Amazon Managed Service for Prometheus is a Prometheus-compatible monitoring service for container infrastructure and application metrics that makes it easy for customers to securely monitor container environments at scale.

In a previous getting started blog post, we showed how to set up an Amazon Managed Service for Prometheus workspace and ingest metrics from an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. AWS customers use more than one AWS Region in their architecture for a variety of reasons, and it is normal for customers to collect metrics from different AWS Regions and ingest them into one Amazon Managed Service for Prometheus workspace. In this article, we will show how to set up this architecture.

Architecture design

Architecture diagram for use cases where customers use more than one AWS Region.

Setup instructions

We are using many of the steps mentioned in the Getting Started with Amazon Managed Service for Prometheus article; refer to it when necessary.

We use three different AWS Regions in our example setup. We use AWS Region US-EAST-1 as Region X (where we create an Amazon EKS cluster), US-WEST-2 as Region Y (where we create an Amazon Managed Service for Prometheus workspace), and EU-WEST-1 as Region Z (where we create an Amazon Managed Grafana workspace).

Set up environment variables

Let’s set up the environment variables with necessary values.

CLUSTER_NAME=my-xregion-eks
AMP_WORKSPACE_NAME=my-xregion-prom
AWS_REGION_X=us-east-1
AWS_REGION_Y=us-west-2
AWS_REGION_Z=eu-west-1

Steps involved in the setup:

  • Create an Amazon Managed Service for Prometheus workspace in Region Y.
  • Set up an Amazon Virtual Private Cloud (Amazon VPC) endpoint in Region Y.
  • Create an Amazon EKS cluster in Region X.
  • Set up an Amazon VPC peering connection between the VPCs in Region X and Region Y.
  • Configure Amazon Route 53 so that requests to the Amazon Managed Service for Prometheus workspace are routed through the VPC endpoint.
  • Deploy the Prometheus server on the Amazon EKS cluster and configure remote write to the Amazon Managed Service for Prometheus ingestion endpoint.
  • Create an Amazon Managed Grafana workspace in Region Z and query metrics from the Amazon Managed Service for Prometheus workspace in Region Y.

Create an Amazon Managed Service for Prometheus workspace in Region Y

We can use the following command to create an Amazon Managed Service for Prometheus workspace:

aws amp create-workspace --alias ${AMP_WORKSPACE_NAME} --region ${AWS_REGION_Y}

Then, wait for a few seconds and execute the following command to check the status of the workspace created:

aws amp list-workspaces --region ${AWS_REGION_Y}

You should get output similar to the one below. Ensure that the status is ACTIVE, indicating that the workspace was created successfully.

{
    "workspaces": [
        {
            "alias": "my-xregion-prom",
            "arn": "arn:aws:aps:us-west-2:1234567890:workspace/ws-9876ww00-xx87-4f26-94c7-94237e12a4e9",
            "createdAt": "2021-01-12T13:01:18.309000-06:00",
            "status": {
                "statusCode": "ACTIVE"
            },
            "workspaceId": "ws-9876ww00-xx87-4f26-94c7-94237e12a4e9"
        }
    ]
}
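With the workspace ACTIVE, the remote write ingestion endpoint can be composed from the Region and the workspace ID reported in the output. A small sketch, using the sample (placeholder) workspace ID from the output above:

```shell
# Compose the AMP remote write endpoint from the Region and workspace ID.
# The workspace ID below is the placeholder from the sample output above.
AWS_REGION_Y=us-west-2
WORKSPACE_ID=ws-9876ww00-xx87-4f26-94c7-94237e12a4e9
REMOTE_WRITE_URL="https://aps-workspaces.${AWS_REGION_Y}.amazonaws.com/workspaces/${WORKSPACE_ID}/api/v1/remote_write"
echo "${REMOTE_WRITE_URL}"
```

This is the URL the Prometheus server's remote write configuration will target later in the walkthrough.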

Alternatively, you can create a workspace using the AWS console by simply providing the workspace name and selecting Create as shown in the following image.

Amazon Managed Service for Prometheus homepage.

Set up a VPC endpoint in Region Y

  • Go to Region Y and navigate to the VPC endpoint page. Then choose Create Endpoint.
  • Select AWS services in the Create Endpoint screen.
  • Fill in the Service Name text box with com.amazonaws.<Region Y>.aps-workspaces (for example, com.amazonaws.us-west-2.aps-workspaces), and select the resulting service as shown in the following screenshot.

Screenshot of page where you create endpoints.

  • Select a VPC that you want to use for this purpose, select the subnets and the default security group, and choose Create endpoint.
  • Now, we have a VPC endpoint created that we can use to make calls to the Amazon Managed Service for Prometheus service from the VPC.
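The console steps above can also be sketched with the AWS CLI. In this hedged sketch, the VPC, subnet, and security group IDs are placeholders, and the call is gated behind an opt-in variable so nothing runs against an account by accident:

```shell
AWS_REGION_Y=us-west-2
SERVICE_NAME="com.amazonaws.${AWS_REGION_Y}.aps-workspaces"

# Placeholder IDs -- substitute the VPC, subnets, and security group you chose.
# Set RUN_AWS_EXAMPLES=1 to actually execute against your account.
if [ -n "${RUN_AWS_EXAMPLES:-}" ]; then
  aws ec2 create-vpc-endpoint \
    --vpc-endpoint-type Interface \
    --vpc-id vpc-0123456789abcdef0 \
    --service-name "${SERVICE_NAME}" \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0 \
    --region "${AWS_REGION_Y}"
fi
echo "${SERVICE_NAME}"
```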

Create an Amazon EKS cluster in Region X

Now we create an Amazon EKS cluster in Region X. The easiest way to create a cluster on EKS is to use eksctl. Once you have eksctl installed on your local machine, you can execute the following command to create the cluster:

eksctl create cluster --name ${CLUSTER_NAME} --region ${AWS_REGION_X}

Once the cluster is ready, we deploy the Prometheus server on the cluster. Before that, however, we need to set up the required permissions so that the Prometheus server can write into an Amazon Managed Service for Prometheus workspace.

The following shell script can be used to execute these actions on the my-xregion-eks Amazon EKS cluster:

  1. Create an AWS Identity and Access Management (IAM) role with an IAM policy that has permissions to remote-write into an Amazon Managed Service for Prometheus workspace.
  2. Create a Kubernetes service account that is annotated with the IAM role.
  3. Create a trust relationship between the IAM role and the OpenID Connect (OIDC) provider hosted in your Amazon EKS cluster.
#!/bin/bash -e

SERVICE_ACCOUNT_AMP_NAMESPACE=prometheus
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
OIDC_PROVIDER=$(aws eks describe-cluster --name ${CLUSTER_NAME} --region ${AWS_REGION_X} --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
SERVICE_ACCOUNT_AMP_INGEST_NAME=amp-iamproxy-service-account
SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE=amp-iamproxy-ingest-role
SERVICE_ACCOUNT_IAM_AMP_INGEST_POLICY=AMPIngestPolicy
#
# Set up a trust policy that allows a specific Kubernetes service account and namespace to assume the role via the OIDC IdP hosted by the cluster.
#
cat <<EOF > TrustPolicy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${SERVICE_ACCOUNT_AMP_NAMESPACE}:${SERVICE_ACCOUNT_AMP_INGEST_NAME}"
        }
      }
    }
  ]
}
EOF

function getRoleArn() {
  OUTPUT=$(aws iam get-role --role-name $1 --query 'Role.Arn' --output text 2>&1)
  # Check for an expected exception
  if [[ $? -eq 0 ]]; then
    echo $OUTPUT
  elif [[ -n $(grep "NoSuchEntity" <<< $OUTPUT) ]]; then
    echo ""
  else
    >&2 echo $OUTPUT
    return 1
  fi
}
#
# Create the IAM Role for ingest with the above trust policy
#
SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(getRoleArn ${SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE})
if [ "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN" = "" ]; 
then
  #
  # Create the IAM role for service account
  #
  SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN=$(aws iam create-role \
  --role-name ${SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE} \
  --assume-role-policy-document file://TrustPolicy.json \
  --query "Role.Arn" --output text)
  
  #
  # Attach the required IAM policies to the IAM role created above
  #
  SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN=arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess
  
  aws iam attach-role-policy \
  --role-name ${SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE} \
  --policy-arn ${SERVICE_ACCOUNT_IAM_AMP_INGEST_ARN} 
else
    echo "$SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN IAM role for ingest already exists"
fi
echo ${SERVICE_ACCOUNT_IAM_AMP_INGEST_ROLE_ARN}
#
# EKS cluster hosts an OIDC provider with a public discovery endpoint.
# Associate this IdP with AWS IAM so that the latter can validate and accept the OIDC tokens issued by Kubernetes to service accounts.
# Doing this with eksctl is the easiest approach.
#
eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER_NAME} --region ${AWS_REGION_X} --approve

Set up a VPC peering connection between the VPCs in Region X and Region Y

We need to set up a VPC peering connection between the two VPCs across regions so that calls to the VPC endpoint from Region X can reach Region Y.

  • Navigate to the Create Peering Connection screen on the VPC console in Region X (the requester).
  • In the VPC (Requester) drop-down, select the VPC of the EKS cluster created earlier.
  • Under the Select another VPC to peer with section, select My Account, select Another Region, and then select Region Y in the drop-down menu.
  • In the VPC ID (Accepter) text box, enter the VPC ID of the VPC in Region Y.
  • Your resulting screen should look similar to the following screenshot:

Screenshot of the page titled "Create Peering Connection".

  • Now choose Create Peering Connection.
  • Your peering connection will now go to the Pending Acceptance status. Although the requester VPC has made the request to connect to another VPC, the connection is only created when the VPC on the other end accepts the request.
  • Now, navigate to the VPC Peering Connection screen on Region Y and select the Peering request that is in Pending Acceptance status and accept using the Actions drop-down. This will change the status to Active.
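The same request-and-accept flow can be sketched with the AWS CLI. The VPC and peering connection IDs below are placeholders, and the calls are gated behind an opt-in variable:

```shell
AWS_REGION_X=us-east-1
AWS_REGION_Y=us-west-2

# Placeholder VPC IDs for the requester (Region X) and accepter (Region Y).
REQUESTER_VPC=vpc-0aaa1111bbbb22222
ACCEPTER_VPC=vpc-0ccc3333dddd44444

# Set RUN_AWS_EXAMPLES=1 to actually execute against your account.
if [ -n "${RUN_AWS_EXAMPLES:-}" ]; then
  # Request the peering connection from Region X...
  aws ec2 create-vpc-peering-connection \
    --vpc-id "${REQUESTER_VPC}" \
    --peer-vpc-id "${ACCEPTER_VPC}" \
    --peer-region "${AWS_REGION_Y}" \
    --region "${AWS_REGION_X}"
  # ...then accept it in Region Y (the pcx- ID is a placeholder for the
  # ID returned by the create call above).
  aws ec2 accept-vpc-peering-connection \
    --vpc-peering-connection-id pcx-0123456789abcdef0 \
    --region "${AWS_REGION_Y}"
fi
echo "peer-region=${AWS_REGION_Y}"
```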

Configure route table on the VPC to connect to the peering connection

  • Go to the VPC console on Region X (where your EKS cluster is) and select the Public Route Table that is associated to the VPC.
  • Under the Routes tab, choose Edit routes.
  • Enter the Region Y VPC CIDR range in the Destination text box and select the newly created peering connection as the Target.
  • Choose Save routes. The configuration should look similar to the following screenshot:

Screenshot of results from selecting "Save routes".

Configure route table on the receiving VPC (on Region Y) to connect to the peering connection

  • Go to the VPC console on Region Y and select the Public Route Table that is associated to the VPC.
  • Under the Routes tab, choose Edit routes.
  • Enter the Region X VPC CIDR range in the Destination text box and select the newly created peering connection as the Target.
  • Choose Save routes. The configuration should look similar to the following screenshot:

Screenshot of the configuration once the routes are edited.
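Both route table edits can be sketched with the AWS CLI as well. The route table IDs, CIDR ranges, and peering connection ID below are all placeholders for the values from your own VPCs, and the calls are gated behind an opt-in variable:

```shell
# Placeholder route table IDs, CIDR ranges, and peering connection ID.
PEERING_ID=pcx-0123456789abcdef0
REGION_X_RT=rtb-0aaa1111bbbb22222   # route table of the EKS VPC in Region X
REGION_Y_RT=rtb-0ccc3333dddd44444   # route table of the workspace VPC in Region Y
REGION_X_CIDR=192.168.0.0/16
REGION_Y_CIDR=10.0.0.0/16

# Set RUN_AWS_EXAMPLES=1 to actually execute against your account.
if [ -n "${RUN_AWS_EXAMPLES:-}" ]; then
  # Region X route table: send Region Y's CIDR through the peering connection.
  aws ec2 create-route --route-table-id "${REGION_X_RT}" \
    --destination-cidr-block "${REGION_Y_CIDR}" \
    --vpc-peering-connection-id "${PEERING_ID}" --region us-east-1
  # Region Y route table: send Region X's CIDR back the same way.
  aws ec2 create-route --route-table-id "${REGION_Y_RT}" \
    --destination-cidr-block "${REGION_X_CIDR}" \
    --vpc-peering-connection-id "${PEERING_ID}" --region us-west-2
fi
echo "routes via ${PEERING_ID}"
```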

Set up the security group in Region Y to allow requests from resources in the VPC in Region X

To allow traffic from Region X to be accepted into Region Y, add an inbound rule to the VPC endpoint's security group that allows the VPC CIDR range of the EKS cluster in Region X. Once added, your security group inbound rules should look like the following screenshot:

Screenshot of the security group Inbound rules.

Configure Route 53 to resolve requests to Amazon Managed Service for Prometheus workspace to be routed through the VPC endpoint

  • Go to the Route 53 console and choose Create hosted zone.
  • In the Domain name field, enter the Amazon Managed Service for Prometheus endpoint domain for Region Y (for example, aps-workspaces.us-west-2.amazonaws.com) so that requests to the workspace resolve through this zone.
  • Select Private hosted zone and associate it with the EKS cluster's VPC in Region X.

Screenshot of the page to create a Private hosted zone.

  • Choose Create hosted zone.
  • Now we need to create an A record to route the traffic to the VPC endpoint created earlier.
  • Inside the newly created hosted zone, choose Create record.
  • In the Quick create record screen, choose Switch to wizard.
  • In the Choose routing policy screen, select Simple routing and choose Next.
  • In the Configure records screen, select Define simple record.
  • In the new screen, leave the Record name field as it is.
  • Select Alias to VPC endpoint in the Value/Route traffic to drop-down.
  • Select Region Y where you created the VPC endpoint earlier.
  • Now, select the first VPC Endpoint alias from the lookup that appears.
  • Leave the Record type drop-down as it is and select Define simple record.
  • Once created, your Hosted zone should look like the following screenshot:

Screenshot of the example Hosted zone.
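To verify that the private hosted zone is working, you can run a one-off DNS lookup from inside the EKS cluster; the endpoint should resolve to private IP addresses from the Region Y VPC rather than public ones. A hedged sketch, gated behind an opt-in variable since it needs kubectl and cluster access:

```shell
AMP_ENDPOINT=aps-workspaces.us-west-2.amazonaws.com

# Set RUN_K8S_EXAMPLES=1 to run the in-cluster lookup; the throwaway
# busybox pod is deleted automatically when the command exits.
if [ -n "${RUN_K8S_EXAMPLES:-}" ]; then
  kubectl run dns-check --image=busybox --restart=Never --rm -it -- \
    nslookup "${AMP_ENDPOINT}"
fi
echo "${AMP_ENDPOINT}"
```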

Deploy Prometheus server

We will be using Helm to install the Prometheus server on the cluster. The following commands create a new namespace called prometheus, add the required Helm repositories, and update them; we then deploy Prometheus using the Helm chart prometheus-community/prometheus.

kubectl create ns prometheus
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics
helm repo update

Next, we create a file called amp_ingest_override_values.yaml by running the following:

cat > amp_ingest_override_values.yaml << EOF

## The following is a set of default values for prometheus server helm chart which enable remoteWrite to AMP
## For the rest of prometheus helm chart values see: https://github.com/prometheus-community/helm-charts/blob/main/charts/prometheus/values.yaml
##
serviceAccounts:
  server:
    name: "amp-iamproxy-service-account"

  ## Disable the alertmanager service account
  ##
  alertmanager:
    create: false

  ## Disable the pushgateway service account
  ##
  pushgateway:
    create: false

server:
  remoteWrite:
    -
      queue_config:
        max_samples_per_send: 1000
        max_shards: 200
        capacity: 2500

  ## Use a statefulset instead of a deployment for resiliency
  ##
  statefulSet:
    enabled: true

  ## Store blocks locally for short time period only
  ##
  retention: 1h
  
## Disable alert manager
##
alertmanager:
  enabled: false

## Disable pushgateway
##
pushgateway:
  enabled: false

EOF 

Execute the following commands to configure the Prometheus server's remoteWrite endpoint and SigV4 signing, and to annotate its service account with the ingest IAM role created earlier:

IAM_PROXY_PROMETHEUS_ROLE_ARN=$(aws iam get-role --role-name amp-iamproxy-ingest-role | jq .Role.Arn -r)
WORKSPACE_ID=$(aws amp list-workspaces --alias ${AMP_WORKSPACE_NAME} --region ${AWS_REGION_Y} | jq .workspaces[0].workspaceId -r)
helm install amp-prometheus-chart prometheus-community/prometheus -n prometheus -f ./amp_ingest_override_values.yaml \
--set serviceAccounts.server.annotations."eks\.amazonaws\.com/role-arn"="${IAM_PROXY_PROMETHEUS_ROLE_ARN}" \
--set server.remoteWrite[0].url="https://aps-workspaces.${AWS_REGION_Y}.amazonaws.com/workspaces/${WORKSPACE_ID}/api/v1/remote_write" \
--set server.remoteWrite[0].sigv4.region=${AWS_REGION_Y}
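After the Helm release is installed, it is worth confirming that the Prometheus server pod is running and that its logs show no remote write errors. A quick check, gated behind an opt-in variable (the pod name is a placeholder for the one listed by kubectl):

```shell
SERVER_POD=prometheus-server-pod-name   # placeholder: copy the name from 'kubectl get pods'

# Set RUN_K8S_EXAMPLES=1 to run these checks against your cluster.
if [ -n "${RUN_K8S_EXAMPLES:-}" ]; then
  kubectl get pods -n prometheus
  # Repeated remote write 4xx/5xx lines here usually point to an IAM,
  # DNS, or peering misconfiguration in the earlier steps.
  kubectl logs -n prometheus "${SERVER_POD}" --tail=20
fi
echo "server pod: ${SERVER_POD}"
```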

Create Amazon Managed Grafana workspace in Region Z and query metrics from Amazon Managed Service for Prometheus workspace in Region Y

  • Set up an Amazon Managed Grafana workspace by following the instructions from the blog post Amazon Managed Grafana – Getting Started from the AWS Management & Governance Blog.
  • Once you’re logged into the Amazon Managed Grafana console, add the Amazon Managed Service for Prometheus data source by selecting AWS services under the AWS section on the left navigation bar.
  • Select Prometheus under the AWS services tab.
  • In the Data sources tab, select your AWS Region (Region Y) where the Amazon Managed Service for Prometheus workspace is.
  • The Amazon Managed Service for Prometheus workspace will automatically appear under the drop-down. Select the check box and choose Add 1 data source to add the Amazon Managed Service for Prometheus data source.

Screenshot of the data sources.

  • Now choose Explore from the left navigation bar and enter the following query into the text box: apiserver_current_inflight_requests
  • You will see a screen similar to the one in the following screenshot, which shows that we are able to successfully query metrics from the EKS cluster through the Amazon Managed Service for Prometheus workspace:

Screenshot of the output when you run the query.
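Outside of Grafana, you can also spot-check the workspace from the command line with awscurl, a curl wrapper that signs requests with SigV4. This sketch assumes awscurl is installed and your environment has query permissions on the workspace; it is gated behind an opt-in variable, and the workspace ID is the placeholder from the earlier sample output:

```shell
AWS_REGION_Y=us-west-2
WORKSPACE_ID=ws-9876ww00-xx87-4f26-94c7-94237e12a4e9   # placeholder from the sample output
QUERY_URL="https://aps-workspaces.${AWS_REGION_Y}.amazonaws.com/workspaces/${WORKSPACE_ID}/api/v1/query"

# Set RUN_AWS_EXAMPLES=1 to execute the signed query against your workspace.
if [ -n "${RUN_AWS_EXAMPLES:-}" ]; then
  awscurl --service aps --region "${AWS_REGION_Y}" \
    "${QUERY_URL}?query=apiserver_current_inflight_requests"
fi
echo "${QUERY_URL}"
```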

Conclusion

In this article, we walked through the steps to securely ingest Prometheus metrics into an Amazon Managed Service for Prometheus workspace from an Amazon EKS cluster and also query the metrics from an Amazon Managed Grafana workspace, all deployed on different AWS Regions.

Although we used the Prometheus server to ingest metrics into Amazon Managed Service for Prometheus, we can alternatively use the newly launched lightweight Grafana Cloud Agent for this purpose. Check out the GitHub repo for further details. We can also use the AWS Distro for OpenTelemetry Remote Write Exporter to send application metrics to Amazon Managed Service for Prometheus. Learn more about the topic in the documentation.

Imaya Kumar Jagannathan

Imaya Kumar Jagannathan is a Principal Solution Architect focused on AWS Observability services including Amazon CloudWatch, AWS X-Ray, Amazon Managed Service for Prometheus, Amazon Managed Grafana and AWS Distro for Open Telemetry. He is passionate about monitoring and observability and has a strong application development and architecture background. He likes working on distributed systems and is excited to talk about microservice architecture design. He loves programming in C#, working with containers and serverless technologies. LinkedIn: /imaya.