AWS Database Blog

Monitor Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL performance using PGSnapper

Collecting database performance metrics is critical for troubleshooting performance issues. AWS offers several services to gather and analyze database performance metrics, including Amazon Relational Database Service (Amazon RDS) Enhanced Monitoring, Amazon CloudWatch, and Amazon RDS Performance Insights. While these services provide detailed insights into how your database is performing, sometimes you need a different analysis method to troubleshoot a specific database performance problem. Additionally, many database administrators prefer command line tools over a graphical user interface (GUI) for analyzing database performance quickly and repeatedly.

Periodic metrics analysis is a technique in which you review the same set of metrics collected at consistent intervals over a period of time. This technique is helpful for comparing the behavior of a database across intervals to see whether there are opportunities to improve or maintain performance as the workload changes.

PGSnapper is an open-source tool developed by AWS Database Partner Solutions Architects that can assist with periodic collection (snapping) of Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition performance-related statistics and metrics. In this post, we review how PGSnapper works and its use cases. We then walk through how to set up PGSnapper using AWS CloudFormation, load PGSnapper-generated output to a PostgreSQL database, and analyze the data using provided sample SQL queries to derive insights.

Solution overview

The following diagram shows the architecture that you set up in this walkthrough.

PGSnapper is written in Python. It uses a JSON config file that you can customize to include the database dictionary views and queries whose output you want to capture periodically.

PGSnapper runs on an Amazon Elastic Compute Cloud (Amazon EC2) instance with access to the PostgreSQL instance to be monitored. It retrieves database credentials stored in AWS Secrets Manager. To minimize impact on the database, each PGSnapper run exports the output of the monitored queries to a separate CSV-formatted file on the EC2 instance. When you’re ready to analyze the output, you run a one-time packaging step that collects additional database stats and generates the Data Definition Language (DDL) commands for creating the staging tables used for import. You can load the output generated by PGSnapper into any PostgreSQL database (for example, a dev instance) using the Loader script included in the tool and perform analysis using the provided sample queries. You can also write your own custom SQL queries to analyze the output however you want.

The PGSnapper setup also creates an Amazon Simple Storage Service (Amazon S3) staging bucket that you can use for storing and sharing the PGSnapper-generated output with other teams.

PGSnapper use cases

After PGSnapper is set up and scheduled to run periodically, you can use it for several different use cases, such as the following:

  • Analyze database performance following a load test
  • Dig deeper into database metrics to find the top resource-intensive queries if your database has a performance issue during a particular hour of the day
  • Collect and share database performance metrics for your production database with another team, such as your performance engineering team, without giving them direct access to production

We look at some examples of these use cases later in this post.

Prerequisites

Complete the following prerequisite steps before setting up PGSnapper. Note that some of the DB parameters updated in this section are static and require an instance reboot to take effect.

  1. Make sure you have a PostgreSQL instance to use for this walkthrough. If not, you can create an RDS for PostgreSQL instance or Aurora PostgreSQL-Compatible cluster.
  2. If your PostgreSQL instance is hosted in a private subnet, ensure that an internet gateway is attached to the VPC and a NAT gateway has been created in a public subnet. This allows the PGSnapper EC2 instance, deployed in the private subnet, to download the necessary packages from the internet during the bootstrapping process. You can follow the AWS documentation to deploy a NAT gateway.
  3. When you create a new RDS for PostgreSQL DB instance or Aurora PostgreSQL DB cluster, it comes with default parameter groups, which can’t be updated. For RDS for PostgreSQL, create a custom DB parameter group and associate it with the RDS instance. For Aurora PostgreSQL, create a custom cluster parameter group along with a custom DB parameter group. Associate the cluster parameter group with the Aurora cluster and the DB parameter group with the primary DB instance and the Aurora replicas.
  4. Modify the shared_preload_libraries DB parameter and add the pg_stat_statements extension if it isn’t already listed. You can set this in the DB parameter group for RDS for PostgreSQL and the cluster parameter group for Aurora PostgreSQL.
  5. Modify the track_functions parameter and set it to all to track procedural language, SQL, and C language functions. You can set this in the DB parameter group for RDS for PostgreSQL and the cluster parameter group for Aurora PostgreSQL.
  6. Set the track_activity_query_size parameter to 102400 to capture the full text of very long SQL statements. You can set this in the DB parameter group for RDS for PostgreSQL and the cluster parameter group for Aurora PostgreSQL.
  7. Verify and save the parameter updates.

The following image shows the parameters that you modify in these prerequisite steps.
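
If you manage parameter groups from the command line, a sketch like the following applies the same settings with the AWS CLI. It assumes a custom DB parameter group named my-postgres-params (a hypothetical name); for Aurora, apply shared_preload_libraries to the cluster parameter group with modify-db-cluster-parameter-group instead, and append to any existing shared_preload_libraries value rather than replacing it:

aws rds modify-db-parameter-group \
    --db-parameter-group-name my-postgres-params \
    --parameters "ParameterName=shared_preload_libraries,ParameterValue=pg_stat_statements,ApplyMethod=pending-reboot" \
                 "ParameterName=track_functions,ParameterValue=all,ApplyMethod=immediate" \
                 "ParameterName=track_activity_query_size,ParameterValue=102400,ApplyMethod=pending-reboot"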

  8. Reboot the database instance. The shared_preload_libraries and track_activity_query_size parameters are static and require an instance reboot to take effect.
  9. Create a database user that PGSnapper can use to connect to the PostgreSQL instance for collecting database metrics. You can use psql or any PostgreSQL GUI client such as pgAdmin to run the following command after updating the placeholders in <>. For this post, we use psql to create a database user named pgsnapper as a member of the pg_monitor role, after logging in as the master user. Enter the master user password when prompted.
/usr/local/pgsql/bin/psql --host=<RDS for PostgreSQL Endpoint / Aurora PostgreSQL cluster Endpoint> --port=<PostgreSQL Instance port e.g., 5432> --username=<RDS for PostgreSQL / Aurora PostgreSQL cluster master username> --dbname=postgres

postgres=> create user pgsnapper password '<password>' in role pg_monitor;
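
Optionally, you can confirm the new user and its pg_monitor membership from the same session before quitting:

postgres=> \du pgsnapper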

Set up PGSnapper using AWS CloudFormation

The provided CloudFormation template completes the following setup steps in your AWS account:

  • Stores the database user password in a Secrets Manager secret, which PGSnapper uses to connect to the PostgreSQL instance
  • Creates an EC2 instance with the latest Amazon Linux 2 AMI and deploys it in the same VPC and subnet as the PostgreSQL database instance
  • Bootstraps the EC2 instance by installing the AWS Systems Manager agent, the PostgreSQL client, and the required Python packages, and by staging the PGSnapper scripts
  • Creates an S3 bucket for storing and sharing PGSnapper-generated output
  • Adds the security group for the EC2 instance to the security group assigned to the PostgreSQL instance for inbound network access
  1. Choose Launch Stack to deploy the CloudFormation stack in your AWS account in the Region where the PostgreSQL instance to be monitored is running.

  2. Enter the stack parameters as shown in the following list.
  • VPCID: The VPC ID of the PostgreSQL database instance to be monitored (for example, vpc-0343606e).
  • SubnetID: The VPC subnet ID of the PostgreSQL database instance to be monitored (for example, subnet-a0246dcd).
  • DBSecurityGroupID: The security group ID of the PostgreSQL database instance to be monitored (for example, sg-8c14mg64).
  • InstanceType: The PGSnapper EC2 instance type. Leave the default value.
  • EBSVolSize: The PGSnapper EC2 instance EBS volume size in GiB. Leave the default value.
  • DBUsername: The database username for the PostgreSQL instance to be monitored. For this post, specify the pgsnapper user that you created in the prerequisites section.
  • DBUserPassword: The database user password for the PostgreSQL instance to be monitored. For this post, enter the pgsnapper user password that you specified in the prerequisites section.
  • DBPort: The port for the PostgreSQL instance to be monitored.
  3. Select the acknowledgement check box and choose Create stack. Stack creation takes about 7 minutes to complete.
  4. When it’s complete, choose the Outputs tab of the stack and note the resources that you need to configure PGSnapper. The following image shows the Outputs tab, where you can find the key resources and their corresponding values.

  5. On the Amazon EC2 console, select the PGSnapper instance by referring to the value of the PGSnapperEC2InstID CloudFormation output key and choose Connect. On the Session Manager tab, choose Connect again.
  6. Because Session Manager connects to an EC2 instance as a user named ssm-user by default, switch to ec2-user by running the following command:
    sudo su -l ec2-user
  7. Load the pg_stat_statements extension into the PostgreSQL database that stores the application-related objects and needs to be monitored. For this post, you use psql, which is already installed on the PGSnapper EC2 instance, to log in as the master user and create a new database named testapp. Once the database is created, load the extension into it. Enter the master user password when prompted.
    /usr/local/pgsql/bin/psql --host=<RDS for PostgreSQL Endpoint / Aurora PostgreSQL cluster Endpoint> --port=<PostgreSQL Instance port e.g., 5432> --username=<RDS for PostgreSQL / Aurora PostgreSQL cluster master username> --dbname=postgres
    
    postgres=> create database testapp;
    postgres=> \c testapp
    testapp=> create extension pg_stat_statements;
    testapp=> \q
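
If you want to confirm the extension was created, you can list it from the testapp session before quitting:

    testapp=> \dx pg_stat_statements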

  8. Review PGSnapper usage by running the following command:
/home/ec2-user/scripts/pg_perf_stat_snapper.py -h

  9. Save the PostgreSQL instance-related settings in environment variables for easy retrieval:
export PGHOST=<RDS for PostgreSQL Endpoint / Aurora PostgreSQL cluster Endpoint>
export PGPORT=<PostgreSQL Instance port e.g., 5432>
export PGDATABASE=<Database name where application objects are stored. For this post, specify testapp database created earlier>
export PGUSER=<RDS for PostgreSQL / Aurora PostgreSQL cluster database username. For this post, specify pgsnapper user created earlier>
export PGUSER_ADMIN=<RDS for PostgreSQL / Aurora PostgreSQL cluster master username>
export PGSnapperSecretARN=<AWS Secrets Manager ARN. Refer CloudFormation Output Key: PGSnapperSecretARN>
export AWSRegion=<AWS region where you installed the CloudFormation stack e.g., us-west-2>
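
These variables last only for your current shell session. If you expect to reconnect through Session Manager later, you can optionally append them to ~/.bash_profile so they’re restored on login (run this after setting the variables above):

{ echo "export PGHOST=$PGHOST"
  echo "export PGPORT=$PGPORT"
  echo "export PGDATABASE=$PGDATABASE"
  echo "export PGUSER=$PGUSER"
  echo "export PGUSER_ADMIN=$PGUSER_ADMIN"
  echo "export PGSnapperSecretARN=$PGSnapperSecretARN"
  echo "export AWSRegion=$AWSRegion"
} >> ~/.bash_profile
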
  10. Run PGSnapper manually one time using the following command and review the log file generated under the /home/ec2-user/scripts/log/$PGHOST/$PGDATABASE/ directory. By default, all the output is stored under the /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ directory. If you specify the output directory using the -o argument, the path needs to be absolute, for example /home/ec2-user/mysnapperoutput.
/home/ec2-user/scripts/pg_perf_stat_snapper.py -e $PGHOST -P $PGPORT -d $PGDATABASE -u $PGUSER -s $PGSnapperSecretARN -m snap -r $AWSRegion

cat /home/ec2-user/scripts/log/$PGHOST/$PGDATABASE/pg_perf_stat_snapper.log
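
You can also confirm that snap files were written by listing the output directory (each run exports one CSV per monitored query; the exact file names depend on the queries configured in the JSON config file):

ls -l /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ | head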

  11. Set up a crontab schedule on the EC2 instance to run PGSnapper every minute.
crontab -e

Press i to enter insert mode, then copy and paste the following after updating the placeholders in <>. Press Esc followed by :wq! to save and exit.

*/1 * * * * /home/ec2-user/scripts/pg_perf_stat_snapper.py -e <RDS for PostgreSQL Endpoint / Aurora PostgreSQL cluster Endpoint> -P <PostgreSQL Instance port e.g., 5432> -d <Database name where application objects are stored. For this post, specify testapp database created earlier> -u <RDS for PostgreSQL / Aurora PostgreSQL cluster database username. For this post, specify pgsnapper user created earlier> -s <AWS Secrets Manager ARN. Refer CloudFormation Output Key: PGSnapperSecretARN> -m snap -r <AWS region where you installed the CloudFormation stack e.g., us-west-2> &>/dev/null
  12. Verify the crontab entry by running the following command:
crontab -l
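
To confirm that the scheduled runs are working, you can count the files in the output directory twice, about a minute apart; the second count should be higher:

ls /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ | wc -l
sleep 70
ls /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ | wc -l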

Load test

At this point, PGSnapper is set up and scheduled to run every minute. Next, you use pgbench, which is already installed on the PGSnapper EC2 instance, to generate some load on the testapp database that you created earlier.

  1. Run the following command to initialize the pgbench load test. Enter the master user password when prompted.
/usr/local/pgsql/bin/pgbench -i --host=$PGHOST --username=$PGUSER_ADMIN --port=$PGPORT testapp

  2. Run the following command to generate the pgbench load. Enter the master user password when prompted. This step takes about 10 minutes to complete.
/usr/local/pgsql/bin/pgbench --host=$PGHOST --username=$PGUSER_ADMIN --port=$PGPORT --client=10 --jobs=2 --transactions=10000 testapp

  3. When the load test is complete, edit the crontab and comment out the scheduled PGSnapper runs by adding # in front of the job. Press Esc followed by :wq! to save and exit.
crontab -e
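
After you add the #, the entry should look like the line you created earlier, prefixed with # (placeholders abbreviated here):

#*/1 * * * * /home/ec2-user/scripts/pg_perf_stat_snapper.py -e <...> -P <...> -d <...> -u <...> -s <...> -m snap -r <...> &>/dev/null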

Package the output

Now you are ready to package the PGSnapper-generated output files and import them into a PostgreSQL database for analysis. The packaging step exports the output of additional database dictionary views and queries mentioned in the config file that only need to be snapped once. It also generates a DDL command file named all_ddls.sql, which the Loader script described in the next section uses to create the staging tables and import the output generated by PGSnapper.

Package the PGSnapper output by running the following command:

/home/ec2-user/scripts/pg_perf_stat_snapper.py -e $PGHOST -P $PGPORT -d $PGDATABASE -u $PGUSER -s $PGSnapperSecretARN -m package -r $AWSRegion

Verify that packaging was successful by viewing the PGSnapper log file:

tail -30 /home/ec2-user/scripts/log/$PGHOST/$PGDATABASE/pg_perf_stat_snapper.log
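
If you want to share the packaged output with another team without giving them database access (one of the use cases mentioned earlier), you can copy it to the S3 staging bucket created by the stack. The following is a sketch, assuming the bucket name from the PGSnapperS3Bucket CloudFormation output and that the instance profile allows writes to that bucket; the pgsnapper-output/ prefix is an arbitrary choice:

aws s3 sync /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ s3://<PGSnapperS3Bucket value>/pgsnapper-output/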

PostgreSQL performance stats Loader script

Now you are ready to load PGSnapper-generated output into any PostgreSQL database to perform your analysis. As a best practice for production workloads, use a separate PostgreSQL instance to load PGSnapper-generated output. For instructions on configuring another instance, refer to Setup in the GitHub repo.

For this post, you use the same PostgreSQL instance that you used earlier to generate the PGSnapper output.

  1. Create a new database to load PGSnapper-generated output. For this post, you create a new database named testappsnaps by logging in to the PostgreSQL instance as the master user. Enter the master user password when prompted.
/usr/local/pgsql/bin/psql --host=$PGHOST --port=$PGPORT --username=$PGUSER_ADMIN --dbname=postgres

postgres=> create database testappsnaps;
postgres=> \q
  2. Import the PGSnapper-generated output by running the following command. For the -o argument, provide the absolute path of the directory under which all the PGSnapper-generated CSV output files, including the all_ddls.sql file, are stored. By default, this path is /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/.
/home/ec2-user/scripts/pg_perf_stat_loader.py -e $PGHOST -P $PGPORT -d <Database name where PGSnapper-generated output will be imported. For this post, specify testappsnaps database created earlier> -u $PGUSER -s $PGSnapperSecretARN -o <Staged PGSnapper output directory. For this post, specify /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/> -r $AWSRegion
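
For this walkthrough, with the placeholders filled in (testappsnaps as the target database and the default output path), the command looks like this:

/home/ec2-user/scripts/pg_perf_stat_loader.py -e $PGHOST -P $PGPORT -d testappsnaps -u $PGUSER -s $PGSnapperSecretARN -o /home/ec2-user/scripts/output/$PGHOST/$PGDATABASE/ -r $AWSRegion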

  3. Verify that the data loading was successful by viewing the loader script log file /home/ec2-user/scripts/log/pg_perf_stat_loader.log.
cat /home/ec2-user/scripts/log/pg_perf_stat_loader.log
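
You can also confirm that the staging tables were created by listing the tables in testappsnaps (the table names come from the generated all_ddls.sql):

/usr/local/pgsql/bin/psql --host=$PGHOST --port=$PGPORT --username=$PGUSER --dbname=testappsnaps -c '\dt'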

Download sample queries for PGSnapper-generated output analysis

Sample queries for analyzing PGSnapper-generated output are available in GitHub. Download the SQL files to a machine where psql is installed, and which has access to the PostgreSQL instance where you imported the PGSnapper-generated output. For this post, you can use the PGSnapper EC2 instance.

Download the sample SQL files by running the following commands:

cd /home/ec2-user/scripts

svn checkout "https://github.com/aws-samples/aurora-and-database-migration-labs/trunk/Code/PGPerfStatsSnapper/SQLs"

Run SQL queries for PGSnapper-generated output analysis

PGSnapper includes sample queries for various PostgreSQL instance and database-related stats, session stats, SQL stats, table stats, and index stats. For this post, you run a subset of these queries to view database and schema sizes, database-related statistics, and the top 20 SQLs by elapsed time.

  1. Log in to the testappsnaps database where you loaded the PGSnapper-generated output. Enter the pgsnapper database user password when prompted.
cd /home/ec2-user/scripts/SQLs

/usr/local/pgsql/bin/psql --host=$PGHOST --port=$PGPORT --username=$PGUSER --dbname=testappsnaps

  2. Run snappermenu.sql to find all the sample queries. The following screenshot shows only a subset of all the queries.
\i snappermenu.sql

  3. List all the snapshots that PGSnapper collected, along with their snap start and end times, by running the following SQL file:
=> \i list_snaps.sql

  4. Set the begin and end snapshot IDs for analysis before running any of the analysis queries:
=> \i set_snaps.sql

  5. Run the following SQL file to see database and schema sizes:
=> \i db_and_schema_sizes.sql

  6. Run the following SQL file to see database-related stats:
=> \i db_stats.sql

  7. Run the following SQL file to list the top 20 queries by elapsed time between the snapshot IDs you set earlier. Depending on your database version, snappermenu.sql shows the appropriate version (top_20_sqls_by_elapsed_time_v2.sql or top_20_sqls_by_elapsed_time.sql) of the SQL file.
=> \i top_20_sqls_by_elapsed_time_v2.sql

  8. Run the following SQL file and enter the query ID of interest to see detailed stats of that query for each snapshot. Depending on your database version, snappermenu.sql shows the appropriate version (sql_stat_history_v2.sql or sql_stat_history.sql) of the SQL file.
=> \i sql_stat_history_v2.sql

Clean up

To uninstall PGSnapper and delete related AWS resources, complete the following steps:

  1. If you previously saved any data in the S3 bucket created by the PGSnapper CloudFormation stack (CloudFormation output key: PGSnapperS3Bucket), go to the Amazon S3 console and empty that S3 bucket.
  2. On the AWS CloudFormation console, delete the PGSnapper CloudFormation stack.

Summary

In this post, we covered how you can set up and use PGSnapper to periodically collect PostgreSQL database metrics and run customized queries to identify performance bottlenecks. Because you can load the PGSnapper-generated output into any PostgreSQL database, analysis doesn’t impact your production database, and metrics collection itself has very limited impact.

We encourage you to explore the other sample SQL files included with PGSnapper, beyond what we covered in this post. If you’re running a load test, benchmarking a database, adding a new application service, scaling up your TPS or connection count, or looking to identify resource-intensive queries during a particular time window, consider using PGSnapper.

If you have comments or questions, leave a comment. To submit bug fixes or request enhancements, open a pull request or an issue in the PGSnapper GitHub repository.


About the Authors

Arabinda Pani, a principal partner solutions architect specializing in databases at AWS, brings years of experience from the field in managing, migrating, and innovating databases and data platforms at scale. In his role, Arabinda works with AWS Partners to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.

Abhinav Sarin is a senior partner solutions architect at AWS. His core interests include databases, data analytics, and machine learning. He works with AWS customers and partners to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.