AWS Open Source Blog

Compliance auditing with Steampipe and SQL

This post was contributed by David Boeke, Bob Tordella, Jon Udell, and Nathan Wallace.

Steampipe is an open source tool under the AGPLv3 license for querying cloud APIs in a universal way and reasoning about the data in SQL. In our first post we explored the AWS plugin which maps the suite of AWS APIs to (currently) 274 Postgres tables. You can use these to query individual AWS services, join across them (spanning regions and accounts), and join AWS data with queries against many 3rd-party APIs.

In this post we’ll explore how Steampipe’s Compliance “mod” uses that query infrastructure to check for compliance with regulatory frameworks. It provides 448 controls and 233 named queries that deliver comprehensive support for these standards:

But first, let’s review some Steampipe basics.

Running a query

Here’s a typical query you can run in the Steampipe console, after installing the AWS plugin (steampipe install plugin aws).

select
 title,
 create_date,
 mfa_enabled
from
 aws_iam_user
+-----------------+---------------------+-------------+
| title           | create_date         | mfa_enabled |
+-----------------+---------------------+-------------+
| mary_major      | 2005-03-24 21:30:00 | false       |
| carlos_salazar  | 2005-03-24 21:30:00 | true        |
| richard_roe     | 2005-03-24 21:30:00 | false       |
| mateo_jackson   | 2005-03-24 21:30:00 | false       |
| john_stiles     | 2005-03-24 21:30:00 | true        |
+-----------------+---------------------+-------------+

As we discussed last time, Steampipe parallelizes the query in three ways.

  • Across sub-APIs. In this case a primary API call gets title and create_date, and a secondary API call gets mfa_enabled. The two calls happen in parallel.
  • Across regions. When your account is configured for multiple regions, Steampipe queries across them in parallel.
  • Across accounts. You can bundle a set of AWS accounts and query across those in parallel.

All these mechanisms come into play when you install and use the Compliance mod.

Running a suite of benchmarks

To get started, clone the repo and navigate to its local directory.

git clone https://github.com/turbot/steampipe-mod-aws-compliance.git

cd steampipe-mod-aws-compliance

That directory is a Steampipe workspace. It contains resources – named queries, controls, benchmarks – that Steampipe discovers and then uses to check compliance with frameworks.

To check compliance with CIS 1.4, run this command.

steampipe check aws_compliance.benchmark.cis_v140
Steampipe control output in terminal

To send the output to HTML:

steampipe check aws_compliance.benchmark.cis_v140 --export=cis_v140.html

Alternatively you can use --export=cis_v140.md for Markdown.

Here’s the top-level summary in the HTML report.

Steampipe control output as HTML

Finally, you can use --export=cis_v140.csv or --export=cis_v140.json to capture the results as raw data.

Anatomy of a Steampipe compliance benchmark

The steampipe-mod-aws-compliance repo contains a subdirectory for each framework: cs_v140, foundational_security, etc. Each of these contains a set of .sp (Steampipe) files that define benchmarks and controls. A benchmark defines a thematic set of controls. Each control runs a query that checks for a condition like, for example, MFA is enabled for the ‘root’ user account”.

In the cis_v140 directory, the file cis.sp enumerates benchmarks that are defined in the files section_1.sp, section_2.sp, etc.

Each of these files in turn defines a set of controls, like so.

benchmark "cis_v140_1" {
  title = "1 Identity and Access Management"
  childern = [
    control.cs_140_1_1,
    ...
    control.cs_140_1_21
 ]
}

Here are the key elements of one of those controls.

control "cis_v140_1_5" {
  title  = "1.5 Ensure MFA is enabled for the 'root' user account"
  sql    = query.iam_root_user_mfa_enabled.sql
  ...
}

Controls can include SQL queries directly, but often (as in this case) they refer to named queries that live in the query subdirectory of the mod’s repo. The Compliance mod defines 233 of these named queries, organized in subdirectories by AWS service. Here’s query/iam/iam_root_user_mfa_enabled.sql.

select
 -- Required Columns
 'arn:' || partition || ':::' || account_id as resource,
 case
 when account_mfa_enabled then 'ok'
 else 'alarm'
 end status,
 case
 when account_mfa_enabled then 'MFA enabled for root account.'
 else 'MFA not enabled for root account.'
 end reason,
 -- Additional Dimensions
 account_id
from
 aws_iam_account_summary;

You can run this query directly in the Steampipe CLI. Type query. and then autocomplete like so.

Steampipe named query completion

The pattern is a simple one: query some aspect of AWS infrastructure, check for a specific condition, and report ok or alarm. Queries that conform to this pattern are essentially tests, and the test runner is the command steampipe check.

Often these checks apply across compliance frameworks. For example, iam_root_user_mfa_enabled is used not only by cis_v140 but also by audit_manager_control_tower, gdpr, hipaa, nist_800, and nist_csf.

Compliance checks at scale

The Compliance mod comprises 11 frameworks. Each of these defines multiple benchmarks, which in turn define multiple controls that refer to named queries. The numbers keep growing but as of this writing there are 448 controls and 233 named queries. You can run all of them like so.

steampipe check all

Or run a single benchmark.

steampipe check benchmark.nist_csf_pr_ac_5   

Or run an individual control.

steampipe check control.cloudtrail_trail_logs_encrypted_with_kms_cmk

When running one or more benchmarks (and thus many controls) an additional level of parallelism comes into play. Steampipe will run up to 5 controls in parallel. When controls need the same query results, as is typical, the first control to run a query caches the results for others that follow.

Writing your own control

In Find secrets everywhere we showed how the code_secret table provided by the code plugin can search any column of any table for secrets.

When you write a control, you can leverage all your installed Steampipe plugins. Here’s a control that joins the aws plugin’s aws_ec2_instance table with the code plugin’s code_secret table to look for secrets in ec2 user data.

control "no_secrets_in_ec2_user_data" {
 title = "ec2 user data contains no secrets"
 sql = <<EOT
 -- capture all arns to anchor the left join
 with instances as (
 select
 arn,
 title
 from
 aws_ec2_instance
 ),
 -- find zero or more arns with secrets
 secrets as (
 select
 i.arn,
 count(*)
 from
 code_secret cs,
 aws_ec2_instance i
 where
 cs.src = i.user_data
 group by
 i.arn
 )
 -- report arns with and without secrets
 select
 i.arn as resource,
 case 
 when ( select ( select count(*) from secrets s where s.arn = i.arn ) = 0 ) then 'ok'
 else 'alarm'
 end as status,
 case
 when ( select ( select count(*) from secrets s where s.arn = i.arn ) = 0 )
 then arn || ': no secrets found in user data'
 else arn || ': secrets found in user data'
 end as reason,
 i.title
 from
 instances i
 left join
 secrets s
 using
 (arn)

 EOT
}

If you put that code into a .sp file in your workspace, you can run the test like so.

steampipe check control.no_secrets_in_ec2_user_data --export no_secrets.html

Here’s the HTML report.

HTML report results from Steampipe

Conclusion

As operators of AWS infrastructure we have to comply with security and governance frameworks. Steampipe’s AWS plugin provides the data to support compliance checks. The Compliance mod defines hundreds of controls that use that data to check compliance with all the major frameworks.

You can query the AWS plugin with SQL, run named SQL queries from the command line, and embed SQL in controls. It’s SQL all the way down because we think that’s the best way to enable everyone to query cloud APIs and build compliance controls. Which compliance benchmarks are most important to you? Drop by our Slack community to collaborate with others on AWS security and compliance controls.

Resources

David Boeke

David Boeke

David Boeke is the CTO of Turbot. David has 25+ years of experience in IT and is recognized as a transformational leader who has enabled some of the world’s largest enterprise organizations to make the transition to public cloud. Prior to joining Turbot, David was the Global Head of Enterprise Architecture and led the cloud transformation for a Fortune 50 life sciences company.

Bob Tordella

Bob Tordella

Bob Tordella is the CRO of Turbot. He is recognized as a cloud governance leader who has enabled the world’s largest enterprise organizations to secure and optimize their public cloud environments. Bob is currently improving the way teams operate in the public cloud to discover and auto-resolve incidents using Turbot. He is also an advocate for Steampipe, an open source project that simplifies querying your cloud with SQL.

Jon Udell

Jon Udell

Jon Udell is the community lead for Steampipe.

Nathan Wallace

Nathan Wallace

Nathan Wallace is the Founder and CEO of Turbot. He is recognized as a transformational leader who has enabled some of the world’s largest enterprise organizations to make the transition to public cloud. Prior to starting Turbot, Nathan tackled these challenges head on as the Global Director of Cloud and DevOps for a Fortune 50 multinational pharmaceutical company.

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Tom Callaway

Tom Callaway

Tom is a Principal Open Source Evangelist for AWS. He has been a part of the open source community since 1997, when he skipped his last day of junior high to go to Linux Expo. During college, he worked for a high-availability startup to cover tuition, and when they crashed along with the majority of the IT sector, he dropped out of college and went to work for Red Hat full-time. He worked for Red Hat for almost twenty years, in Support, Sales Engineering, Release Engineering, Engineering Management, University Outreach (CTO’s office), and Employment Brand. He’s an active contributor to Fedora and helped to write the Fedora Packaging and Legal Guidelines which are still in use today. He’s spoken at a number of conferences and events including SxSW, OSCON, Open Source Summit, and Red Hat Summit. He has one patent on a crazy idea that never got implemented in the real world, and is co-author of Raspberry Pi Hacks (2013, O’Reilly). When he’s not working, he finds enjoyment in 3D printing, pinball, games (board & video), geocaching, craft beer, B-movies, science fiction, trivia, traveling, and his wife and two boys. He lives in Cary, NC. Tom is also known as “spot” by many people in the open source universe, he’s gone by that nickname since the 1st grade, and he happily answers to it. Follow him on Twitter @spotfoss.