Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC 31: Integration testing for CDK apps #378

Closed
wants to merge 16 commits into from
Closed

Conversation

nija-at
Copy link
Contributor

@nija-at nija-at commented Sep 3, 2021

This is a request for comments about Integration testing for CDK apps.
See #31 for additional details.

Readable version: https://github.com/aws/aws-cdk-rfcs/blob/nija-at/integ-tests/text/0031-integ-testing.md

APIs are signed off by @{BAR_RAISER}.


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

@nija-at nija-at self-assigned this Sep 3, 2021
Copy link
Contributor

@rix0rrr rix0rrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach. I'm a little concerned that expressivity might suffer, and I'm also a little concerned about the runtime performance of these tests.

Ideally you'd want to keep test cases separated, but as doing so will require a full CloudFormation cycle for each tests (~multiple minutes per test) people will NOT do that and combine as many assertions as they can into a single test to avoid the performance overhead. Leading to worse tests, imo...


### contest

🌩️ contest tests can verify that your CDK app can be deployed successfully on AWS with the desired resources.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we also taking care of cleanup? I think that'd be a selling point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'll add.


## Working Backwards - README.md

### contest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haaaaaaaaaaaah!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of the name... I'd just call this cdk test/@aws-cdk/testing


* `assertAws`: execute AWS service APIs and assert the response.
* `invokeLambda`: invoke an AWS lambda function and assert the response.
* `curlApi`: Execute the 'curl' command on an API endpoint. Used typically to test API Gateway resources.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too low level? Maybe just httpRequest ?

text/0031-integ-testing.md Outdated Show resolved Hide resolved
text/0031-integ-testing.md Show resolved Hide resolved
text/0031-integ-testing.md Outdated Show resolved Hide resolved
Comment on lines 65 to 66
repoTest.assertAws.
s3.putObject({ bucket: stack.artifacts, key: '...', body: '...' });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like test setup, which makes the "assert" part of this statement read a bit awkwardly. I'm wondering if it should be more like test.aws.module.action(...) in general and then test.aws.module.action(...).assertReturns and test.aws.module.action(...).assertThrows for assertions. Of course the test should fail if any call fails

On the other hand, this may lead users to attempt to modify the return values of the SDK calls in the test file, which is not correct since the test file is executed at synthesis time, not at runtime. I think this will be the most confusing part of the paradigm you are describing here.

Maybe there is some way to make this look less like a normal assertions library (using object construction instead of function calls). The increased friction could actually be a good thing here since it would give users pause

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like test setup, which makes the "assert" part of this statement read a bit awkwardly.

Hmm yes, you make a good point. Maybe just dropping the "assert" phrasing in the API will make this cleaner. I'll take a stab at this in the next rev.

this may lead users to attempt to modify the return values of the SDK calls in the test file
I think this will be the most confusing part of the paradigm you are describing here.

Good observation. I came to the similar realization after I published this. There's further work needed here.

@nija-at
Copy link
Contributor Author

nija-at commented Sep 8, 2021

@rix0rrr

I'm a little concerned that expressivity might suffer

Can you expand?

also a little concerned about the runtime performance of these tests.

My approach was going to involve creating snapshots (app + assertions) for all tests by default and not "running" the test if the snapshots matched. Would this sufficiently address your concern?

Copy link

@skorfmann skorfmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this topic would be hugely beneficial beyond just AWS CDK. In CDK for Terraform were having similar thoughts and even beyond that, I think that whomever is provisioning AWS resources would benefit from having a strong testing support library which could be plugged into whatever testing framework people might be using.

It would be fantastic to have a polyglot testing support library, which abstracts away common use cases when interacting with AWS resources in test scenarios, like (off the top of my head, there's certainly more):

  • triggering a Lambda function
  • publishing / receiving messages (SQS, EventBridge)
  • invoking Cloudfront Functions and fetching results
  • Trigger StepFunction workflows / analyzing its state
  • Dynamically override (stub) Lambda Function (e.g. in StepFunctions or API Gateway)
  • AppSync resolvers
  • Easily access logs and traces

I'd leave the actual assertions to the respective testing framework, let them run the tests locally by default. There could still be an glue layer on top, which integrates this neatly into AWS CDK, CDK for Terraform, etc...

Two questions:

  • What is the motivation to run the tests as Lambda functions deployed via a Cloudformation Stack?
  • Do you see away to extract the test helpers into an independent library to make these available beyond AWS CDK?

This would imply a simpler and a more familiar experience of writing tests.

However, to achieve this effectively, the test cases will need to deploy the app it defines.
We do not have a way to deploy CDK applications programmatically.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deploying programmatically seems to be a low hanging fruit and it's something people are asking for as well #300 (there are similar requests in cdktf)

I actually built a jest test suite which deploys multiple AWS CDK test stack the other day - granted, it's all Typescript but the experience is quite good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we did get programmatic deploy-ability, this could work out.

As far as the AWS CDK goes, this will mean taking the 'deploy' logic in the AWS CDK CLI and converting it into an independent jsii module (so as to make this polyglot). I would not consider this low hanging fruit, but an rfc of its own.
However, this does not mean it's not worth pursuing.

@rix0rrr - given you know the most about the AWS CDK CLI, what do you think of this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not an easy change to do this in a jsii-compatible way.

Of course, we could hack it and shell out to the CLI. But if we're going to do this, I'd be very interested in clearly defining and controlling the user experience. Having every test suite run wait for CFN deployments is not going to be great.

@nija-at
Copy link
Contributor Author

nija-at commented Sep 8, 2021

@skorfmann - thanks for your review.

What is the motivation to run the tests as Lambda functions deployed via a Cloudformation Stack?

A few things -

It allows for users, typically enterprise users, to test "private" resources, such as a lambda function in a private vpc.

The biggest problem with this type of testing is the time of test execution which is going to be really long. The mitigation to this is snapshot testing (this section is due in the rfc and I hope to have it in the next rev). If the snapshot includes the app and the test, then we run the test only if the app and the test have not changed.

It also allows for predictable test execution environments, which allows for stable assertion execution. Let's say, an assertion depends on the sed command. If this is run on the Mac, it will run BSD sed and on unix, this will be GNU sed, which have subtly different behaviours.
With a lambda function, the execution environment and how to bring additional dependencies is well known.

Finally, a nice side effect is that we can use existing CLIs to deploy and run these tests, without any additional runtime logic.

This is a great question. I'll need to feed this back into the RFC.

Do you see away to extract the test helpers into an independent library to make these available beyond AWS CDK?

What do you mean by test helpers?

text/0031-integ-testing.md Outdated Show resolved Hide resolved

## Working Backwards - README.md

### contest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of the name... I'd just call this cdk test/@aws-cdk/testing

text/0031-integ-testing.md Outdated Show resolved Hide resolved
text/0031-integ-testing.md Outdated Show resolved Hide resolved
@nija-at
Copy link
Contributor Author

nija-at commented Sep 9, 2021

I think this topic would be hugely beneficial beyond just AWS CDK. In CDK for Terraform were having similar thoughts and even beyond that, I think that whomever is provisioning AWS resources would benefit from having a strong testing support library which could be plugged into whatever testing framework people might be using.

@skorfmann - agreed that this would be hugely beneficial.

I'm currently looking to find the right solution that works for one CDK, specifically the AWS CDK. Once we arrive at a happy solution, I intend to see what parts of the solution can be expanded to benefit other CDKs.

test.invokeLambda(...).throws({ ... });
new AwsAssertion(test, 'GetObject', {
request: AwsAssertionCall.S3.getObject({ ... }),
returns: { Data: ... },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to support the use case of allowing chained calls (call 1: StartExecution, call 2: GetExecutionStatus(executionId = ...)), can we use the Capture model?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not focus too deeply on this. We can get into a discussion about the exact API modelling when we get to implement (and implementation design of) this.
We can get as innovative on this as needed over time or even build service specific Assertion classes that do chaining, etc.

The idea here is to capture the overall direction in terms of API, to see there are no major gaps, or innovation blockers.

I would like to focus this RFC on areas around the designs of the test execution framework, assertion model, interaction between CLI and the test frameworks, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure – this was I guess a poorly phrased way of saying: I would like to see first class support for call chaining in this design. I don't think it's explicitly mentioned but I hope it will be, since I think it's integral to the use of the framework.

Copy link
Contributor Author

@nija-at nija-at Sep 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I see what you mean.

This feels like the feature of the aws-assertions "module". If we look at the aws assertions as its own piece separate from the core testing system, we need to build some form of chaining (something like Capture perhaps as you previously alluded to) into the aws assertions part.

It feels to me that this is part of the 'aws assertions' part, and not something first class in the 'core assertion' part of the system. But I'm also having a hard time judging that at this level - need to get my hands dirty to see how this will pan out.

Does that make sense?
Or do you see a piece that we need to add to the 'core assertion' part of the system that will enable this for aws assertions?

I do agree chaining is important for the aws assertions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't yet see the dividing line between "core" and "aws" assertions – I'm sure this is in the works as the RFC takes shape! However, it seems to me that regardless of the [infrastructure manager (AWS/Azure/...)/call provider (AWS SDK/HTTP/...)], call chaining would be very useful. Not being able to track "state" in the assertions seems to really restrict what kind of tests the library can support. But, again, this is based just off my assumptions about core vs. aws. Excited to see this progress!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed this in the latest change.

@skorfmann
Copy link

Do you see away to extract the test helpers into an independent library to make these available beyond AWS CDK?

What do you mean by test helpers?

Was aiming at what @eladb mentioned here #378 (comment)

@skorfmann
Copy link

I like where this is going 👍 What are your thoughts on traceability and logs for tests and their assertions?

I've been playing around with jest and xray integration. My goal there was to get visibility into the resources and their potential logs, in the context of the tests they are running in. I still have to streamline this a bit, but then gonna publish a full code example.

The rough idea goes like this:

// the jest test function wrapped into an xray starting segment 
// A setup block deployed a testing stack already, that's where the 'output' is coming from
  xray.test("render index file", async () => {
    const post = await fs.readFile(path.join(__dirname, 'fixtures', 'simple.md'), 'utf8');
    const index = await fs.readFile(path.join(__dirname, 'fixtures', 'index.md'), 'utf8');
    const bucketArn = output.renderOlap

   // the AWS SDK clients are instrumented with xray
    await putObject(bucketArn, `posts/simple.md`, post, 'text/markdown');
    await putObject(bucketArn, `index.md`, index, 'text/markdown');
    const result = await getObject(bucketArn, `index.md`);
    expect(result.toString()).toMatchSnapshot();
  }, 20_000);

Which allows to see the test execution for each single test as a full xray segment, including downstream calls of the tested infrastructure components.

Screenshot 2021-09-22 at 11 17 48

And more importantly, it's possible to collect all (cloudwatch) logs of the individual testing xray segments - here's an excerpt from my logging output (there's a delay of roughly 5 - 10 seconds for the logs to show up after the tests finished)

Screenshot 2021-09-22 at 11 18 37

I know that this is not that easy to pull off across all testing frameworks / languages, but perhaps it might be possible to keep this in mind and enable users to achieve something similar when they want to.

This setup will enable defining a separate script that runs the apptest suite - npm script,
Ant target, Maven phase, etc. - by using `*.apptest.*` as the file filter in your test runner.

The test will be executed when the `run()` method is called on the `apptest.Test` class.
Copy link
Contributor Author

@nija-at nija-at Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jsii (currently) does not support throwing error messages across languages.

Best to return a result that can be then asserted on. It may also be the best outcome to get good test reports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by that? You can throw an exception in JavaScript and it will propagate to the host language...

Comment on lines 270 to 271
This feature is beneficial to users who own a CDK app and want to run assertions on the deployed
app in their pre-production stages as part of a CI/CD pipeline.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these tests are not deployed as part of the main application? If not, then they cannot trigger a rollback if any of the tests fail, which seems like a nice feature.

pre-production stages

Why not allow tests against production? Just curious :)

We are currently using step function and/or lambdas to run integration tests. Since they are part of the stack, they can roll it back if anything fails and we can run it against any deployment stage. Maybe this RFC isn't meant to address this use case?

This RFC is very interesting! It'll be helpful no matter what. Just trying to grasp how it is run and when (EG: how would this run in CodePipeline?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not allow tests against production? Just curious :)

You are right. Take a look at the latest revision. I've included this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think I caught your updates, looks like it is possible to include these in your main stack in order to trigger rollbacks, thanks! 👍

For when it isn't part of your main stack, what's the expected workflow in a pipeline like CodePipeline?

  • CloudFormation Action to deploy the main application that we want to test
  • CodeBuild that runs CDK deploy/destroy or Another CloudFormation Action to deploy... and maybe a CloudFormation Action to destroy? (or maybe just keep it? No real cost if all serverless tech)

Think my concern about CodeBuild is if the deployment isn't in the same account, we have to do some role assumes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the expected workflow in a pipeline like CodePipeline?

You would run the tests as you would any other tests in your pipeline, which is probably CodeBuild.

A future addendum to this RFC will address this when using CDK Pipelines and make this trivial. I've taken this out of scope to keep the RFC contained.

If you use your own CI/CD mechanism, you will need to set up cross-account assume role, yes.

@nija-at
Copy link
Contributor Author

nija-at commented Sep 29, 2021

@skorfmann -

What are your thoughts on traceability and logs for tests and their assertions?

This should be possible with the apptest framework being proposed here. To get into the details on this, this is an entire topic we can go very deep on, so I've left it out.

If we build the right abstraction for an 'assertion', we can build assertions based on logs and, perhaps even, metrics.

import { ArtifactRepo } from './records';
import { AwsAssertionCall, AwsAssertion, Test } from '@aws-cdk/apptest';

const repo = new ArtifactRepo(app, 'RequestRecord');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is app coming from?

returns: { Messages: [...] }
});

const testResult = repoTest.run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird... What does run() do in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It executes the test, i.e., deploys the stack with the construct and assertions, checks for its status, collects results and reports back.


const testResult = repoTest.run();
if (testResult.failures > 0) {
// notify the test runner to fail
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call jest.fail() for jest test or junit.fail() in junit test, etc.


## Working Backwards - README.md

### `@aws-cdk/apptest`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the semantics "app test". This is not just for testing full applications... It can also test parts of apps (constructs). Something is still not accurate about the terminology.

Let's brainstorm on the name synchronously. It needs to be good.

Comment on lines +84 to +86
* `AwsAssertion`: execute AWS service APIs and assert the response.
* `InvokeLambda`: invoke an AWS Lambda Function and assert its response.
* `HttpRequest`: Execute an HTTP request against an API endpoint and assert the response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming patterns are not consistent here (some use verbs, some use nouns). Also seems like all of them are "assertions", so either they all should have an Assertion suffix or none.

Here are some options:

  • AwsApiAssertion, LambdaAssertion, HttpAssertion
  • AwsApiRequest, LambdaInvoke, HttpRequest
  • AssertAwsApi, AssertLambda, AssertHttp

etc...

import { App } from 'cdk-deploy';

const app = new App('/path/to/cdk/app');
app.deploy();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you plan to implement this synchronously?

```ts
import { App } from 'cdk-deploy';

const app = new App('/path/to/cdk/app');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a path to a cloud assembly or to a directory with cdk.json? Kind of feels like it should accept cdk.App instance instead, no?


### Credentials

This module uses the AWS SDK for Javascript under the hood to communicate with CloudFormation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This module uses the AWS SDK for Javascript under the hood to communicate with CloudFormation.
This module uses the AWS SDK for Javascript under the hood to communicate with AWS services such as S3 and CloudFormation.

The AWS CDK CLI (aka toolkit) is responsible for deploying and destroying AWS CDK apps.
We will extract these two parts from the CLI into a separate jsii module.

The `run()` method on the class `apptest.Test` will depend on this module to deploy and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like this is a bit of a rabbit hole not directly needed to support integration tests. Why can't integration tests be simple CDK apps (like cdk-integ works) and then we can just use the CDK CLI (or a tool that wraps the CDK CLI) to deploy them?

Users will now need to learn how to organize apptest tests, learn how to execute it and read
results. We are not leveraging existing "well known" test frameworks.

Although this looks basic today, as the scope of apptest expands and more features are added,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the desire to reuse execution&reporting from existing frameworks but I am not sure this makes total sense because users won't be able to use any assertion functions and the programming experience will likely be confusion.

As an exercise, can you provide a more complete example/prototype of a jest-based test suite that uses this? I am curious how this will look?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the desire to reuse execution&reporting from existing frameworks but I am not sure this makes total sense because users won't be able to use any assertion functions and the programming experience will likely be confusion.

Why wouldn't it make sense to use native testing assertions, outside of the scope of the AWS CDK assertions?

});

new AwsAssertion(repoTest, 'MessageReceived', {
request: AwsAssertionCall.Sqs.receiveMessage({ queue: stack.publishNotifs }),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this assertion issue the first request? How does it know when to stop long polling? (Does it send exactly one request?)

How might I handle a situation where the message takes 5 minutes to arrive?


new AwsAssertion(repoTest, 'MessageReceived', {
request: AwsAssertionCall.Sqs.receiveMessage({ queue: stack.publishNotifs }),
returns: { Messages: [...] }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we supporting other than exact matches in returns? If the messages contain id numbers, I may not know those in advance, so I might want a language to match patterns.

Here's something like what we get in expect for jest:

new AwsAssertion(repoTest, 'MessageReceived', {
  request: AwsAssertionCall.Sqs.receiveMessage({ queue: stack.publishNotifs }),
  returns: Equals.objectContaining({
    Messages: Equals.arrayContaining([
      Equals.objectContaining({
        foo: 'bar'
      }),
    ]),
  }),
});

I'd find this useful for the HttpInvoke as well - on some of my websites, there's smoke if the front page doesn't contain a certain keyword. The test for that might look like:

new HttpInvoke(repoTest, 'MissingKeyword', {
  endpoint: 'https://example.com',
  responseBody: Equals.stringContaining('mykeyword'),
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely support different configuration for these assertions. However, we're most likely going to start with some basic assertions and simple configurations and then build on as needed.

The model is also going to allow for users to bring their own assertions, so if something is not available in the assertion shipped by the CDK, you will be able to bring your own, as well as, publish your own construct libraries with assertions.

* `InvokeLambda`: invoke an AWS Lambda Function and assert its response.
* `HttpRequest`: Execute an HTTP request against an API endpoint and assert the response.
Used typically to test API Gateway resources.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about these assertions:

AwsAlarmStatusAssertion: a CloudWatch alarm does (or does not) transition to 'ALARM' in a given timeframe.
AwsStepFunctionsAssertion: execute a Step Functions state machine - when it's done, check for a successful execution status.

(The latter could enable the former - here's an example from a similar library)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants