AWS HPC Blog

Custom AMIs with ParallelCluster 3

As a sysadmin, you may spend hours, days, or even weeks getting your workflows and simulations to run. Most of that time goes into installing the required tools and configuring them with just the right set of parameters to get successful outcomes.

An Amazon Machine Image (AMI) is a great way to snapshot and reuse some of these time-consuming and detailed efforts. An AMI can be reused later on your chosen EC2 instances to get your fleet up and running quickly, so your workflows and simulations run in a consistent way each time.

Our recent launch of ParallelCluster 3 included a lot of new functionality, including tools for building on your existing AMIs when creating your clusters (or for creating a custom AMI from scratch if you’re just starting out).

Today we want to walk you through this new AMI creation and management process in ParallelCluster 3, which is built using EC2 Image Builder.

AWS ParallelCluster, meet EC2 Image Builder

The new AMI build process lets you specify a set of build components that are layered on top of ParallelCluster-provided AMIs (or your own AMI) to create pipelines for building those custom AMIs.

This is important, because now you can create your own build components, or use shared components created by others, inside or outside your organization. You can then reuse, modify, or mix these components, to suit different workloads or to create new AMIs compatible with different releases of ParallelCluster.

By creating these in pipelines, you’re freeing yourself from a lot of future heavy lifting when small things change.

ParallelCluster 3 can now help you to manage the whole lifecycle of these custom AMIs. You can list and describe previously-created custom AMIs, or even delete them when they’re not needed anymore. This is useful for just cataloging what you have, and keeping track of all the AMIs you create.

AMIs created through ParallelCluster are also automatically tagged to give you another dimension for organizing them. You can retrieve the build configuration that was used to create a specific custom AMI and know the version of software components baked into the AMI itself. This will save you time when you want to refer back to this information later, for example when you’re planning a new AMI that reuses components from older ones.

Outline

Today we’ll show you how to build a custom ParallelCluster 3 AMI using this new process.

We’ll first show you how to create a ParallelCluster 3 custom AMI from a running EC2 instance or existing AMI which has all your software tools and applications already installed.

Next, we’ll show you how to create your own build component to be used as a layer on top of ParallelCluster’s own AMI software stack.

Figure 1 – Build paths to get to ParallelCluster custom image.

Building from an existing Amazon EC2 instance or AMI

You may have already installed and configured your applications on a running Amazon EC2 instance, and now you want to scale-out using AWS ParallelCluster. It’s possible to install the ParallelCluster software stack on top of your own tooling, but this requires you to start from an existing AMI. If the running EC2 instance has modifications that are not present in the original AMI where it came from, you’ll want to create a new AMI from it.

There are two ways to create an AMI from a running EC2 instance, using the AWS CLI or the AWS management console. Before you do this, know that EC2 will attempt to shut down the instance before creating the image. This is a good thing, since ParallelCluster uses your image from a cold start when it scales to a cluster. If there’s a risk that rebooting is going to result in a non-working instance, be sure to figure out how to fix those errors prior to creating a base image for installing ParallelCluster on top of it. Also, warn anyone using the instance before you forcibly log them out.

To do this in the CLI, use the create-image command:

aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "My AWSome HPC node" \
    --description "An AMI for my HPC software"

You can get the instance-id from the AWS management console or by using the aws ec2 describe-instances command. You should see an ImageId in the output:

{
    "ImageId": "ami-01d7dcccb80665a0f"
}

If you want to use the AWS management console, refer to the AWS documentation for creating an EC2 image.

Both methods will provide you an ImageId that you can use as the parent image for the ParallelCluster custom image creation process.
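As a minimal sketch of the CLI path, the function below wraps the create-image call so that it captures the new ImageId and waits until the AMI is usable. The function name create_base_ami is ours, for illustration only; it relies on the standard AWS CLI --query and --output options and on the image-available waiter.

```shell
# Hypothetical helper (not part of any AWS tool): create an AMI from an
# instance and print its ImageId once the image is available.
# Usage: create_base_ami <instance-id> <name> <description>
create_base_ami() {
    local instance_id="$1" name="$2" description="$3"
    local image_id

    # --query/--output text make the CLI print only the ImageId value.
    image_id=$(aws ec2 create-image \
        --instance-id "$instance_id" \
        --name "$name" \
        --description "$description" \
        --query 'ImageId' \
        --output text) || return 1

    # Block until the AMI has left the "pending" state.
    aws ec2 wait image-available --image-ids "$image_id" || return 1

    echo "$image_id"
}
```

You could then call it as create_base_ami i-1234567890abcdef0 "My AWSome HPC node" "An AMI for my HPC software" and store the printed ID for the build configuration.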

Starting from an AMI

If you already have an AMI that has your applications and tools installed and you’d like to use that as a starting point, then you just need to fetch its ImageId. For example, you may want to build a ParallelCluster AMI starting from the Deep Learning AMI which already has TensorFlow and its dependencies installed (and it’s maintained by the Deep Learning team at AWS).
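If you know the AMI's name but not its ID, one possible way to look it up is with aws ec2 describe-images. The sketch below is an assumption on our part rather than a step from the official workflow: the helper name latest_ami_id is hypothetical, and the name pattern you pass in must match how the AMI you want is actually named in your Region.

```shell
# Hypothetical helper: print the ImageId of the most recently created
# Amazon-owned AMI whose name matches a pattern.
# Usage: latest_ami_id <name-pattern> <region>
latest_ami_id() {
    local name_pattern="$1" region="$2"

    # sort_by(...)[-1] selects the newest image among the matches.
    aws ec2 describe-images \
        --region "$region" \
        --owners amazon \
        --filters "Name=name,Values=$name_pattern" \
        --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
        --output text
}
```

The --owners amazon filter restricts the search to Amazon-owned images; change it if the AMI belongs to your own account.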

Build your ParallelCluster AMI

With your ImageId, the next step is to create a ParallelCluster build image configuration file to define the custom AMI build process. This is as easy as creating a file like the one shown in the following code sample. We’re using the ImageId of the Deep Learning AMI Ubuntu18 Version 42.0 in the AWS Region eu-south-1.

Region: eu-south-1
Image:
  Name: My First Custom AMI for PCluster 3.0.0
Build:
  ParentImage: ami-0aeac2750275f50b9
  InstanceType: c5.4xlarge

In the configuration, you can specify additional configuration options, like the properties of the root volume (e.g. size or encryption), and you can set custom tags on the built image, too. Refer to the official documentation for a list of options.
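To illustrate those options, here is a sketch of a small shell helper that writes a build configuration with an encrypted, resized root volume and one custom tag. The helper name write_build_config, the tag key and value, and the idea of passing the size as an argument are all illustrative assumptions; check the configuration reference for the exact options your ParallelCluster version supports.

```shell
# Hypothetical helper: write a build-image configuration file.
# Usage: write_build_config <file> <parent-ami-id> <root-volume-size-GiB>
write_build_config() {
    local file="$1" parent_image="$2" root_size="$3"

    # The tag below is a placeholder; replace it with your own.
    cat > "$file" <<EOF
Region: eu-south-1
Image:
  Name: My First Custom AMI for PCluster 3.0.0
  RootVolume:
    Size: ${root_size}
    Encrypted: true
  Tags:
    - Key: project
      Value: hpc-demo
Build:
  ParentImage: ${parent_image}
  InstanceType: c5.4xlarge
EOF
}
```

The root volume size you choose must be at least as large as the root volume of the parent image.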

The next step is to trigger the ParallelCluster build-image process using the pcluster build-image CLI command. We specify an id for the desired custom AMI and the name of the build configuration file we just created.

pcluster build-image --image-id myFirstCustomImage --image-configuration my-build-config.yaml

The ParallelCluster build image process will create an AWS CloudFormation stack with everything it needs to accomplish the build of your custom AMI. You can monitor the status of the build process using the pcluster describe-image command. The whole process might take around 60 minutes or so, at the end of which all the resources used for the build will be deleted.
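Since the build takes a while, you may want to poll for completion rather than check by hand. The following is a sketch under some assumptions: the helper name wait_for_image is ours, and it assumes the describe-image output exposes an imageBuildStatus field with values such as BUILD_IN_PROGRESS and BUILD_COMPLETE, which is how we understand the ParallelCluster 3 output.

```shell
# Hypothetical helper: poll a build until it completes or fails.
# Usage: wait_for_image <image-id> [poll-interval-seconds]
wait_for_image() {
    local image_id="$1" interval="${2:-60}" status

    while true; do
        status=$(pcluster describe-image --image-id "$image_id" \
            --query 'imageBuildStatus') || return 1
        # --query prints a JSON string, so strip the surrounding quotes.
        status=$(echo "$status" | tr -d '"')

        case "$status" in
            BUILD_COMPLETE) echo "$status"; return 0 ;;
            BUILD_IN_PROGRESS) sleep "$interval" ;;
            # Any other status (for example a failed build) stops the loop.
            *) echo "$status" >&2; return 1 ;;
        esac
    done
}
```

For example, wait_for_image myFirstCustomImage 120 would check the build every two minutes.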

At the end of the build process, we use the describe-image command again to see the properties of the freshly built custom AMI. For example, you can retrieve the configuration used to build it, get the URL where the configuration is stored, or even find out the kernel version of the OS used for the image.

Here’s an example of querying for the value of the ImageId of the custom image we just created using the describe-image ParallelCluster command:

pcluster describe-image --image-id myFirstCustomImage --query 'ec2AmiInfo.amiId'

Once you’ve built your custom ParallelCluster AMI, you can use it to create a cluster. To do that, set the CustomAmi parameter in the Image section of the ParallelCluster config file to the AMI id.

Build Custom ParallelCluster AMI from Custom Build Component

Another approach to build your custom ParallelCluster AMI is to define your own build components to be layered on top of the ParallelCluster software stack. You can create multiple build components, reuse, and mix them or even use a component shared and published by others, as long as you are able to access the component through its Amazon Resource Name (ARN).

As an example, we’ll show you how to create a build component that installs Spack (a popular open-source package manager for HPC applications) and then use it to install an HPC application on the AMI.

First, we’ll need to write the component definition into a YAML file and create an EC2 Image Builder component through the CreateComponent API. You can do this with the AWS CLI using the command aws imagebuilder create-component. The following example shows the YAML file with the Spack installation steps, plus the way we use Spack to install three popular HPC applications: GROMACS, OpenFOAM and LAMMPS.

name: spack Installation
description: This is a sample component to show how to install spack.
schemaVersion: 1.0

phases:
  - name: build
    steps:
      - name: spackInstallation
        action: ExecuteBash
        inputs:
          commands:
            - |
              set -v
              
              # Install spack from its upstream Git repository (default branch)
              export SPACK_ROOT=/opt/spack
              mkdir -p ${SPACK_ROOT}
              git clone https://github.com/spack/spack.git ${SPACK_ROOT}
              cd ${SPACK_ROOT}
              echo "export SPACK_ROOT=$SPACK_ROOT" > /etc/profile.d/spack.sh
              echo "source $SPACK_ROOT/share/spack/setup-env.sh" >> /etc/profile.d/spack.sh
              source ${SPACK_ROOT}/share/spack/setup-env.sh
              
              # Install some spack packages
              spack install gromacs
              spack install openfoam
              spack install lammps

To learn more about building an individual EC2 Image Builder component, refer to the official documentation.

To create your component, you just need to upload the file into an S3 bucket

aws s3 cp myCustomComponent.yaml s3://my-bucket

and then run the command:

aws imagebuilder create-component \
    --region eu-south-1 \
    --name myCustomComponent \
    --semantic-version 1.0.0 \
    --platform Linux \
    --uri s3://my-bucket/myCustomComponent.yaml

Once you’ve done this, record the component build version ARN which we’ll need to use with the build image configuration file in our next step.
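If you would rather capture the ARN in a script than copy it from the output, a sketch like the following may help. The helper name register_component is hypothetical; it uses the componentBuildVersionArn field returned by create-component together with the standard --query and --output options.

```shell
# Hypothetical helper: create an Image Builder component and print its
# component build version ARN.
# Usage: register_component <name> <semantic-version> <s3-uri> <region>
register_component() {
    local name="$1" version="$2" uri="$3" region="$4"

    aws imagebuilder create-component \
        --region "$region" \
        --name "$name" \
        --semantic-version "$version" \
        --platform Linux \
        --uri "$uri" \
        --query 'componentBuildVersionArn' \
        --output text
}
```

For example, register_component myCustomComponent 1.0.0 s3://my-bucket/myCustomComponent.yaml eu-south-1 prints the ARN to paste into the Components section of the build configuration.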

Like before, we need to create an image configuration file, but this time we’ll specify some additional parameters to point to the component we’d like to include in the AMI creation. You can see the image configuration file for creating the ParallelCluster custom AMI in the following code block. Notice that this time, for the parent image, we’ve used the official ParallelCluster 3.0.0 AMI for Amazon Linux 2.

Region: eu-south-1
Image:
  Name: My Second Custom AMI for PCluster 3.0.0
  RootVolume:
    Size: 100
Build:
  ParentImage: ami-024fa643b712b31f2
  InstanceType: c5.4xlarge
  Components:
    - Type: arn
      Value: arn:aws:imagebuilder:eu-south-1:346106133209:component/mycustomcomponent/1.0.0/1

You can now easily retrieve the EC2 image ID of all the official ParallelCluster images using the command pcluster list-official-images.
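As a short sketch, you can narrow that list to a single operating system and architecture with the --os and --architecture options. The wrapper name official_images is only for illustration.

```shell
# Hypothetical wrapper: list official ParallelCluster AMIs for one OS and
# architecture in a given Region.
# Usage: official_images <os> <architecture> <region>
official_images() {
    local os="$1" arch="$2" region="$3"

    pcluster list-official-images \
        --region "$region" \
        --os "$os" \
        --architecture "$arch"
}
```

For the Amazon Linux 2 parent image used here, that would be official_images alinux2 x86_64 eu-south-1.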

To include the build component we created in the previous section as part of the AMI creation process, we set the ARN of the build component in the Value parameter under the Components section of the file. To ensure the root volume is large enough to accommodate the additional software installed by our build component, we also specify a root volume size.

Our final step is to trigger the ParallelCluster build image process, as we did before, using the command pcluster build-image.

pcluster build-image --image-id mySecondCustomImage --image-configuration my-second-build-config.yaml 

Once the build image process is done, you’ll again need to retrieve the ImageId of your custom AMI using the pcluster describe-image command. You can use this AMI to build your HPC cluster.

This time, your cluster configuration requires you to specify a larger root volume size for both Head and Compute nodes – at least equal to the size you specified in your image configuration file, for example:

Region: eu-south-1
Image:
  Os: alinux2
  CustomAmi: ami-0dbd3cb32d7c60187
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    SubnetId: &subnet subnet-abcde12
  Ssh:
    KeyName: my-ssh-key
  LocalStorage:
    RootVolume:
      Size: 100
Scheduling:
  Scheduler: slurm
  SlurmQueues:
  - Name: myq
    ComputeResources:
    - Name: c5
      InstanceType: c5.xlarge
      MinCount: 0
    Networking:
      SubnetIds:
      - *subnet
    ComputeSettings:
      LocalStorage:
        RootVolume:
          Size: 100

Refer to the official documentation on the HeadNode and compute node settings for additional details.
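With the cluster configuration saved to a file, creating the cluster is a single pcluster create-cluster call. The sketch below wraps it together with a status check; the helper name launch_cluster and the file and cluster names in the usage line are placeholders of ours.

```shell
# Hypothetical helper: create a cluster from a configuration file, then
# print its current status.
# Usage: launch_cluster <cluster-name> <config-file>
launch_cluster() {
    local cluster_name="$1" config_file="$2"

    pcluster create-cluster \
        --cluster-name "$cluster_name" \
        --cluster-configuration "$config_file" || return 1

    # Creation is asynchronous, so the first status is usually "in progress".
    pcluster describe-cluster \
        --cluster-name "$cluster_name" \
        --query 'clusterStatus'
}
```

For example: launch_cluster my-hpc-cluster my-cluster-config.yaml.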

Custom image management

To see all the custom images you have built so far, use the pcluster list-images command.

When you don’t need a custom AMI anymore, you can delete it using the pcluster delete-image command. Notice that this command doesn’t just deregister the EC2 image, it also takes care of deleting the corresponding snapshots, freeing up resources and saving you some money.
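A minimal sketch of both management commands follows. The wrapper names are hypothetical; note that, as far as we know, list-images expects an --image-status filter (for example AVAILABLE).

```shell
# Hypothetical wrappers around the image management commands.

# Usage: list_available_images <region>
list_available_images() {
    pcluster list-images --image-status AVAILABLE --region "$1"
}

# Usage: delete_custom_image <image-id> <region>
delete_custom_image() {
    pcluster delete-image --image-id "$1" --region "$2"
}
```

For example, delete_custom_image myFirstCustomImage eu-south-1 removes the first image we built in this post.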

Conclusion

In this post, we walked through the process of creating custom AMIs for ParallelCluster 3. We showed how easy it is to keep track of the AMIs you create and track down information showing how the AMIs were created, from the build configuration file to the list of components used to construct the AMI itself.

We described how to build a custom AMI starting from a running instance or AMI with your software already installed. And we showed how to create a build component that you can use as a layer on top of the custom AMIs ParallelCluster uses.

To get started creating your own custom AMIs, refer to the official ParallelCluster documentation about Building a Custom AWS ParallelCluster AMI. For information on leveraging the advanced capabilities of EC2 Image Builder components, refer to the AWS Task Orchestrator and Executor component manager documentation.

Luca Carrogu

Luca Carrogu is a Software Development Engineer at Amazon Web Services. He started working on HPC in 2009 at Nice Software and then joined AWS in 2016. He works on the development of AWS ParallelCluster, a product that makes it easier to deploy and manage High Performance Computing (HPC) clusters on the AWS cloud.