Releases: aws/aws-parallelcluster

AWS ParallelCluster v3.13.0

01 Apr 20:39

We're excited to announce the release of AWS ParallelCluster 3.13.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster
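
After the upgrade, it can be useful to confirm which version is actually installed in the active Python environment. The snippet below is a minimal sketch using only the standard library (`importlib.metadata`); the helper name `installed_version` is illustrative and not part of ParallelCluster.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "aws-parallelcluster") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "aws-parallelcluster is not installed")
```

If the reported version is not the one expected, the upgrade most likely ran against a different interpreter or virtual environment.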

DEPRECATIONS

  • This is the last ParallelCluster release supporting Ubuntu 20.04,
    as Ubuntu 20.04 reaches End of Standard Support in May 2025.

ENHANCEMENTS

  • Add support for Ubuntu 24.04.
  • Add support for ap-southeast-7 region.
  • Disable the unused services cups and wpa_supplicant in official ParallelCluster AMIs to improve security.

CHANGES

  • Upgrade Slurm to version 24.05.7.
  • Upgrade NVIDIA driver to version 570.86.15 (from 550.127.08) for all OSs except AL2.
  • Upgrade CUDA Toolkit to version 12.8.0 (from 12.4.1) for all OSs except AL2.
  • Upgrade Python to 3.12.8 for all OSs except AL2 (from 3.9.20).
  • On Ubuntu 22.04, install the Nvidia driver with the same compiler version used to compile the kernel.
  • Upgrade aws-cfn-bootstrap to version 2.0-33.
  • Upgrade EFA installer to 1.38.0 (from 1.36.0).
    • Efa-driver: efa-2.13.0-1
    • Efa-config: efa-config-1.17-1
    • Efa-profile: efa-profile-1.7-1
    • Libfabric-aws: libfabric-aws-1.22.0-1
    • Rdma-core: rdma-core-54.0-1
    • Open MPI: openmpi40-aws-4.1.7-1 and openmpi50-aws-5.0.5
  • Upgrade amazon-efs-utils to version 2.1.0.
  • Remove third-party cookbook: apt-7.5.22 and pyenv-4.2.3.
  • Upgrade third-party cookbook dependencies:
    • line-4.5.21 (from line-4.5.13)
    • nfs-5.1.5 (from nfs-5.1.2)
    • openssh-2.11.14 (from openssh-2.11.12)
    • yum-7.4.20 (from yum-7.4.13)
    • yum-epel-5.0.8 (from yum-epel-5.0.2)
  • Upgrade Pmix to 5.0.6 (from 5.0.3).
  • Upgrade ARM PL to version 24.10 (from 23.10).
  • Upgrade Python to version 3.12.8 (from 3.9.17) in Lambda layer and installer.
  • Upgrade NodeJS to version 20.18.3 (from 18.20.3) in Lambda layer and installer.
  • Remove generation of DSA keys for login nodes, as DSA became unsupported in OpenSSH 9.7+.
  • Set instance ID and instance type information in Slurm upon compute nodes launch.
  • Install NVIDIA drivers without the option 'no-cc-version-check', which is now deprecated in the NVIDIA installer.
  • Add validator to enforce a maximum of 10 login node pools.
  • Update the default root volume size to 45 GB.
  • Increase HeadNodeBootstrapTimeout by 5 minutes, making it 35 minutes in total.
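
The login node pool limit mentioned above can also be checked before submitting a configuration. The following is a minimal sketch of such a pre-flight check, not ParallelCluster's own validator. It assumes the cluster configuration has already been parsed into a dictionary and that pools are listed under `LoginNodes` / `Pools`; the function name `check_login_node_pools` is hypothetical.

```python
from typing import Any, Dict, List

MAX_LOGIN_NODE_POOLS = 10  # limit stated in the release notes


def check_login_node_pools(config: Dict[str, Any]) -> List[str]:
    """Return error messages; the list is empty when the pool count is within the limit."""
    # Assumption: pools live under LoginNodes/Pools in the parsed configuration.
    login_nodes = config.get("LoginNodes") or {}
    pools = login_nodes.get("Pools") or []
    if len(pools) > MAX_LOGIN_NODE_POOLS:
        return [
            f"Found {len(pools)} login node pools; "
            f"at most {MAX_LOGIN_NODE_POOLS} are allowed."
        ]
    return []
```

A configuration without a `LoginNodes` section yields no errors, since there is nothing to count.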

BUG FIXES

  • Remove usage of cfn-init for compute node bootstrapping to reduce node scale up time.
  • Fix an issue causing compute node bootstrap failure when a proxy is used.
  • On Ubuntu 22.04, install the Nvidia driver with the same compiler version used to compile the kernel
    to prevent installation failures.
  • Fix the execution of overriding the aws-parallelcluster-node package only on the head node during update.
  • Fix an issue where containerized jobs executed through Pyxis/Enroot in a multi-user environment (integrated with Active Directory) would fail.
  • Fix usage of authselect causing node bootstrap failures on Rocky 9.5+ when directory service is used.

AWS ParallelCluster v3.12.0

18 Dec 22:10

We're excited to announce the release of AWS ParallelCluster 3.12.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add new build image configuration section Build/Installation to turn on/off Nvidia software and Lustre client installations. By default, Nvidia software, although included in official ParallelCluster AMIs, is not installed by build-image. By default, Lustre client is installed.
  • The CLI commands export-cluster-logs and export-image-logs can now by default export the logs to the default ParallelCluster bucket or to the CustomS3Bucket if specified in the config.
  • Extend Amazon DCV support to Ubuntu2204 on ARM instances.

CHANGES

  • Upgrade NVIDIA driver to version 550.127.08 (from 550.90.07). This addresses a known issue from NVIDIA.
  • Upgrade Amazon DCV to version 2024.0-18131.
    • server: 2024.0-18131-1
    • xdcv: 2024.0.631-1
    • gl: 2024.0.1078-1
    • web_viewer: 2024.0-18131-1
  • Upgrade EFA installer to 1.36.0.
    • Efa-driver: efa-2.13.0-1
    • Efa-config: efa-config-1.17-1
    • Efa-profile: efa-profile-1.7-1
    • Libfabric-aws: libfabric-aws-1.22.0-1
    • Rdma-core: rdma-core-54.0-1
    • Open MPI: openmpi40-aws-4.1.7-1 and openmpi50-aws-5.0.5
  • Auto-restart slurmctld on failure.
  • Upgrade mysql-community-client to version 8.0.39.
  • Remove support for Python 3.7 and 3.8, which have reached end of life.
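
Since Python 3.7 and 3.8 are no longer supported, it may help to check the interpreter before upgrading the CLI. The sketch below assumes that 3.9 is the oldest interpreter still supported after this removal; the names `MINIMUM_PYTHON` and `python_is_supported` are illustrative.

```python
import sys
from typing import Tuple

# Assumption: with 3.7 and 3.8 removed, 3.9 is the oldest supported interpreter.
MINIMUM_PYTHON: Tuple[int, int] = (3, 9)


def python_is_supported(
    version: Tuple[int, ...] = tuple(sys.version_info[:2]),
    minimum: Tuple[int, int] = MINIMUM_PYTHON,
) -> bool:
    """Compare the (major, minor) part of a version tuple against the minimum."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("This Python version is older than the assumed minimum of 3.9.")
```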

BUG FIXES

  • Fix an issue where changes in sequence of custom actions scripts were not detected during cluster updates.
  • Add missing permissions for ParallelCluster API to create the service linked roles for Elastic Load Balancing and Auto Scaling, that are required to deploy login nodes.
  • Fix an issue in how the region is determined when managing volumes so that Local Zones are handled correctly.
  • Fix an issue where adding EFS filesystems with AccessPointIds during an update would fail.
  • Fix an issue where, when using PCAPI, a cluster update could fail when updating a parameter that is not of type String (e.g. MaxCount).
  • When mounting an external OpenZFS, it is no longer required to set the outbound rules for ports 111, 2049, 20001, 20002, 20003.

AWS ParallelCluster v3.11.1

21 Oct 16:54
c877343

We're excited to announce the release of AWS ParallelCluster 3.11.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Pyxis is now disabled by default, so it must be manually enabled as documented in the product documentation.
  • Upgrade Python runtime to version 3.12 in ParallelCluster Lambda Layer.
  • Remove version pinning for setuptools to version prior to 70.0.0.
  • Upgrade libjwt to version 1.17.0.

BUG FIXES

  • Fix an issue in the way we configure the Pyxis Slurm plugin in ParallelCluster that can lead to job submission failures.
    #6459
  • Add missing permissions required by login nodes to the public template of policies.

AWS ParallelCluster v3.11.0

26 Sep 18:26

We're excited to announce the release of AWS ParallelCluster 3.11.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add support for custom actions on login nodes.
  • Allow DCV connection to login nodes.
  • Add support for ap-southeast-3 region.
  • Add security groups to login node network load balancer.
  • Add AllowedIps configuration for login nodes.
  • Add new configuration SharedStorage/EfsSettings/AccessPointId to specify an optional EFS access point for a mount.
  • Allow up to 10 login node pools.
  • Install enroot and pyxis in official pcluster AMIs.

CHANGES

  • [BREAKING] The loginNodes field returned by the API DescribeCluster and the CLI command describe-cluster
    has been changed from a dictionary to an array to support multiple pools of login nodes.
    This change breaks backward compatibility, making these operations incompatible with clusters deployed with older versions.
  • Upgrade Slurm to 23.11.10 (from 23.11.7).
  • Upgrade Pmix to 5.0.3 (from 5.0.2).
  • Upgrade EFA installer to 1.34.0.
    • Efa-driver: efa-2.10.0-1
    • Efa-config: efa-config-1.17-1
    • Efa-profile: efa-profile-1.7-1
    • Libfabric-aws: libfabric-aws-1.22.0-1
    • Rdma-core: rdma-core-52.0-1
    • Open MPI: openmpi40-aws-4.1.6-3 and openmpi50-aws-5.0.3-11
  • Upgrade NVIDIA driver to version 550.90.07 (from 535.183.01).
  • Upgrade CUDA Toolkit to version 12.4.1 (from 12.2.2).
  • Upgrade Python to 3.9.20 (from 3.9.19).
  • Upgrade Intel MPI Library to 2021.13.1.769 (from 2021.12.1.8).
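
Scripts that read the loginNodes field from describe-cluster output may need to handle both shapes during a migration: a dictionary from older clusters and an array from 3.11.0 onward. The following is a minimal sketch of such a normalization. The helper name `login_node_pools` is hypothetical, and the inner fields used in the usage example are placeholders rather than the actual response schema.

```python
from typing import Any, Dict, List


def login_node_pools(describe_cluster_output: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the loginNodes entry as a list, whichever shape the response used."""
    login_nodes = describe_cluster_output.get("loginNodes")
    if login_nodes is None:
        # No login nodes reported for this cluster.
        return []
    if isinstance(login_nodes, dict):
        # Older shape: a single dictionary describing one pool.
        return [login_nodes]
    # Newer shape: an array with one entry per pool.
    return list(login_nodes)


if __name__ == "__main__":
    old_style = {"loginNodes": {"status": "active"}}       # placeholder fields
    new_style = {"loginNodes": [{"status": "active"}, {"status": "pending"}]}
    print(len(login_node_pools(old_style)), len(login_node_pools(new_style)))
```

Callers can then iterate over the result without branching on the ParallelCluster version that produced the output.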

BUG FIXES

  • Fix validator EfaPlacementGroupValidator so that it does not suggest configuring a Placement Group when Capacity Blocks are used.
  • Fix occasional cluster creation failures by ensuring that FSx for Lustre file systems are created after security group rules.
  • Fix cluster deletion failure when placement group is enabled.
  • Fix issue with login nodes being marked unhealthy when restricting SSH access.
  • Fix retrieve_supported_regions so that it can get the correct S3 url.
  • Fix describe_images to use pagination.
  • Fix No route tables found bug when specifying default VPC subnet to LoginNodes/Networking/SubnetIds.
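
The describe_images fix above concerns pagination: a single call may return only part of the result set, so the caller has to keep requesting pages until no continuation token is returned. The sketch below shows the general token-following loop. It is not ParallelCluster's implementation; `collect_images` and `fetch_page` are hypothetical names, while `Images` and `NextToken` follow the field names of the EC2 DescribeImages response.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def collect_images(fetch_page: Callable[[Optional[str]], Page]) -> List[Dict[str, Any]]:
    """Follow NextToken across pages and return every image entry."""
    images: List[Dict[str, Any]] = []
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        images.extend(page.get("Images", []))
        token = page.get("NextToken")
        if not token:
            # No continuation token: this was the last page.
            return images


if __name__ == "__main__":
    # Stand-in for a real API call: two fixed pages keyed by token.
    pages = {
        None: {"Images": [{"ImageId": "ami-1"}], "NextToken": "t1"},
        "t1": {"Images": [{"ImageId": "ami-2"}]},
    }
    print(collect_images(lambda token: pages[token]))
```

With a real client, `fetch_page` would wrap the API call and pass the token only when one is present.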

AWS ParallelCluster v3.10.1

08 Jul 20:05

We're excited to announce the release of AWS ParallelCluster 3.10.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

BUG FIXES

  • Fix image build failure in China regions.

AWS ParallelCluster v3.10.0

27 Jun 21:42

We're excited to announce the release of AWS ParallelCluster 3.10.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add new configuration section Scheduling/SlurmSettings/ExternalSlurmdbd to connect the cluster to an external Slurmdbd.
  • Allow build-image to be run in an isolated network.
  • Add support for Amazon Linux 2023.
  • Add support for price-capacity-optimized as an AllocationStrategy.
  • Add validator to prevent the use of Placement Groups with Capacity Blocks.

CHANGES

  • CentOS 7 is no longer supported.
  • Upgrade Cinc Client to version 18.4.12 (from 18.2.7).
  • Upgrade munge to version 0.5.16 (from 0.5.15).
  • Upgrade Pmix to 5.0.2 (from 4.2.9).
  • Upgrade third-party cookbook dependencies:
    • apt-7.5.22 (from apt-7.5.14)
    • openssh-2.11.12 (from openssh-2.11.3)
  • Remove third-party cookbook: selinux-6.1.12.
  • Upgrade EFA installer to 1.32.0.
    • Efa-driver: efa-2.8.0-1
    • Efa-config: efa-config-1.16-1
    • Efa-profile: efa-profile-1.7-1
    • Libfabric-aws: libfabric-aws-1.21.0-1
    • Rdma-core: rdma-core-50.0-1
    • Open MPI: openmpi40-aws-4.1.6-3 and openmpi50-aws-5.0.2-12
  • Upgrade NVIDIA driver to version 535.183.01 (from 535.154.05).
  • Upgrade Python to 3.9.19 (from 3.9.17).
  • Upgrade Intel MPI Library to 2021.12.1.8 (from 2021.9.0.43482).

BUG FIXES

  • Fix Data Repository Associations configuration to make AutoExportPolicy and AutoImportPolicy optional.
  • Fix an issue during cluster deletion so that compute fleet cleanup now completes when instances are either in shutting-down or terminated state.
    This avoids cluster deletion failures for instance types with longer termination cycles.
  • Allow cloudwatch dashboard to be enabled and alarms to be disabled in the Monitoring section of the cluster config.
  • Allow ParallelCluster Custom Resource to suppress validators using PclusterCluster/SuppressValidators.
  • Remove /etc/profile.d/pcluster.sh so that it is not executed at every user login and
    cfn_bootstrap_virtualenv is not added to the PATH environment variable.
  • Fix ParallelCluster API spec by replacing field failureReason with failures in DescribeCluster response.
  • Fix ParallelCluster API spec by adding the CloudFormation stack statuses that were missing:
    IMPORT_*, REVIEW_IN_PROGRESS and UPDATE_FAILED.
  • Fix an issue that prevented cluster updates from including EFS filesystems with encryption in transit.
  • Fix an issue that prevented slurmctld and slurmdbd services from restarting on head node reboot when
    EFS is used for shared internal data.
  • On Ubuntu systems, remove the default logrotate configuration for cloud-init log files that clashed with the
    configuration coming from ParallelCluster.
  • Fix image build failure with RHEL 8.10 or newer.

AWS ParallelCluster v3.9.3

19 Jun 12:19

We're excited to announce the release of AWS ParallelCluster 3.9.3

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add support for FSx Lustre as a shared storage type in us-iso-east-1.

BUG FIXES

  • Remove cloud_dns from the SlurmctldParameters in the Slurm config to avoid Slurm fanout issues.
    This is also not required since we set the IP addresses on instance launch.

AWS ParallelCluster v3.9.2

28 May 19:20

We're excited to announce the release of AWS ParallelCluster 3.9.2

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Upgrade Slurm to 23.11.7 (from 23.11.4).

AWS ParallelCluster v3.9.1

11 Apr 10:42

We're excited to announce the release of AWS ParallelCluster 3.9.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

BUG FIXES

  • Remove recursive deletion of shared storage mountdir when unmounting filesystems as part of update-cluster operation.

AWS ParallelCluster v3.9.0

12 Mar 01:27
0303ec9

We're excited to announce the release of AWS ParallelCluster 3.9.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Allow updating the external shared storage of type Efs, FsxLustre, FsxOntap, FsxOpenZfs and FileCache
    without replacing compute and login fleet.
  • Allow updating the MinCount, MaxCount, Queue and ComputeResource configuration parameters without the need to
    stop the compute fleet. It's now possible to update them by setting Scheduling/SlurmSettings/QueueUpdateStrategy
    to TERMINATE. ParallelCluster will terminate only the nodes removed during a resize of the cluster capacity
    performed through a cluster update.
  • Add support for RHEL9.
  • Add support for Rocky Linux 9 as CustomAmi created through build-image process. No public official ParallelCluster Rocky9 Linux AMI is made available at this time.
  • Remove CommunicationParameters from the Custom Slurm Settings deny list.
  • Add the configuration parameter DeploymentSettings/DefaultUserHome to allow users to move the default user's home directory to /local/home instead of /home (default).
  • Add configuration parameter DeploymentSettings/DisableSudoAccessForDefaultUser to disable sudo access of default user in supported OSes.

CHANGES

  • Upgrade Slurm to 23.11.4 (from 23.02.7).
    • Upgrade Pmix to 4.2.9 (from 4.2.6).
  • Add support for Python 3.11, 3.12 in pcluster CLI and aws-parallelcluster-batch-cli.
  • Build network interfaces using network card index from NetworkCardIndex list of EC2 DescribeInstances response,
    instead of looping over MaximumNetworkCards range.
  • Fail cluster creation when using instance types P3, G3, P2 and G2 because their GPU architecture is not compatible with Open Source Nvidia Drivers (OpenRM) introduced as part of 3.8.0 release.
  • Upgrade the default FSx Lustre server version managed by ParallelCluster to 2.15.
  • Upgrade NVIDIA driver to version 535.154.05.
  • Upgrade EFA installer to 1.30.0.
    • Efa-driver: efa-2.6.0-1
    • Efa-config: efa-config-1.15-1
    • Efa-profile: efa-profile-1.6-1
    • Libfabric-aws: libfabric-aws-1.19.0
    • Rdma-core: rdma-core-46.0-1
    • Open MPI: openmpi40-aws-4.1.6-2 and openmpi50-aws-5.0.0-11
  • Upgrade NICE DCV to version 2023.1-16388.
    • server: 2023.1.16388-1
    • xdcv: 2023.1.565-1
    • gl: 2023.1.1047-1
    • web_viewer: 2023.1.16388-1
  • Upgrade ARM PL to version 23.10.
  • Upgrade third-party cookbook dependencies:
    • nfs-5.1.2 (from nfs-5.0.0)

BUG FIXES

  • Refactor IAM policies defined in CloudFormation template parallelcluster-policies.yaml to prevent ParallelCluster API deployment failure caused by policies exceeding IAM limits.
  • Fix an issue causing jobs to fail when submitted as an Active Directory user from login nodes. The issue was caused by an incomplete configuration of the integration with the external Active Directory on the head node.
  • Fix an issue causing login nodes to fail to bootstrap when the head node takes more time than expected to write keys.