AWS Storage Blog

Persistent storage for Kubernetes 

Stateful applications rely on data being persisted and retrieved to run properly. When running stateful applications using Kubernetes, state needs to be persisted regardless of container, pod, or node crashes or terminations. This requires persistent storage, that is, storage that lives beyond the lifetime of the container, pod, or node.

In this blog, we cover persistent storage concepts for Kubernetes environments and the storage options available in the Kubernetes world. We walk through designing and building a stateful application on Kubernetes that mitigates the risk of data loss in case of failures or terminations at the host, pod, or container level.

Persistent storage for Kubernetes is a complex topic, especially for someone who is new to storage and is getting started with Kubernetes. For this reason, we created this series of two blog posts, which goes through concepts and terminology first and then dives into a practical use case.

Data persistence in Kubernetes

When running a stateful application without persistent storage, data is tied to the lifecycle of the pod or container. If the pod crashes or is terminated, the data is lost.

Kubernetes and Pods

To prevent this data loss and run a stateful application on Kubernetes, we need to adhere to three simple storage requirements:

  1. Storage must not depend on the pod lifecycle.
  2. Storage must be available from all pods and nodes in the Kubernetes cluster.
  3. Storage must be highly available regardless of crashes or application failures.

Kubernetes volumes

Kubernetes has several types of storage options available, not all of which are persistent.

Ephemeral storage

Containers can use the temporary filesystem (tmpfs) to read and write files. However, ephemeral storage does not satisfy the three storage requirements. In case of a container crash, the temporary filesystem is lost—the container starts with a clean slate again. Also, multiple containers cannot share a temporary filesystem.

Ephemeral volumes

An ephemeral Kubernetes Volume solves both of the problems faced with ephemeral storage. An ephemeral Volume's lifetime is coupled to the Pod: it enables safe container restarts and sharing of data between containers within a Pod. However, as soon as the Pod is deleted, the Volume is deleted as well, so it still does not fulfill our three requirements.
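To make this concrete, here is a minimal sketch of a Pod using an emptyDir ephemeral Volume (this manifest is illustrative and not part of this post's demo; the file name is hypothetical). Both containers share the same Volume, and it survives container restarts, but it is removed as soon as the Pod is deleted:

#emptydir-demo.yaml (illustrative)
---
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
  - name: writer
    image: busybox
    command: ["/bin/sh", "-c", "while true; do date >> /cache/out.txt; sleep 5; done"]
    volumeMounts:
    - name: cache
      mountPath: /cache
  - name: reader
    image: busybox
    command: ["/bin/sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: cache
      mountPath: /cache
  volumes:
  - name: cache
    # emptyDir is an ephemeral Volume: it lives only as long as the Pod does
    emptyDir: {}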

Pods and Volumes

The temporary file system is tied to the lifecycle of the container; the ephemeral Volume is tied to the lifecycle of the pod

Decoupling pods from the storage: Persistent Volumes

Kubernetes also supports Persistent Volumes. With Persistent Volumes, data is persisted regardless of the lifecycle of the application, container, Pod, Node, or even the cluster itself. Persistent Volumes fulfill the three requirements outlined earlier.

A Persistent Volume (PV) object represents a storage volume that is used to persist application data. A PV has its own lifecycle, separate from the lifecycle of Kubernetes Pods.

A PV essentially consists of two things:

  • A backend storage technology, which provides the actual storage.
  • An access mode, which tells Kubernetes how the volume should be mounted.
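As a rough sketch (the full, working example appears later in this post), these two pieces correspond to the csi block and the accessModes field of a PersistentVolume manifest; the values below are placeholders:

# Skeleton only; values are placeholders
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany          # access mode
  csi:                       # backend technology (a CSI driver in this case)
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678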

Backend technology

A PV is an abstract component, and the actual physical storage must come from somewhere. Here are a few examples:

  • csi: Container Storage Interface (CSI) storage (for example, Amazon EFS, Amazon EBS, or Amazon FSx)
  • iscsi: iSCSI (SCSI over IP) storage
  • local: Local storage devices mounted on nodes
  • nfs: Network File System (NFS) storage

Kubernetes is versatile and supports many different types of PVs. Kubernetes does not care about the underlying storage internals; it just gives us the PV component as an interface to the actual storage.

There are three major benefits to a PV:

  • A PV is not bound to the lifecycle of a Pod: when removing a Pod that is attached to a PV object, the PV will survive.
  • The preceding statement is also valid when a Pod crashes: the PV object will survive the fault and not be removed from the cluster.
  • A PV is cluster-wide: it can be attached to any Pod running on any Node in the cluster.

All different backend storage technologies have their own performance characteristics and tradeoffs. For this reason, we see different types of PVs in production Kubernetes environments that depend on the application.

Access mode

The access mode is set during PV creation and tells Kubernetes how the volume should be mounted. Persistent Volumes support three access modes:

  • ReadWriteOnce: Volume allows read/write by only one node at the same time.
  • ReadOnlyMany: Volume allows read-only mode by many nodes at the same time.
  • ReadWriteMany: Volume allows read/write by multiple nodes at the same time.

Not all PersistentVolume types support all access modes.

Persistent volume claims

A Persistent Volume (PV) represents an actual storage volume. Kubernetes has an additional layer of abstraction necessary for attaching a PV to a Pod: the PersistentVolumeClaim (PVC).

A PV represents the actual storage volume, and the PVC represents the request for storage that a Pod makes to get the actual storage.

The separation between PV and PVC relates to the idea that there are two types of people in a Kubernetes environment:

  • Kubernetes administrator: this person maintains and operates the cluster and adds resources such as persistent storage.
  • Kubernetes application developer: this person develops and deploys the application.

Put simply, the developer consumes the resources offered by the administrator. Kubernetes was built with the idea that a PV object belongs to the cluster administrator's scope, whereas a PVC object belongs to the application developer's scope.

Essentially, a Pod cannot mount a PV object directly; it needs to explicitly ask for it. That request is made by creating a PVC object and attaching it to the Pod. This is the reason this additional layer of abstraction exists. PVCs and PVs have a one-to-one mapping (a PV can only be associated with a single PVC).

Persistent Volume and Persistent Volume Claim

This blog post includes a demo of this process of attaching persistent storage to a Pod, but before that, we need to provide some background on CSI drivers.

Container Storage Interface (CSI) drivers

The Container Storage Interface (CSI) is an abstraction designed to facilitate using different storage solutions with Kubernetes. Different storage vendors can develop their own drivers that implement the CSI standard, enabling their storage solutions to work with Kubernetes (regardless of the internals of the underlying storage solution). AWS provides CSI drivers for Amazon EBS, Amazon EFS, and Amazon FSx for Lustre.
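On a cluster where a CSI driver is already installed, you can usually confirm that it is registered with commands like the following (output varies by cluster; this is just a quick check, not a required step):

$ kubectl get csidrivers
$ kubectl get pods -n kube-system | grep csi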

Static provisioning

In the flow described in the “Persistent volume claims” section, the administrator first creates one or more PVs, and then the application developer creates a PVC. This is called static provisioning: it is static because you have to manually create the PV and the PVC in Kubernetes. At scale, this becomes difficult to manage, especially if you are managing hundreds of PVs and PVCs.

Let’s say you are creating an Amazon EFS file system to mount as a PV object and would like to use static provisioning. You would need to do the following:

  • Kubernetes administrator’s task
    1. Create an Amazon EFS file system volume.
    2. Copy and paste its filesystem ID to a PV YAML definition file.
    3. Create the PV using a YAML file.
  • Kubernetes application developer’s task
    1. Create a PVC to claim this PV.
    2. Mount the PVC to the Pod object in the Pod YAML definition file.

This works, but becomes time-consuming at scale.

Dynamic provisioning

With dynamic provisioning, you do not have to create a PV object yourself. Instead, the PV is automatically created under the hood when you create the PVC. Kubernetes does this using another object called a Storage Class.

A Storage Class is an abstraction that defines a class of backend persistent storage (for example, Amazon EFS file storage, Amazon EBS block storage, etc.) used for container applications.

A Storage Class essentially contains two things:

  1. Name: This is the name that uniquely identifies the Storage Class object.
  2. Provisioner: This defines the underlying storage technology. For example, the provisioner would be efs.csi.aws.com for Amazon EFS or ebs.csi.aws.com for Amazon EBS.

The Storage Class objects are the reason why Kubernetes is capable of dealing with so many different storage technologies. From a Pod perspective, no matter whether it is an EFS volume, EBS volume, NFS drive, or anything else, the Pod will only see a PVC object. All the underlying logic dealing with the actual storage technology is implemented by the provisioner the Storage Class object uses.

Dynamic Provisioning

Demo of static provisioning and dynamic provisioning with Amazon EKS and Amazon EFS

Now, let’s put all of the learning into action. Refer to this GitHub page to set up the working environment to follow along with this demo section.

You can see in the following code snippet an Amazon EKS cluster with five nodes:

$ kubectl get nodes
NAME                                           STATUS   ROLES    AGE    VERSION
ip-192-168-12-218.us-west-1.compute.internal   Ready    <none>   2d3h   v1.21.5-eks-9017834
ip-192-168-24-116.us-west-1.compute.internal   Ready    <none>   2d3h   v1.21.5-eks-9017834
ip-192-168-46-22.us-west-1.compute.internal    Ready    <none>   2d3h   v1.21.5-eks-9017834
ip-192-168-63-240.us-west-1.compute.internal   Ready    <none>   2d3h   v1.21.5-eks-9017834
ip-192-168-7-195.us-west-1.compute.internal    Ready    <none>   2d3h   v1.21.5-eks-9017834

We use an Amazon EFS file system as the persistent storage for our Kubernetes cluster. For that, we first need to install the Amazon EFS CSI driver.
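The installation steps depend on the driver version, so refer to the aws-efs-csi-driver documentation for the authoritative instructions; at the time of writing, a typical Helm-based install looks roughly like this:

$ helm repo add aws-efs-csi-driver https://kubernetes-sigs.github.io/aws-efs-csi-driver/
$ helm repo update
$ helm upgrade --install aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver \
    --namespace kube-system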

Static provisioning using Amazon EFS

Let’s create an Amazon EFS file system first (myEFS1) in the AWS Management Console and keep a note of the FileSystemId, since we will need this while creating the PV in the next step.

Create an EFS File System

Keep everything as default and create the file system.
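If you prefer the AWS CLI over the console, a rough equivalent is shown below. The subnet and security group IDs are placeholders; the security group must allow inbound NFS (TCP 2049) from the cluster nodes, and you would repeat the create-mount-target call for each subnet your worker nodes run in:

$ aws efs create-file-system \
    --creation-token myEFS1 \
    --tags Key=Name,Value=myEFS1 \
    --query 'FileSystemId' --output text

$ aws efs create-mount-target \
    --file-system-id fs-xxxxxxxxxxxxxxxxx \
    --subnet-id subnet-xxxxxxxx \
    --security-groups sg-xxxxxxxx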

Next, we create the PV manifest file and provide the FileSystemId of the newly created file system.

#pv.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-073d77123471b2917

As shown in the preceding pv.yaml file, we set the spec.capacity.storage size to 5 GiB. This is just a placeholder value to satisfy Kubernetes, because the field is required when creating a PV. We are using an Amazon EFS file system as the backend, which is fully elastic and scalable, so we do not have to worry about capacity: it automatically scales up or down based on usage.

$ kubectl apply -f pv.yaml 
persistentvolume/efs-pv created

$ kubectl get pv efs-pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
efs-pv   5Gi        RWO            Retain           Available                                   45s

The PV status is Available, but it is not yet bound to any PVC. Next, we create the persistent volume claim (PVC):

#pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 5Gi
$ kubectl apply -f pvc.yaml 
persistentvolumeclaim/efs-claim created

Now let’s check the status of the PV and PVC:

$ kubectl get pv efs-pv                           
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
efs-pv   5Gi        RWO            Retain           Bound    default/efs-claim                           15m

$ kubectl get pvc efs-claim
NAME        STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
efs-claim   Bound    efs-pv   5Gi        RWO                           103s

The PV status has now changed from Available to Bound, which means that Kubernetes has been able to find a volume match using the PVC, and the volume has been bound.

Now, if we tried to create another PVC, it would remain unbound because we don’t have any more PVs left (a PV can be bound to only a single PVC). That is where dynamic provisioning comes in handy. Before moving on to that, let’s create a sample app (efs-app) with this PVC to demonstrate how the data is persisted:

#pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app
spec:
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 2; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: efs-claim
$ kubectl apply -f pod.yaml 
pod/efs-app created

You can verify that data is being written to the Amazon EFS file system using kubectl:

$  kubectl exec -ti efs-app -- tail -f /data/out.txt
Mon Mar 21 23:33:05 UTC 2022
Mon Mar 21 23:33:07 UTC 2022
Mon Mar 21 23:33:09 UTC 2022
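To convince yourself that the data really lives outside the Pod, you can delete the Pod, recreate it, and read the file again; the timestamps written before the deletion should still be there. (This quick check is optional and not part of the original walkthrough.)

$ kubectl delete pod efs-app
$ kubectl apply -f pod.yaml
$ kubectl exec -ti efs-app -- head /data/out.txt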

Here is the summary for Static Provisioning.

Static Provisioning

Dynamic provisioning using Amazon EFS

The Amazon EFS CSI driver supports both dynamic and static provisioning. For EFS, dynamic provisioning creates an access point for each PV under the hood. This means you still have to create an Amazon EFS file system manually and provide it as an input to the Storage Class parameters.

By default, each access point created via dynamic provisioning writes files under a different directory on the EFS file system, using a different POSIX uid/gid. This enables multiple applications to use the same EFS volume for persistent storage while providing isolation between applications.

Now, let’s create a new Amazon EFS volume (myEFS2) that we will be using for dynamic provisioning.

EFS File System

Next, we need to create a Storage Class and provide the FileSystemId of the newly created file system:

#sc.yaml
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-026bb4e33bea77857
  directoryPerms: "700"

Create and verify the Storage Class:

$ kubectl apply -f sc.yaml 
storageclass.storage.k8s.io/efs-sc created

$ kubectl get sc
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
efs-sc          efs.csi.aws.com         Delete          Immediate              false                  4s
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  2d5h

As mentioned in the “Dynamic provisioning” section, we don’t have to create any PV before deploying our application. So, you can go ahead and create a PVC and Pod:

#pvc_pod_1.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim-1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app-1
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-claim-1

Let’s deploy the PVC and the Pod:

$ kubectl apply -f pvc_pod_1.yaml 
persistentvolumeclaim/efs-claim-1 created
pod/efs-app-1 created

$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
efs-claim-1   Bound    pvc-82da05b5-ff85-40a5-8135-50428480fd22   5Gi        RWX            efs-sc         89s
 
$ kubectl get pv | grep efs-sc
pvc-7994fdce-5711-4346-aefb-85e10155ec7c   5Gi        RWX            Delete  

$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
efs-app-1   1/1     Running   0          95s

$ kubectl exec -ti efs-app-1 -- tail -f /data/out
Tue Mar 22 00:38:08 UTC 2022
Tue Mar 22 00:38:13 UTC 2022
Tue Mar 22 00:38:18 UTC 2022

As you can see, the efs-app-1 Pod is running successfully, and the PV was created automatically by the EFS CSI driver. You can see its corresponding access point in the AWS Management Console.

EFS File System Access Point
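You can also list the access points created by the driver from the CLI, using the same file system ID that was passed to the Storage Class (output omitted here):

$ aws efs describe-access-points --file-system-id fs-026bb4e33bea77857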

If you repeat the process and create another application (efs-app-2), you don’t have to worry about the PV, because the driver will create another access point under the hood.
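For reference, such a second claim and Pod might look like the following sketch; apart from the names and the claim reference, it mirrors pvc_pod_1.yaml (the file name is illustrative):

#pvc_pod_2.yaml (illustrative)
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim-2
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app-2
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-claim-2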

Here is a summary for Dynamic Provisioning:

Cleaning up

Once you are done with all the exercises, make sure you delete all the resources.

  1. Delete all the Kubernetes resources (PV, PVC, pod, storage class, etc.)
$ kubectl delete -f pod.yaml 
$ kubectl delete -f pvc.yaml 
$ kubectl delete -f pv.yaml
$ kubectl delete -f pvc_pod_1.yaml
$ kubectl delete -f sc.yaml
  2. Delete the EFS file systems (myEFS1 and myEFS2) via the console or the CLI (see the CLI sketch after this list).
  3. Delete the EKS cluster which you created at the beginning using this GitHub page.
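If you choose the CLI route for the file systems, the commands are roughly as follows; mount targets must be deleted before the file system itself, and the IDs below are placeholders:

$ aws efs describe-mount-targets --file-system-id fs-xxxxxxxxxxxxxxxxx
$ aws efs delete-mount-target --mount-target-id fsmt-xxxxxxxx
$ aws efs delete-file-system --file-system-id fs-xxxxxxxxxxxxxxxxx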

Conclusion

In this blog post, we covered some basic principles of persistent storage for Kubernetes. Persistent Volumes enable you to build stateful applications on Kubernetes, in which data is persisted regardless of pod crashes or terminations. Persistent Volumes can be provisioned using either static or dynamic provisioning. We demonstrated both on an EKS cluster using Amazon EFS as the underlying storage.

Now that we have covered the basics, it is time to move on to a more practical use case. That’s exactly what we do in part 2 of this blog series, in which we use persistent storage to run a machine learning workload using Kubeflow.

Thanks for reading this blog post. For more tutorials on using EFS with containers, you can visit the amazon-efs-developer-zone GitHub repository. If you have any comments, questions, or feedback, leave them in the comments section below.

Suman Debnath

Suman Debnath is a Principal Developer Advocate (Data Engineering) at Amazon Web Services, primarily focusing on Data Engineering, Data Analysis and Machine Learning. He is passionate about large-scale distributed systems and is an avid fan of Python. His background is in storage performance and tool development, where he has developed various performance benchmarking and monitoring tools.

Daniel Rubinstein

Daniel Rubinstein is a Software Development Engineer on the Amazon Elastic File System team. He is passionate about solving technology challenges, distributed systems, and storage. In his spare time, he enjoys outdoor activities and cooking.

Anjani Reddy

Anjani is a Specialist Technical Account Manager at AWS. She works with enterprise customers to provide operational guidance to innovate and build a secure, scalable cloud on the AWS platform. Outside of work, she is an Indian classical and salsa dancer, loves to travel, and volunteers for the American Red Cross and Hands on Atlanta.

Narayana Vemburaj

Narayana Vemburaj is a Senior Technical Account Manager at Amazon Web Services based in Atlanta, GA. He’s passionate about cloud technologies and assists enterprise AWS customers in their cloud transformation journey. Outside of work, he likes to spend time playing video games and watching science fiction movies.