In this, the second part of a 3-part series, I cover persistent storage. Part 1 covered traditional ephemeral storage.
Persistent Storage
Persistent storage, as the name implies, is storage that maintains the state of the data it holds across the failure and restart of an application, regardless of the worker node on which the application is running. Persistent storage also makes it possible to keep the data used or created by an application after the application has been deleted. This is useful if you need to reuse the data in another application, or to enable the application to restart in the future with the latest dataset still available. You can also leverage persistent storage to maintain disaster-recovery or business-continuity copies of the dataset.
StorageClass
A Kubernetes construct that has to be understood for storage is the StorageClass. A StorageClass provides a way for administrators to describe the “classes” of storage they offer. Different classes might map to quality-of-service levels, different access rules, or any arbitrary policies determined by the cluster administrators.
Each CSI storage driver has a unique provisioner name that is assigned as an attribute of a storage class; it instructs any persistent volume associated with that storage class to use the named provisioner (the CSI driver) when provisioning the underlying volume on the storage platform.
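As a minimal sketch of this relationship (the class name is arbitrary; pure-csi is the provisioner name used by Pure Service Orchestrator, and other drivers register their own):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: block-class
provisioner: pure-csi   # volumes of this class are provisioned by the named CSI driver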
Provisioning
Obtaining persistent storage for a pod is a three-step process, sketched after the list below:
- Define a PersistentVolume (PV), which is the disk space available for use
- Define a PersistentVolumeClaim (PVC), which claims usage of part or all of the PersistentVolume disk space
- Create a pod that references the PersistentVolumeClaim
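Here is a minimal sketch of all three objects using static provisioning; the NFS backend and all names are purely illustrative:

# Step 1: a PersistentVolume describing 10Gi of available storage
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  nfs:                        # illustrative backend; any supported volume type works
    server: 10.0.0.10
    path: /exports/example
---
# Step 2: a PersistentVolumeClaim claiming that space
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Step 3: a pod referencing the claim
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-pvc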
In modern-day CSI drivers, the first two steps are usually combined into a single task, referred to as dynamic provisioning. Here the PersistentVolumeClaim consumes 100% of the PersistentVolume, and the volume is formatted with a filesystem on first attachment to a pod.
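With dynamic provisioning, only the claim needs to be created; a sketch, assuming a StorageClass named example exists (such as the one shown later in this post):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  storageClassName: example   # the class whose provisioner creates the volume
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi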
Manual provisioning can also be used with some CSI drivers to import existing volumes on storage devices into the control of Kubernetes by converting an existing volume into a PersistentVolume. In this case, the existing filesystem on the original volume, with all its data, is kept when first mounted to a pod. An extension of this is the ability to import a snapshot of an existing volume, thereby creating a full read-write clone of the source volume the snapshot was derived from.
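The exact import mechanism is driver-specific, but as a generic sketch, the standard Kubernetes way to create a read-write clone from a snapshot is a claim with a dataSource; the VolumeSnapshot named here is assumed to already exist:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clone-from-snapshot
spec:
  storageClassName: example
  dataSource:
    name: existing-snapshot             # assumed pre-existing VolumeSnapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi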
When a PV is created it is assigned a storageClassName attribute, and this class name controls many attributes of the PV, as mentioned earlier. Note that the storageClassName attribute restricts the use of the volume to only those PVCs that request the equivalent StorageClass. In the case of dynamic provisioning this is all managed automatically: the application's PVC only needs to name the StorageClass it wants storage from, and the volume is created and then bound to the claim.
When the application is complete or is deleted, the underlying volume construct can either be deleted or retained for use by another application or a restart of the original application, depending on how the PV was initially created. This is controlled by the reclaimPolicy in the storageClass definition. In dynamic provisioning the normal setting is Delete, meaning that when the PVC is deleted the associated PV is deleted, along with the underlying storage volume.
Setting the reclaimPolicy to Retain allows for manual reclamation of the PV.
On deletion of the PVC, the associated PV is not deleted and can be reused by a new PVC with the same name as the original. This is the only PVC that can access the retained PV, a concept used a lot with StatefulSets.
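The policy itself is a single field in the class definition; a sketch:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained
provisioner: pure-csi
reclaimPolicy: Retain   # PVs of this class survive deletion of their PVC

Once the PVC is deleted, the PV shows a status of Released; one common way to make it bindable again is to clear its claimRef, for example with kubectl patch pv example-pv -p '{"spec":{"claimRef":null}}'.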
It should be noted that when a retained PV is subsequently deleted, the underlying storage volume is NOT deleted, so it is essential to have a simple way to ensure orphaned volumes do not adversely affect your underlying storage platform's capacity.
At this point, I’d like to mention Pure Service Orchestrator eXplorer (PSOX), an open-source project that provides a single pane of glass for storage and Kubernetes administrators to visualize how Pure Service Orchestrator, the CSI driver provided by Pure Storage, is utilising storage. One of the features of PSOX is its ability to identify orphaned volumes in a Kubernetes cluster.
Persistent Volume Granularity
There are a lot of options available when it comes to how a pod can access a persistent storage volume, and these are controlled by Kubernetes. These options are normally defined within a storageClass.
The most common of these is the accessMode, which controls how the data in the PV can be accessed and modified; it is requested in the PVC definition (see the claim sketch after this list). There are three modes available in Kubernetes:
- ReadWriteMany (RWX) – the volume can be mounted as read-write by many nodes
- ReadWriteOnce (RWO) – the volume can be mounted as read-write by a single node
- ReadOnlyMany (ROX) – the volume can be mounted read-only by many nodes
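For example, a claim for a volume shared across many nodes simply requests ReadWriteMany; the class name here is illustrative and would need to map to a file-capable backend:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc
spec:
  storageClassName: example-file   # assumed class backed by file storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi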
Additional controls for the underlying storage volume that can be provided through the storageClass include mount options, volume expansion, and volume binding mode, which is usually used in conjunction with storage topology (also managed through the storageClass).
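As a brief sketch of the binding-mode control (the class name is illustrative), delaying binding until a consuming pod is scheduled allows topology constraints to be honoured:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: delayed-binding
provisioner: pure-csi
volumeBindingMode: WaitForFirstConsumer   # provision only once a pod uses the claim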
A storageClass can also apply specific, non-standard granularity for different features a CSI driver can support.
In the case of Pure Service Orchestrator, all of the above-mentioned options are available to an administrator creating storage classes, plus a number of the non-standard features.
Here is an example of a storageClass definition configured to use Pure Service Orchestrator as the CSI provisioner:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example
provisioner: pure-csi
parameters:
  iops_limit: "30000"
  bandwidth_limit: "10G"
  backend: block
  csi.storage.k8s.io/fstype: xfs
  createoptions: -q
mountOptions:
  - discard
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.purestorage.com/rack
        values:
          - rack-0
          - rack-1
allowVolumeExpansion: true
This might look a little complex, but put simply, this example ensures that PersistentVolumes created through this storageClass will have the following attributes:
- Quality of Service limits of 10Gb/s bandwidth and 30k IOPS
- Volumes are capable of being expanded in size
- On first use by a pod, the volume will be formatted with the xfs filesystem and mounted with the discard flag
- The volume will only be created by an underlying FlashArray found in either rack-0 or rack-1 (based on labels defined in the PSO configuration file)
Pure Service Orchestrator even allows the parameters section to control the NFS ExportRules of PersistentVolumes created on a FlashBlade.
Check back for Part 3 of this series, where I’ll discuss the latest developments in ephemeral storage in Kubernetes.