What is included in this blog:
- An introduction to Persistent Volumes and Persistent Volume Claims.
- A discussion of how to update Persistent Volumes and Persistent Volume Claims.
What Are Persistent Volumes and Persistent Volume Claims
The following picture shows an overview of PVs and PVCs.
From the picture you can see that:
- PVs are created by cluster administrators and they are consumed by PVCs which are created by developers.
- A PV is like a mount configuration for a piece of storage. Therefore, you can create different mount configurations for the same storage by creating multiple PVs.
- A PV is a cluster-wide resource, which means it is accessible from all namespaces. This also means the name of the PV needs to be unique in the whole cluster.
- A PVC is a k8s object within a namespace, which means its name must be unique in the namespace.
- A PV is bound exclusively to a single PVC. This one-to-one mapping lasts until the PVC is deleted.
There are two ways to provision PVs: statically or dynamically.
A static PV is a PV that a cluster administrator creates manually with the details of a piece of storage. “Static” here means the PV must exist before it is consumed by a PVC.
Here is an example of a static PV, together with a PVC and a Pod that use it:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: nfs-pv
      labels:
        pv-name: nfs-pv   # required so the selector in nfs-pvc can match this PV
    spec:
      nfs:
        # TODO: use right IP
        server: 126.96.36.199
        path: "/data"
        readOnly: false
      mountOptions:
        - vers=4.0
        - rsize=32768
        - wsize=32768
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: nfs-pvc
    spec:
      resources:
        requests:
          storage: 10Gi
      accessModes:
        - ReadWriteMany
      selector:
        matchLabels:
          pv-name: nfs-pv
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: nfs-pod
      labels:
        app: nfs-pods
    spec:
      volumes:
        - name: data-dir
          persistentVolumeClaim:
            claimName: nfs-pvc
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data-dir
              mountPath: "/usr/share/nginx/html"
              readOnly: false
In this example, the nfs-pvc PVC finds and binds the nfs-pv PV via its label selector. The nfs-pod Pod then uses the nfs-pvc PVC to create a volume called data-dir and mounts that volume at the /usr/share/nginx/html directory in the nginx container. After creating these Kubernetes objects, open a shell in the nginx container of the nfs-pod Pod (for example with kubectl exec) and run cat /proc/mounts. You will find an entry for the nfs-pv PV, like:

    126.96.36.199:/data /usr/share/nginx/html nfs4 vers=4.0,rsize=32768,wsize=32768,...,addr=126.96.36.199 0 0

This means the NFS export 126.96.36.199:/data defined in the nfs-pv PV is mounted at /usr/share/nginx/html in the container.
PV Types and Mount Options
Kubernetes currently supports many PV types, for example NFS, CephFS, GlusterFS and GCEPersistentDisk. You can check this doc for more details.
In this example, the nfs-pv PV is created using the NFS PV type, with the server 126.96.36.199 and the path /data. In addition, this PV specifies some mount options for connecting to the NFS server. Mount options are only supported by some PV types. You can check this doc for more details.
The Capacity of a PV
The capacity of a static PV is not a hard limit on the underlying storage. Instead, the usable space is fully controlled by the real storage. For example, if the NFS server in the example has 200Gi of storage space, the nfs-pv PV is able to use up all of the NFS server’s space even though it only has capacity.storage == 10Gi. The capacity setting of a static PV is normally just used for matching the storage request in the corresponding PVC.
There are three access modes for PVs:
- ReadWriteOnce: a PV can be mounted as read-write by a single node if it has ReadWriteOnce in its accessModes spec. This means that (1) the PV can perform read and write operations on the storage, and (2) the PV can only be mounted on a single node, so any Pod that wants to use this PV must be scheduled to that same node.
- ReadOnlyMany: a PV can be mounted as read-only by many nodes if it has ReadOnlyMany in its accessModes spec. ReadOnlyMany allows the PV to be mounted on many nodes, but it is only meant to perform read operations on the real storage. Write requests are expected to be denied in this case, although actual enforcement depends on the volume type and on the readOnly settings described later in this section.
- ReadWriteMany: a PV can be mounted as read-write by many nodes if it has ReadWriteMany in its accessModes spec. This means the PV can perform read and write operations from many nodes.
Not every PV type supports all three access modes. You can check this doc for more details.
You may notice that the PV’s accessModes field is an array, which means it can have multiple access modes. Nevertheless, a PV can only be mounted using one access mode at a time, even if it has multiple access modes in its accessModes field. Therefore, instead of including multiple access modes in a PV, it is recommended to have one access mode per PV and to create separate PVs with different access modes for different use cases.
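As a rough sketch of that recommendation, the two PVs below point at the same NFS export but each carries a single access mode. The PV names, the server address nfs.example.internal and the extra ro mount option are assumptions made up for illustration, not values from the example above.

```yaml
# Two PVs for the same storage, one access mode each.
# All names and the server address are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-data-rw
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany          # for workloads that write to the share
  mountOptions:
    - vers=4.0
  nfs:
    server: nfs.example.internal
    path: "/data"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-data-ro
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany           # for workloads that only read from the share
  mountOptions:
    - vers=4.0
    - ro                     # mount the export read-only on the node
  nfs:
    server: nfs.example.internal
    path: "/data"
    readOnly: true
```

A PVC that asks for ReadOnlyMany can then only match shared-data-ro, and a PVC that asks for ReadWriteMany can only match shared-data-rw.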
You may also notice that there are other attributes which can also affect access modes. Here is a simplified summary:
- The readOnly attribute of a PV type is a storage-side setting. It controls whether the real storage is read-only or not.
- The accessModes of a PV is a PV-side setting. It controls the access mode of the PV.
- The accessModes of a PVC has to match the PV that it wants to bind to. A PV and a PVC build a bridge between the “client” and the real storage: the PV connects to the real storage while the PVC connects to the “client”.
- The readOnly attribute of a VolumeMount is a “client”-side setting. It controls whether the mounted directory is read-only or not.
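To make the four settings in the summary above easier to locate, here is a minimal read-only sketch with each one marked in a comment. The object names, the NFS server and the export path are hypothetical, and the empty storageClassName assumes you want to keep the claim away from any default Storage Class in your cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: docs-pv                      # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadOnlyMany                   # PV side: access mode of the PV
  nfs:
    server: nfs.example.internal     # hypothetical server
    path: "/docs"
    readOnly: true                   # storage side: readOnly of the PV type
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docs-pvc
spec:
  storageClassName: ""               # assumption: skip any default Storage Class
  accessModes:
    - ReadOnlyMany                   # PVC side: must be offered by the PV
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: docs-reader
spec:
  volumes:
    - name: docs
      persistentVolumeClaim:
        claimName: docs-pvc
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: docs
          mountPath: /usr/share/nginx/html
          readOnly: true             # client side: readOnly of the VolumeMount
```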
The persistentVolumeReclaimPolicy field specifies the reclaim policy for a PV. The commonly used values are Retain, which is the default for manually created PVs, and Delete, which is the default for dynamically provisioned PVs (Recycle also exists but is deprecated). You may want to set it to Retain and back up the data if the data in the storage behind the PV is really important.
The example above uses the LabelSelector matchLabels.pv-name == nfs-pv to bind the nfs-pv PV and the nfs-pvc PVC together. You do not need a LabelSelector to bind PVs and PVCs if you want a more flexible way of binding. For example, without a LabelSelector, a PVC that requires storage == 10Gi and accessModes == [ReadWriteOnce] can be bound to a PV with storage >= 10Gi and accessModes == [ReadWriteOnce, ReadWriteMany].
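The selector-free case can be sketched as follows. The names and the NFS server are hypothetical, and setting storageClassName to an empty string on both objects is an assumption intended to keep a default Storage Class from provisioning a new volume instead.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: big-nfs-pv                   # hypothetical name
spec:
  storageClassName: ""               # assumption: no Storage Class involved
  capacity:
    storage: 20Gi                    # larger than what the claim asks for
  accessModes:
    - ReadWriteOnce
    - ReadWriteMany
  nfs:
    server: nfs.example.internal     # hypothetical server
    path: "/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: small-claim                  # hypothetical name
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce                  # contained in the PV's access modes
  resources:
    requests:
      storage: 10Gi                  # satisfied by any PV with >= 10Gi
```

Because the binding is still one-to-one, the claim takes the whole 20Gi PV even though it only requested 10Gi.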
“Dynamic” Persistent Volumes
Dynamic Persistent Volumes are volumes that K8s creates dynamically from the specification in a user’s PVC. Dynamic provisioning is based on Storage Classes: the PVC must specify an existing StorageClass in order to create a dynamic PV.
StorageClass is a Kubernetes object used to describe a storage class. It uses fields such as provisioner, parameters and reclaimPolicy to describe the details of the storage class that it represents. Let’s take a look at GKE’s default storage class, standard. Here is its spec:
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: standard
    parameters:
      type: pd-standard
    provisioner: kubernetes.io/gce-pd
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
- The metadata.name field is the name of the StorageClass. It has to be unique in the whole cluster.
- The parameters field specifies the parameters for the real storage. For example, parameters.type == pd-standard means this storage class uses a standard GCEPersistentDisk as its storage media. You can check this doc for more details about the parameters of Storage Classes.
- The provisioner field specifies which volume plugin is used for provisioning dynamic PVs for the Storage Class. You can check this list for each provisioner’s specification.
- The reclaimPolicy field specifies the reclaim policy for the storage created by the Storage Class, which can be either Delete (default value) or Retain.
- The volumeBindingMode field controls when dynamic provisioning and volume binding happen. volumeBindingMode == Immediate means dynamic provisioning and volume binding are done as soon as the PVC is created, while volumeBindingMode == WaitForFirstConsumer means they are delayed until a Pod actually consumes the PVC.
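For comparison with the standard class, here is a sketch of a custom Storage Class that flips the two optional fields discussed above. The class name is hypothetical. The provisioner and parameters are copied from the standard class.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain-delayed      # hypothetical name, unique in the cluster
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
reclaimPolicy: Retain                # keep the disk after the PVC is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only when a Pod uses the PVC
```

A PVC selects this class by setting storageClassName to the class name.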
A Use Case
This example uses dynamic provisioning to create storage resources for a ZooKeeper service. (The config is simplified here for demo purposes. You can check this doc for more details about how to set up a ZooKeeper service with a StatefulSet in Kubernetes.)
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: zoo-keepr
    # StatefulSet spec
    spec:
      serviceName: zk-hs
      selector:
        matchLabels:
          app: zk
      replicas: 3
      volumeClaimTemplates:
        - metadata:
            name: datadir
          spec:
            storageClassName: standard
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 10Gi
      # Pod spec
      template:
        metadata:
          labels:
            app: zk
        spec:
          containers:
            - name: k8szk
              image: gcr.io/google_samples/k8szk:v3
              ...
              volumeMounts:
                - name: datadir
                  mountPath: /var/lib/zookeeper
In this example, the volumeClaimTemplates field is used for dynamic provisioning. For each Pod, a PVC is created from the specification defined in the volumeClaimTemplates field. Then, for each of these PVCs, the standard Storage Class provisions a 10Gi GCEPersistentDisk and creates a PV that represents it, and the PV is bound to the PVC. Each PV inherits the reclaimPolicy of the Storage Class, which is Delete here.
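To make the per-Pod PVC concrete: a StatefulSet names each generated claim after the template name, the StatefulSet name and the Pod ordinal. For the first Pod of the StatefulSet above, the generated claim should look roughly like the sketch below. Any labels or other metadata that the controller adds are left out.

```yaml
# Approximate PVC generated for Pod zoo-keepr-0.
# zoo-keepr-1 and zoo-keepr-2 get datadir-zoo-keepr-1 and datadir-zoo-keepr-2.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datadir-zoo-keepr-0
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```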
“Updating” PVs & PVCs
Updating Static PVs
Sometimes you need to update some parameters, such as mount options, of a static PV whose storage is not dynamically provisioned. However, updating a PV that is in use may be blocked by K8s. As mentioned above, binding a PV and a PVC is like building a bridge between clients and the real storage. Therefore, instead of updating the existing PV, you can create a new PV with the new settings that points to the same storage, bind a new PVC to it, and switch your Pods over to the new PVC.
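Applied to the NFS example from earlier, this approach could look like the sketch below. The names nfs-pv-v2 and nfs-pvc-v2 and the new mount option values are hypothetical. The server and path are the ones used by nfs-pv.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v2                    # hypothetical name for the replacement PV
  labels:
    pv-name: nfs-pv-v2
spec:
  nfs:
    server: 126.96.36.199            # same NFS server and path as nfs-pv
    path: "/data"
  mountOptions:                      # hypothetical new settings
    - vers=4.1
    - rsize=65536
    - wsize=65536
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-v2                   # hypothetical name for the new claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      pv-name: nfs-pv-v2
```

Pods pick up the new settings once their claimName is changed to nfs-pvc-v2 and they are recreated. The old PV and PVC can be removed afterwards. With the Retain policy, the data on the NFS server should be left untouched.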
Updating Dynamic PVs
A dynamic PV can be expanded (shrinking is not supported) by editing its bound PVC in Kubernetes v1.11 or later. This feature is supported by many built-in volume providers, such as GCE-PD, AWS-EBS and GlusterFS. A cluster administrator can make this feature available to cluster users by setting allowVolumeExpansion == true in the configuration of the Storage Classes. You can check this blog for more details.
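A minimal sketch of both halves of that workflow follows. The Storage Class name and the PVC name are hypothetical, and the PVC is assumed to have been created earlier with a 10Gi request.

```yaml
# Administrator side: a Storage Class that permits expansion.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-expandable          # hypothetical name
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
allowVolumeExpansion: true
---
# User side: the existing PVC, edited to request more space.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim                   # hypothetical name
spec:
  storageClassName: standard-expandable
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi                  # raised from 10Gi; lowering it is rejected
```

Depending on the volume type and the Kubernetes version, the file system may only grow after the Pod that uses the claim is restarted.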