Tags: persistence


Many Kubernetes cloud providers support dynamic volume expansion, which allows increasing the size of volumes when space runs out.


This article describes persistent volume expansion procedures and planning for Kublr components. The same procedures can also be used to expand volumes of other applications.


Dynamic volume expansion support is described in more detail in the Kubernetes documentation.
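Before planning a resize, it is worth checking that the StorageClass backing the volume allows expansion at all. A minimal check, assuming the StorageClass name `default` is only an example:

```shell
# List StorageClasses with their expansion flag; volumes provisioned from a
# class without allowVolumeExpansion=true cannot be resized via the PVC.
kubectl get storageclass \
  -o custom-columns='NAME:.metadata.name,EXPANSION:.allowVolumeExpansion'

# If the flag is unset and the underlying provisioner supports resizing,
# it can be enabled on the class ("default" is an illustrative name):
kubectl patch storageclass default -p '{"allowVolumeExpansion": true}'
```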


Volume Resizing Procedure


1. Identify a persistent volume claim (PVC) that needs to be expanded.

We will use Kublr Elasticsearch data volume as an example:

kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 -o yaml

Note that the PVC status is Bound and no resize conditions are set.
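Instead of reading the full YAML, the current capacity and any in-flight resize conditions can be checked in one line (an empty CONDITIONS column means no resize is in progress):

```shell
# Show only the fields relevant to resizing for the example PVC.
kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 \
  -o custom-columns='NAME:.metadata.name,CAPACITY:.status.capacity.storage,CONDITIONS:.status.conditions[*].type'
```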


2. In many cases an attached block volume cannot be resized while in use, and the pods using the volume must be stopped first (a notable exception is AWS EBS, which supports online resizing).

In this case, scale the corresponding StatefulSet or Deployment down to zero so that Kubernetes can detach and resize the volumes.

kubectl scale -n kublr statefulset/kublr-logging-elasticsearch-data --replicas=0
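Scaling to zero returns immediately, but the volume can only be detached once the pods are actually gone. One way to block until then, assuming the pods carry an `app` label matching the StatefulSet name (adjust the selector and timeout to your environment):

```shell
# Wait for the Elasticsearch data pods to terminate before patching the PVC;
# the label selector and 300s timeout are illustrative assumptions.
kubectl wait -n kublr --for=delete pod \
  -l app=kublr-logging-elasticsearch-data --timeout=300s
```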


3. Patch the PVC to increase its requested size:

kubectl patch pvc -n kublr data-kublr-logging-elasticsearch-data-0 \
  -p '{ "spec": { "resources": { "requests": { "storage": "180Gi" }}}}'

When resizing starts, you will see a corresponding condition in the PVC status:

kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 -o yaml

...
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 151Gi
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-07-02T15:23:56Z"
    status: "True"
    type: Resizing
  phase: Bound

When resizing of the underlying volume is complete, the status changes accordingly:

kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 -o yaml

...
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 151Gi
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2021-07-02T15:35:05Z"
    message: Waiting for user to (re-)start a pod to finish file system resize of
      volume on node.
    status: "True"
    type: FileSystemResizePending
  phase: Bound
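Rather than re-reading the whole YAML, the resize conditions alone can be polled until `FileSystemResizePending` appears:

```shell
# Print only the type and status of each PVC condition, one per line;
# re-run (or wrap in "watch") until FileSystemResizePending shows up.
kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 \
  -o jsonpath='{range .status.conditions[*]}{.type}{": "}{.status}{"\n"}{end}'
```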


4. Now scale the StatefulSet or Deployment back up:

kubectl scale -n kublr statefulset/kublr-logging-elasticsearch-data --replicas=1
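To know when the file system resize can be expected to finish, wait for the StatefulSet to report its pod as ready (the timeout is an arbitrary choice):

```shell
# Block until the scaled-up StatefulSet is back to its desired replica count.
kubectl rollout status -n kublr \
  statefulset/kublr-logging-elasticsearch-data --timeout=300s
```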


5. When the pods have restarted and the volume resize is complete, Kubernetes reflects this in the PVC status by updating the actual size and removing the condition:

kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 -o yaml

...
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 180Gi
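A quick final check is to read just the reported capacity and compare it with the requested size:

```shell
# Should print the new size (180Gi in this example) once the resize is done.
kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 \
  -o jsonpath='{.status.capacity.storage}'
```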


Recovering from failure when expanding volumes


Follow the steps described in the Kubernetes documentation:


If expanding the underlying storage fails, the cluster administrator can manually recover the Persistent Volume Claim (PVC) state and cancel the resize requests. Otherwise, the resize requests are continuously retried by the controller without administrator intervention.
  1. Mark the PersistentVolume (PV) that is bound to the PersistentVolumeClaim (PVC) with the Retain reclaim policy.
  2. Delete the PVC. Since the PV has the Retain reclaim policy, no data is lost when the PVC is recreated.
  3. Delete the claimRef entry from the PV spec so that a new PVC can bind to it. This should make the PV Available.
  4. Re-create the PVC with a smaller size than the PV and set the volumeName field of the PVC to the name of the PV. This should bind the new PVC to the existing PV.
  5. Don't forget to restore the reclaim policy of the PV.
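The recovery steps above can be sketched as kubectl commands. This is a sketch using the example PVC from this article; it assumes the PV's original reclaim policy was Delete, and step 4 (re-creating the PVC from a manifest with an explicit volumeName) is left to the operator:

```shell
# 1. Protect the data: look up the bound PV and switch it to Retain.
PV=$(kubectl get pvc -n kublr data-kublr-logging-elasticsearch-data-0 \
  -o jsonpath='{.spec.volumeName}')
kubectl patch pv "$PV" \
  -p '{"spec": {"persistentVolumeReclaimPolicy": "Retain"}}'

# 2. Delete the PVC; the PV and its data survive because of Retain.
kubectl delete pvc -n kublr data-kublr-logging-elasticsearch-data-0

# 3. Clear the claimRef so the PV becomes Available for a new claim.
kubectl patch pv "$PV" --type=json \
  -p '[{"op": "remove", "path": "/spec/claimRef"}]'

# 4. Re-create the PVC from a manifest with the original (smaller) size and
#    spec.volumeName set to "$PV", then...

# 5. ...restore the PV's reclaim policy (Delete is assumed here).
kubectl patch pv "$PV" \
  -p '{"spec": {"persistentVolumeReclaimPolicy": "Delete"}}'
```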