Tags: aws, azure


Overview


Kublr Kubernetes clusters, as well as the Kublr Control Plane, are designed to survive a temporary shutdown of some nodes or of the whole cluster.


After the terminated nodes are restarted or recreated, the cluster recovers fully, including all applications that were running in it.


Note: some applications, especially those that are stateful, may require special pre-shutdown operations. Always test this procedure on non-critical environments first to ensure that custom applications running in the cluster can survive the shutdown and recover.


Some use cases for cluster shutdown include:

  1. shutting down cloud clusters during non-active time periods to reduce cost
  2. demo environments


AWS


Every node in a Kublr AWS cluster runs in an auto-scaling group (ASG), even when that group has a size of 1.


Thus, to temporarily shut down a Kublr Kubernetes cluster on AWS, it is only necessary to set the minimum and desired number of nodes to 0 in every ASG of the cluster.


This can be done either by updating the ASGs in the AWS console or by using the following AWS CLI shell scripts:


# set environment variables to point at the cluster you want to shut down

REGION=us-east-1
CLUSTER=mycluster

# print all ASG names that the script will stop

aws autoscaling describe-auto-scaling-groups --region "${REGION}" --output table \
  --query 'AutoScalingGroups[?contains(Tags[?Key==`KubernetesCluster`].Value,
           `'${CLUSTER}'`)].AutoScalingGroupName'

# shut down the cluster

aws autoscaling describe-auto-scaling-groups --region "${REGION}" --output text \
  --query 'AutoScalingGroups[?contains(Tags[?Key==`KubernetesCluster`].Value,
           `'${CLUSTER}'`)].[ join(`" "`, [`"aws --region '${REGION}' autoscaling"`,
           `"update-auto-scaling-group --auto-scaling-group-name"`,
           AutoScalingGroupName,`"--min-size 0 --desired-capacity 0"`])]' \
  | bash
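
The shutdown step above generates one update-auto-scaling-group command per ASG and pipes the result to bash. As a minimal sketch of the same step written as an explicit loop, which may be easier to review before running, the helper below can be used instead. The function name shutdown_cluster_asgs is hypothetical (it is not part of Kublr or the AWS CLI), and the sketch assumes that ASG names contain no whitespace.

```shell
# Hypothetical helper: scale every ASG tagged with the given cluster name to zero.
# Usage: shutdown_cluster_asgs REGION CLUSTER
shutdown_cluster_asgs() {
  local region="$1" cluster="$2" asg
  # List the names of all ASGs whose KubernetesCluster tag matches the cluster name.
  # The unquoted $(...) relies on word splitting, so names must not contain whitespace.
  for asg in $(aws autoscaling describe-auto-scaling-groups --region "${region}" --output text \
      --query 'AutoScalingGroups[?contains(Tags[?Key==`KubernetesCluster`].Value, `'"${cluster}"'`)].AutoScalingGroupName'); do
    echo "Scaling ${asg} to 0"
    aws autoscaling update-auto-scaling-group --region "${region}" \
      --auto-scaling-group-name "${asg}" --min-size 0 --desired-capacity 0
  done
}
```

It would be called as: shutdown_cluster_asgs "${REGION}" "${CLUSTER}"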


To start the cluster back up you can either

1. go to the AWS console and set the ASG size back to normal manually, or

2. use the Kublr UI and update the cluster without making any changes, or

3. use the following AWS CLI shell script to set the ASG desired size to the max size value:


# set environment variables to point at the cluster you want to start

REGION=us-east-1
CLUSTER=mycluster

# start the cluster up

aws autoscaling describe-auto-scaling-groups --region "${REGION}" --output text \
  --query 'AutoScalingGroups[?contains(Tags[?Key==`KubernetesCluster`].Value,
           `'${CLUSTER}'`)].[ join(`" "`, [`"aws --region '${REGION}' autoscaling"`,
           `"set-desired-capacity --auto-scaling-group-name"`,
           AutoScalingGroupName,`"--desired-capacity"`, to_string(MaxSize)])]' \
  | bash
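
Note that the shutdown script sets the ASG minimum size to 0, while the start script above only changes the desired capacity. To see what each ASG currently looks like after either step, a sketch along the following lines may help. The function name show_cluster_asg_sizes is hypothetical; the fields it prints (MinSize, DesiredCapacity, MaxSize, Instances) are those returned by describe-auto-scaling-groups.

```shell
# Hypothetical helper: print name, min, desired, max and instance count for each cluster ASG.
# Usage: show_cluster_asg_sizes REGION CLUSTER
show_cluster_asg_sizes() {
  local region="$1" cluster="$2"
  # Same tag filter as the scripts above; length(Instances) counts the instances in the group.
  aws autoscaling describe-auto-scaling-groups --region "${region}" --output table \
    --query 'AutoScalingGroups[?contains(Tags[?Key==`KubernetesCluster`].Value, `'"${cluster}"'`)].[AutoScalingGroupName, MinSize, DesiredCapacity, MaxSize, length(Instances)]'
}
```

It would be called as: show_cluster_asg_sizes "${REGION}" "${CLUSTER}"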


Azure


Every node in a Kublr Azure cluster is either an individual VM or a member of a VM Scale Set (VMSS).


Thus, to temporarily shut down a Kublr Kubernetes cluster on Azure, it is only necessary to stop (deallocate) all of the cluster's VMs and VMSSs.


This can be done either by stopping the VMs and VMSSs in the Azure portal or by using the following Azure CLI shell scripts:


# set environment variables to point at the cluster you want to shut down

SUBSCRIPTION=01234567-89ab-cdef-0123-456789abcdef
RESOURCE_GROUP=mycluster
CLUSTER=mycluster

# Print all VMs and VMSSs that will be stopped

az vm list --subscription "${SUBSCRIPTION}" --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].name' --output tsv

az vmss list --subscription "${SUBSCRIPTION}" --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].name' --output tsv

# Stop all VMs and VMSSs of the cluster

az vm deallocate --ids $(az vm list --subscription "${SUBSCRIPTION}" \
  --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].id' --output tsv)

for n in $(az vmss list --subscription "${SUBSCRIPTION}" \
  --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].name' --output tsv) ; do
  az vmss deallocate --subscription "${SUBSCRIPTION}" \
    --resource-group "${RESOURCE_GROUP}" --name "${n}"
done
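
To confirm that the individual VMs have actually been deallocated, their power state can be listed. The following is a minimal sketch; the function name show_cluster_vm_states is hypothetical, and it covers only individual VMs, not VMSS instances.

```shell
# Hypothetical helper: list each cluster VM together with its current power state.
# Usage: show_cluster_vm_states SUBSCRIPTION RESOURCE_GROUP CLUSTER
show_cluster_vm_states() {
  local subscription="$1" group="$2" cluster="$3"
  # --show-details adds instance-view fields such as powerState to the listing.
  az vm list --subscription "${subscription}" --resource-group "${group}" \
    --show-details --output table \
    --query '[?starts_with(@.name, `'"${cluster}"'`)].{name: name, state: powerState}'
}
```

It would be called as: show_cluster_vm_states "${SUBSCRIPTION}" "${RESOURCE_GROUP}" "${CLUSTER}"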


To start the cluster back up you can either

1. go to the Azure portal and start the VMs and VMSSs manually, or

2. use the following Azure CLI shell script to start the VMs and VMSSs:


# set environment variables to point at the cluster you want to start

SUBSCRIPTION=01234567-89ab-cdef-0123-456789abcdef
RESOURCE_GROUP=mycluster
CLUSTER=mycluster

# Start all VMs and VMSSs of the cluster

az vm start --ids $(az vm list --subscription "${SUBSCRIPTION}" \
  --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].id' --output tsv)

for n in $(az vmss list --subscription "${SUBSCRIPTION}" \
  --resource-group "${RESOURCE_GROUP}" \
  --query '[?starts_with(@.name, `'"${CLUSTER}"'`)].name' --output tsv) ; do
  az vmss start --subscription "${SUBSCRIPTION}" \
    --resource-group "${RESOURCE_GROUP}" --name "${n}"
done


To stop the VMs and VMSSs without deallocating them, replace "deallocate" with "stop" in the shutdown script above. VMs that are stopped but not deallocated continue to be billed.
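
Since the deallocate, stop and start scripts differ only in the az subcommand, they can be folded into one parameterized helper. The following is a sketch under stated assumptions: the function name cluster_power_action is hypothetical, and VM IDs and VMSS names are assumed to contain no whitespace. Unlike the scripts above, it skips the VM call when no matching VMs are found.

```shell
# Hypothetical helper: apply one power action to every VM and VMSS of a cluster.
# Usage: cluster_power_action ACTION SUBSCRIPTION RESOURCE_GROUP CLUSTER
# ACTION is one of: deallocate, stop, start
cluster_power_action() {
  local action="$1" subscription="$2" group="$3" cluster="$4" ids n
  case "${action}" in
    deallocate|stop|start) ;;
    *) echo "unsupported action: ${action}" >&2; return 1 ;;
  esac

  # Individual VMs: collect the IDs first and skip the call when there are none.
  ids=$(az vm list --subscription "${subscription}" --resource-group "${group}" \
    --query '[?starts_with(@.name, `'"${cluster}"'`)].id' --output tsv)
  if [ -n "${ids}" ]; then
    # ${ids} is left unquoted on purpose so that each ID becomes its own argument.
    az vm "${action}" --ids ${ids}
  fi

  # Scale sets: one call per VMSS name.
  for n in $(az vmss list --subscription "${subscription}" --resource-group "${group}" \
      --query '[?starts_with(@.name, `'"${cluster}"'`)].name' --output tsv); do
    az vmss "${action}" --subscription "${subscription}" \
      --resource-group "${group}" --name "${n}"
  done
}
```

For example, cluster_power_action stop "${SUBSCRIPTION}" "${RESOURCE_GROUP}" "${CLUSTER}" stops the machines without deallocating them, and the same call with start brings them back up.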