Troubleshooting
As of September 2024, VMware/Broadcom has deleted the vSphere CSI driver image registry on gcr.io and migrated the images to registry.k8s.io for all CSI driver versions 3.1 and later.
Most importantly, images for CSI driver versions 3.0 and below were abandoned and not migrated to the new registry.
This can disrupt existing clusters that use older versions of the CSI driver. In particular, Kubernetes clusters created by Kublr versions older than 1.29 use the deprecated gcr.io image registry by default.
Users running vSphere clusters created by Kublr versions before 1.29 may see vSphere CSI driver pods failing because the images are missing.
There are two options to mitigate the issue:
- upgrade the Kublr agent to a version used in Kublr 1.29.0 or later (refer to the Kublr release notes to identify the specific agent versions: https://docs.kublr.com/releasenotes/)
- override the vSphere CSI driver images in the cluster specification to use the images backed up in the official Kublr image registry, as shown in the following example.
NB! Make sure that you use the correct image tags by checking the specification of the failing pods, and verify that the images are available in the Kublr registry.
    spec:
      kublrAgentConfig:
        kublr:
          docker_image:
            vsphere_csi_driver: cr.kublr.com/cloud-provider-vsphere/csi/release/driver:v3.0.3
            vsphere_csi_syncer: cr.kublr.com/cloud-provider-vsphere/csi/release/syncer:v3.0.3
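To keep the override tags in line with what the failing pods actually request, the image references copied from the pod specifications can be rewritten mechanically. The following is a minimal sketch, not a Kublr tool: the helper names build_overrides and render_spec are hypothetical, and it assumes the failing pods reference images under gcr.io/cloud-provider-vsphere/csi/release/ and that the Kublr registry mirrors the same repository paths and tags, as the example above suggests. It does not check that the images exist in the Kublr registry; that still has to be verified separately.

```python
# Hypothetical helper: rewrite legacy gcr.io vSphere CSI image references
# to the Kublr registry host, keeping the repository path and tag unchanged.

LEGACY_HOST = "gcr.io"
KUBLR_HOST = "cr.kublr.com"  # registry host taken from the example above

# Repository path -> cluster specification key. Only the two keys shown in
# the example above are covered; any other image is skipped.
SPEC_KEYS = {
    "cloud-provider-vsphere/csi/release/driver": "vsphere_csi_driver",
    "cloud-provider-vsphere/csi/release/syncer": "vsphere_csi_syncer",
}


def build_overrides(images):
    """Return {spec_key: rewritten_image} for legacy vSphere CSI images."""
    overrides = {}
    for image in images:
        host, _, rest = image.partition("/")
        if host != LEGACY_HOST:
            continue  # not served from the deleted registry
        repo, sep, tag = rest.rpartition(":")
        if not sep:
            continue  # no explicit tag: do not guess one
        key = SPEC_KEYS.get(repo)
        if key is None:
            continue  # not one of the two known CSI images
        overrides[key] = f"{KUBLR_HOST}/{repo}:{tag}"
    return overrides


def render_spec(overrides):
    """Render the overrides as a cluster specification fragment."""
    lines = ["spec:", "  kublrAgentConfig:", "    kublr:", "      docker_image:"]
    for key in sorted(overrides):
        lines.append(f"        {key}: {overrides[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Image references as copied from the failing pods' specifications.
    failing = [
        "gcr.io/cloud-provider-vsphere/csi/release/driver:v3.0.3",
        "gcr.io/cloud-provider-vsphere/csi/release/syncer:v3.0.3",
    ]
    print(render_spec(build_overrides(failing)))
```

Images without an explicit tag, or from any other registry, are skipped on purpose, so the resulting fragment only contains tags that were actually read from the pods.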
References
- https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/3053
- https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/3023
- https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/3087
- https://github.com/kubernetes-sigs/vsphere-csi-driver?tab=readme-ov-file#vsphere-csi-driver-releases
- https://docs.kublr.com/releasenotes/