[Supported in Kublr 1.20.0 and 1.20.1]
[Do not use in Kublr 1.20.2 and later]
Kublr 1.20.2+: do not use this solution with Kublr KCP 1.20.2 and later, because Kublr 1.20.2 creates a NAT Gateway by default.
Time Synchronization Issue in Azure Clusters in Kublr 1.20.0 and 1.20.1
Kublr 1.20.0 and 1.20.1 migrated to using the Standard SKU for load balancers by default. Standard LB in Azure provides more features and a better SLA than Basic LB, with one drawback: unlike Basic LB, Standard LB disables default SNAT for non-TCP traffic. This may cause various issues in the clusters, the most noticeable of which is non-functional NTP time synchronization, because NTP runs over UDP. Time desynchronization may in turn lead to a multitude of Kublr, Kubernetes, and application-level issues.
Kublr 1.20.2 fixes this issue by deploying a NAT Gateway by default in the clusters.
If it is necessary to stay on Kublr 1.20.0 or 1.20.1, this issue may be fixed by manually configuring a Public IP and a NAT Gateway as custom Azure resources in the Kublr cluster specification as follows:
spec:
  ...
  locations:
    - name: azure1
      azure:
        armTemplateResourcesExtra:
          - type: Microsoft.Network/publicIPAddresses
            apiVersion: '2020-05-01'
            name: NAT-GW-PublicIP
            comments: This is the Public IP address for cluster NAT GateWay
            location: '[parameters(''region'')]'
            properties:
              idleTimeoutInMinutes: 10
              publicIPAddressVersion: IPv4
              publicIPAllocationMethod: Static
            sku:
              name: Standard
          - type: Microsoft.Network/natGateways
            apiVersion: '2020-05-01'
            name: NAT-GW
            dependsOn:
              - '[resourceId(''Microsoft.Network/publicIPAddresses'', string(''NAT-GW-PublicIP''))]'
            location: '[parameters(''region'')]'
            properties:
              idleTimeoutInMinutes: 5
              publicIpAddresses:
                - id: '[resourceId(''Microsoft.Network/publicIPAddresses'', string(''NAT-GW-PublicIP''))]'
            sku:
              name: Standard
        armTemplateExtras:
          virtualNetwork:
            dependsOn:
              - '[resourceId(''Microsoft.Network/natGateways'', string(''NAT-GW''))]'
          subnet:
            properties:
              natGateway:
                id: '[resourceId(''Microsoft.Network/natGateways'', string(''NAT-GW''))]'
Upgrading to Kublr 1.20.2+
Before upgrading to Kublr 1.20.2+, remove the NAT Gateway and NAT Gateway Public IP resources from the cluster spec and update the cluster.
After the update completes, verify that the resources were removed from the Azure resource group, and remove them manually if necessary.
The cluster can then be updated in KCP 1.20.2+, which provisions a NAT Gateway automatically.
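As an illustration of the removal step, the fragment below sketches how the location section might look once the manual NAT Gateway configuration has been taken out. It is a fragment rather than a complete specification, and it assumes that the cluster spec contains no custom ARM resources other than those added by the workaround above. If other entries exist under `armTemplateResourcesExtra` or `armTemplateExtras`, they should stay in place.

```yaml
spec:
  # Other cluster settings stay unchanged.
  locations:
    - name: azure1
      azure:
        # Existing settings for this location stay unchanged.
        #
        # Removed from armTemplateResourcesExtra:
        #   - the Microsoft.Network/publicIPAddresses resource NAT-GW-PublicIP
        #   - the Microsoft.Network/natGateways resource NAT-GW
        #
        # Removed from armTemplateExtras:
        #   - virtualNetwork.dependsOn (reference to NAT-GW)
        #   - subnet.properties.natGateway (reference to NAT-GW)
        #
        # Assumption: no unrelated custom ARM resources were defined, so
        # both keys are omitted entirely in this sketch.
```

The subnet reference is listed along with the resources themselves because it is part of the same workaround, and a subnet that still points at `NAT-GW` would reference a resource that is no longer defined in the spec.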