[Supported in Kublr 1.20.0 and 1.20.1]

[Do not use in Kublr 1.20.2 and later]

Kublr 1.20.2+: do not use this solution with Kublr KCP 1.20.2 or later, because those versions create a NAT gateway by default.



Time Synchronization Issue in Azure Clusters in Kublr 1.20.0 and 1.20.1


Kublr 1.20.0 and 1.20.1 migrated to the Standard SKU for load balancers by default. Standard load balancers in Azure provide more features and a better SLA than Basic load balancers, with one drawback: unlike Basic LB, Standard LB disables default SNAT for non-TCP traffic. This may cause various issues in clusters, the most noticeable of which is non-functional NTP time synchronization, because NTP runs over UDP. Time desynchronization may in turn lead to a multitude of Kublr, Kubernetes, and application-level issues.


Kublr 1.20.2 fixes this issue by deploying a NAT Gateway by default in the clusters.


If it is necessary to stay on Kublr 1.20.0 or 1.20.1, this issue may be fixed by manually configuring a Public IP and a NAT Gateway as custom Azure resources in the Kublr cluster specification as follows:

spec:
  ...
  locations:
    - name: azure1
      azure:
        armTemplateResourcesExtra:
          - type: Microsoft.Network/publicIPAddresses 
            apiVersion: '2020-05-01'
            name: NAT-GW-PublicIP
            comments: This is the Public IP address for the cluster NAT Gateway
            location: '[parameters(''region'')]'
            properties:
              idleTimeoutInMinutes: 10
              publicIPAddressVersion: IPv4
              publicIPAllocationMethod: Static
            sku:
              name: Standard

          - type: Microsoft.Network/natGateways 
            apiVersion: '2020-05-01'
            name: NAT-GW
            dependsOn:
              - '[resourceId(''Microsoft.Network/publicIPAddresses'', string(''NAT-GW-PublicIP''))]'
            location: '[parameters(''region'')]'
            properties:
              idleTimeoutInMinutes: 5
              publicIpAddresses:
                - id: '[resourceId(''Microsoft.Network/publicIPAddresses'', string(''NAT-GW-PublicIP''))]'
            sku:
              name: Standard

        armTemplateExtras:
          virtualNetwork:
            dependsOn:
              - '[resourceId(''Microsoft.Network/natGateways'', string(''NAT-GW''))]'
          subnet:
            properties:
              natGateway:
                id: '[resourceId(''Microsoft.Network/natGateways'', string(''NAT-GW''))]'


Upgrading to Kublr 1.20.2+


Before upgrading to Kublr 1.20.2+, remove the NAT Gateway and NAT Gateway Public IP resources from the cluster spec and update the cluster.
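
As a rough sketch, the location section of the spec might look as follows once the workaround is removed. This assumes that the NAT-GW-PublicIP and NAT-GW entries were the only custom ARM resources in the spec. It also assumes that the virtualNetwork and subnet references to NAT-GW under armTemplateExtras have to be removed along with them, since they would otherwise point at a resource that no longer exists. The location name azure1 is taken from the example above; any unrelated settings in your own spec stay unchanged.

```yaml
spec:
  locations:
    - name: azure1
      azure:
        # Assumption: no other custom ARM resources were defined here.
        # Removed: armTemplateResourcesExtra entries NAT-GW-PublicIP and NAT-GW.
        # Removed: armTemplateExtras.virtualNetwork.dependsOn reference to NAT-GW.
        # Removed: armTemplateExtras.subnet.properties.natGateway reference to NAT-GW.
        # Any other settings under "azure" stay as they were.
```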

After this, verify that the resources were removed from the Azure resource group and, if necessary, remove them manually.

The KCP cluster can then be updated in KCP 1.20.2+, which provisions a NAT Gateway automatically.