Tags: aws, security, network




Overview


Global companies often need to build and operate a distributed IT infrastructure that lets geographically dispersed teams and IT components stay connected and operate as a single entity across a Wide Area Network (WAN) in a secure, reliable, and easy-to-manage way.


Even smaller and more localized businesses can benefit from IT infrastructure distributed across different regions, clouds, and data centers for a multitude of reasons - from reliability, high availability, backup, and disaster recovery to security, latency, and performance optimization.


Cloud providers have offered technical solutions enabling such distributed network architectures for a long time, and AWS is no exception, with services ranging from AWS VPC Peering to AWS Transit Gateway.


AWS Transit Gateway is arguably one of the most flexible tools for building an enterprise WAN.
AWS Transit Gateway is essentially a software-defined distributed routing solution that allows you to securely and reliably connect Amazon VPCs, Site-to-Site VPN connections, and Direct Connect Gateways within a single region as well as across regions (with the recently introduced inter-region peering).


This article describes how Kublr and AWS Transit Gateway can be used to set up a distributed environment in which Kublr provisions and manages Kubernetes clusters that are automatically connected to a global private WAN implemented via AWS Transit Gateway.


Initial setup


As an example for this article, we assume that the company is building a WAN architecture with AWS Transit Gateway using centralized outbound routing to the internet.


We assume that the enterprise IT team manages the global WAN architecture and components and has already provided the following elements of an enterprise global WAN:

  1. AWS Transit Gateway, or multiple Transit Gateways in different regions with gateway peering established for inter-region connectivity
  2. Subnet address ranges for various purposes, from which address ranges are assigned to individual VPCs. For example, the enterprise IT team may pick 172.16.0.0/12 as their main private CIDR; then assign the CIDRs 172.16.0.0/16 and 172.18.0.0/16 to VPCs in regions us-east-1 and us-west-2 respectively; and the CIDR 172.19.0.0/16 for use by the east coast data center.
  3. Outbound internet routing VPC in each region used to connect the WAN to the public network (if needed)
  4. Direct Connect Gateways or inbound VPN servers used to connect to the private WAN.


See Appendix I for instructions on configuring such a test environment, which can be used to test Kublr deploying and managing Kubernetes clusters connected to the enterprise WAN.


The following sections provide guidance on using Kublr with such a private WAN.


Create Kublr Kubernetes Cluster in the private WAN


Kublr always creates an AWS Kubernetes cluster in a VPC, but there are two options for the VPC ownership.

The cluster VPC can be created by Kublr together with the subnets, routing tables and other AWS resources that are required.

Alternatively, Kublr can create a cluster using an existing VPC and network infrastructure.

These two approaches may be used together or interchangeably depending on how ownership of different elements of infrastructure is distributed in the organization.


The following sections describe both of these options.


Option 1: Deploy cluster in an existing VPC


A. Create a VPC and subnets for the cluster connected to the TGW WAN


1. Create a VPC for the cluster using AWS CLI or Console


Name: main-cluster1-vpc

IPv4 CIDR block: 172.16.0.0/16
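

If you prefer the AWS CLI over the Console, a minimal sketch of this step could look as follows (values in angle brackets are placeholders for IDs from your environment):


# create the cluster VPC and tag it with its name
aws ec2 create-vpc --cidr-block 172.16.0.0/16 \
    --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=main-cluster1-vpc}]'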


2. Create subnet(s):


Name: main-cluster1-subnet-a

Availability Zone: us-east-1a

IPv4 CIDR block: 172.16.0.0/20

Tag: "kubernetes.io/role/internal-elb": "1"


Name: main-cluster1-subnet-c

Availability Zone: us-east-1c

IPv4 CIDR block: 172.16.32.0/20

Tag: "kubernetes.io/role/internal-elb": "1"


3. Create a TGW Attachment


Name: main-cluster1-tgw-attachment

Transit Gateway ID: main-tgw

Attachment Type: VPC

DNS Support: on

IPv6 Support: off

VPC ID: main-cluster1-vpc

Subnet IDs: main-cluster1-subnet-a, main-cluster1-subnet-c
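

A rough AWS CLI equivalent (transit gateway, VPC, and subnet IDs are placeholders):


# attach the cluster VPC to the transit gateway via its two subnets
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <main-tgw-id> \
    --vpc-id <main-cluster1-vpc-id> \
    --subnet-ids <main-cluster1-subnet-a-id> <main-cluster1-subnet-c-id> \
    --options DnsSupport=enable,Ipv6Support=disable \
    --tag-specifications 'ResourceType=transit-gateway-attachment,Tags=[{Key=Name,Value=main-cluster1-tgw-attachment}]'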


4. Add TGW route to the default routing table for main-cluster1-vpc


0.0.0.0/0 -> main-cluster1-tgw-attachment
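

With the AWS CLI this route can be added roughly as follows; note that the CLI targets the transit gateway itself rather than the attachment (the route table ID placeholder is the ID of the VPC's default route table):


# look up the main route table of the cluster VPC
aws ec2 describe-route-tables --filters Name=vpc-id,Values=<main-cluster1-vpc-id>

# send all non-local traffic to the transit gateway
aws ec2 create-route --route-table-id <main-cluster1-vpc-main-rt-id> \
    --destination-cidr-block 0.0.0.0/0 \
    --transit-gateway-id <main-tgw-id>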


B. Create Kublr Kubernetes cluster in main-cluster1-vpc


Note the following elements of the cluster specification:

  • spec.locations[0].aws.vpcId - must be set to the ID of main-cluster1-vpc VPC
  • spec.locations[0].aws.vpcCidrBlock - must be set to the same IPv4 range as the main-cluster1-vpc VPC
  • spec.master.locations[0].aws.availabilityZones and spec.nodes[*].locations[0].aws.availabilityZones - must be set according to availability zones in which corresponding instance groups should create instances
  • spec.master.locations[0].aws.subnetIds and spec.nodes[*].locations[0].aws.subnetIds - must be set to the IDs of the subnets in the corresponding availability zones in which the instance groups should create the instances

Other parameters specified in the cluster specification snippet make sure that the EC2 instances are correctly configured to run in the private WAN environment - disabling elastic IP and public IP assignment, and setting master and ingress load balancer types to internal NLB.


spec:

  locations:
    - aws:
        vpcId: <main-cluster1-vpc-id>
        vpcCidrBlock: 172.16.0.0/16 # the cluster VPC IPv4 CIDR
        skipPublicSubnetsForPrivateGroups: true
        natMode: none
        skipInternetGateway: true

  # master node groups configurations
  master:
    locations:
      - aws:
          availabilityZones:
            - us-east-1a
          subnetIds:
            - <subnet-id-az-us-east-1a>
          masterNlbAllocationPolicy: private
          eipAllocationPolicy: none
          nodeIpAllocationPolicy: private

  # worker node groups configurations
  nodes:
    - name: group1
      locations:
        - aws:
            availabilityZones:
              - us-east-1a
            subnetIds:
              - <subnet-id-az-us-east-1a>
            eipAllocationPolicy: none
            nodeIpAllocationPolicy: private

  # if ingress controller is enabled, the following settings
  # change the ingress controller load balancer type to internal NLB,
  # which makes it available inside the WAN
  features:
    ingress:
      values:
        nginx-ingress:
          controller:
            service:
              annotations:
                service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
                service.beta.kubernetes.io/aws-load-balancer-internal: "true"


Option 2: Deploy cluster in Kublr-created VPC


With this option, Kublr creates the VPC and other resources automatically and attaches the VPC to the TGW.

No manually created AWS resources are needed with this option, but the cluster specification is a bit more complex due to the additional AWS resources it includes.


Note the following elements of the cluster specification:

  • spec.locations[0].aws.vpcCidrBlock - must be set to the IPv4 CIDR assigned to this cluster within the private WAN
  • spec.master.locations[0].aws.availabilityZones and spec.nodes[*].locations[0].aws.availabilityZones - must be set according to availability zones in which corresponding instance groups should create instances
  • spec.locations[0].aws.resourcesCloudFormationExtras.*.TransitGatewayId - must be set to the ID of the private WAN transit gateway
  • spec.locations[0].aws.resourcesCloudFormationExtras.TransitGatewayVpcAttachment.Properties.SubnetIds - must be set to the list of references to the subnets created by Kublr for master and/or worker nodes; one subnet should be selected for each availability zone where the transit gateway attachment should have endpoints. Refer to the Kublr reference documentation on auto-generated subnets for more information on the Kublr cluster AWS network architecture and subnet naming.
  • spec.locations[0].aws.resourcesCloudFormationExtras.SubnetRTAssoc* - one route table association must be created for each subnet generated by Kublr in this cluster.
    One subnet will be created for each AZ used by the master instance group, and one for each AZ used by all worker groups combined.
    Refer to the Kublr reference documentation on auto-generated subnets for more information on the Kublr cluster AWS network architecture and subnet naming.

Other parameters specified in the cluster specification snippet make sure that the EC2 instances are correctly configured to run in the private WAN environment - disabling elastic IP and public IP assignment, and setting master and ingress load balancer types to internal NLB.


spec:

  locations:
    - aws:
        vpcCidrBlock: 172.16.0.0/16 # the cluster VPC IPv4 CIDR
        skipPublicSubnetsForPrivateGroups: true
        natMode: none
        skipInternetGateway: true

        # this section includes additional AWS resources connecting the cluster VPC to the TGW WAN
        resourcesCloudFormationExtras:

          # VPC-TGW attachment
          TransitGatewayVpcAttachment:
            Type: AWS::EC2::TransitGatewayVpcAttachment
            Properties:
              TransitGatewayId: <main-tgw-id> # ID of the TGW
              VpcId: { Ref: NewVpc }
              SubnetIds: 
                - { Ref: SubnetMasterPrivate0 }

          # default route from the cluster VPC to the TGW
          TransitGatewayRoute:
            Type: AWS::EC2::Route
            Properties:
              DestinationCidrBlock: 0.0.0.0/0
              RouteTableId: { Ref: RouteTablePublic }
              TransitGatewayId: <main-tgw-id> # ID of the TGW
            DependsOn: TransitGatewayVpcAttachment

          # For each master AZ
          SubnetRTAssocMasterPrivate0:
            Type: AWS::EC2::SubnetRouteTableAssociation
            Properties:
              RouteTableId: { Ref: RouteTablePublic }
              SubnetId: { Ref: SubnetMasterPrivate0 }

          # For each worker AZ
          SubnetRTAssocNodePrivate0:
            Type: AWS::EC2::SubnetRouteTableAssociation
            Properties:
              RouteTableId: { Ref: RouteTablePublic }
              SubnetId: { Ref: SubnetNodePrivate0 }

  # master node groups configurations
  master:
    locations:
      - aws:
          availabilityZones:
            - us-east-1a
          masterNlbAllocationPolicy: private
          eipAllocationPolicy: none
          nodeIpAllocationPolicy: private

  # worker node groups configurations
  nodes:
    - name: group1
      locations:
        - aws:
            availabilityZones:
              - us-east-1a
            eipAllocationPolicy: none
            nodeIpAllocationPolicy: private

  # if ingress controller is enabled, the following settings
  # change the ingress controller load balancer type to internal NLB,
  # which makes it available inside the WAN
  features:
    ingress:
      values:
        nginx-ingress:
          controller:
            service:
              annotations:
                service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
                service.beta.kubernetes.io/aws-load-balancer-internal: "true"





Appendix I: Private WAN with Transit Gateway and OpenVPN


This section describes how to set up a test/PoC environment with a Transit Gateway connecting private VPCs and routing all outbound traffic through one dedicated VPC, enabling single-point protection with, for example, AWS Firewall.


We will configure the following architecture:


TODO: diagram


1. Set up Transit Gateway and dedicated edge VPC


A. Create an AWS Transit Gateway (TGW) via AWS CLI or Console


You can select any name for the TGW; we will use main-tgw in this example.

If you are creating the TGW via console, leave all other settings with their default values: no ASN, enabled DNS and VPN ECMP support, enabled default route table association and propagation, and disabled multicast support.
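

With the AWS CLI, the same TGW with default options could be created roughly as follows:


# the default options already enable DNS and VPN ECMP support, default route table
# association and propagation, and leave multicast support disabled
aws ec2 create-transit-gateway \
    --tag-specifications 'ResourceType=transit-gateway,Tags=[{Key=Name,Value=main-tgw}]'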


B. Create a dedicated edge VPC


Any name can be used; we will use main-edge-vpc here.

IPv4 CIDR block: 172.20.0.0/20
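

A rough AWS CLI equivalent:


# create the dedicated edge VPC
aws ec2 create-vpc --cidr-block 172.20.0.0/20 \
    --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=main-edge-vpc}]'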


B.1. Create a public subnet in the edge VPC


Name: main-edge-subnet-public-a

Availability zone: us-east-1a

IPv4 CIDR block: 172.20.0.0/24


B.2. Create a private subnet in the edge VPC


Name: main-edge-subnet-private-a

Availability zone: us-east-1a

IPv4 CIDR block: 172.20.8.0/24
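

Both edge subnets (B.1 and B.2) can be created with the AWS CLI roughly as follows (the VPC ID is a placeholder for the value returned when main-edge-vpc was created):


# public subnet for the internet gateway, NAT gateway, and VPN server
aws ec2 create-subnet --vpc-id <main-edge-vpc-id> --availability-zone us-east-1a \
    --cidr-block 172.20.0.0/24 \
    --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=main-edge-subnet-public-a}]'

# private subnet for the transit gateway attachment
aws ec2 create-subnet --vpc-id <main-edge-vpc-id> --availability-zone us-east-1a \
    --cidr-block 172.20.8.0/24 \
    --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=main-edge-subnet-private-a}]'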


B.3. Create a new Internet Gateway for the edge VPC and attach it


Name: main-edge-igw


Attach the Internet Gateway to main-edge-vpc VPC.
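

A rough AWS CLI equivalent (the IGW and VPC IDs are placeholders):


# create the internet gateway and attach it to the edge VPC
aws ec2 create-internet-gateway \
    --tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=main-edge-igw}]'
aws ec2 attach-internet-gateway --internet-gateway-id <main-edge-igw-id> --vpc-id <main-edge-vpc-id>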


B.4. Create a new NAT Gateway for the edge VPC


Name: main-edge-natgw-a

Subnet: main-edge-subnet-public-a

Elastic IP allocation ID: allocate a new Elastic IP
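

A rough AWS CLI equivalent (the allocation and subnet IDs are placeholders):


# allocate a new Elastic IP and create the NAT gateway in the public subnet
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway --subnet-id <main-edge-subnet-public-a-id> \
    --allocation-id <eip-allocation-id> \
    --tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=main-edge-natgw-a}]'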


B.5. Create a private route table in the edge VPC


Name: main-edge-private-rt-a

Associate with subnet: main-edge-subnet-private-a

Add route: 0.0.0.0/0 -> main-edge-natgw-a
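

A rough AWS CLI equivalent (route table, subnet, and NAT gateway IDs are placeholders):


# private route table with a default route via the NAT gateway
aws ec2 create-route-table --vpc-id <main-edge-vpc-id> \
    --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=main-edge-private-rt-a}]'
aws ec2 associate-route-table --route-table-id <main-edge-private-rt-a-id> \
    --subnet-id <main-edge-subnet-private-a-id>
aws ec2 create-route --route-table-id <main-edge-private-rt-a-id> \
    --destination-cidr-block 0.0.0.0/0 --nat-gateway-id <main-edge-natgw-a-id>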


B.6. Create a public route table in the edge VPC


Name: main-edge-public-rt

Associate with subnet: main-edge-subnet-public-a

Add route: 0.0.0.0/0 -> main-edge-igw
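

And similarly for the public route table (IDs are again placeholders):


# public route table with a default route via the internet gateway
aws ec2 create-route-table --vpc-id <main-edge-vpc-id> \
    --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=main-edge-public-rt}]'
aws ec2 associate-route-table --route-table-id <main-edge-public-rt-id> \
    --subnet-id <main-edge-subnet-public-a-id>
aws ec2 create-route --route-table-id <main-edge-public-rt-id> \
    --destination-cidr-block 0.0.0.0/0 --gateway-id <main-edge-igw-id>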


C. Attach the edge VPC to the transit gateway and configure WAN routing in the edge VPC


C.1. Create a new Transit Gateway Attachment


Name: main-edge-tgw-attachment

Transit Gateway ID: main-tgw

Attachment Type: VPC

DNS Support: on

IPv6 Support: off

VPC ID: main-edge-vpc

Subnet IDs: main-edge-subnet-private-a
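

A rough AWS CLI equivalent (TGW, VPC, and subnet IDs are placeholders):


# attach the edge VPC to the transit gateway via its private subnet
aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <main-tgw-id> \
    --vpc-id <main-edge-vpc-id> \
    --subnet-ids <main-edge-subnet-private-a-id> \
    --options DnsSupport=enable,Ipv6Support=disable \
    --tag-specifications 'ResourceType=transit-gateway-attachment,Tags=[{Key=Name,Value=main-edge-tgw-attachment}]'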


C.2. Configure WAN routing in the edge VPC


1. Add route 172.16.0.0/12 -> main-edge-tgw-attachment in both edge routing tables main-edge-private-rt-a and main-edge-public-rt


2. Add a static route 0.0.0.0/0 -> main-edge-tgw-attachment on the transit gateway main-tgw
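

With the AWS CLI, both routing steps could look roughly as follows; note that routes in VPC route tables target the transit gateway itself, while the TGW static route targets the attachment (all IDs, including the TGW default route table ID, are placeholders):


# 1. route the private WAN CIDR from both edge route tables to the transit gateway
aws ec2 create-route --route-table-id <main-edge-private-rt-a-id> \
    --destination-cidr-block 172.16.0.0/12 --transit-gateway-id <main-tgw-id>
aws ec2 create-route --route-table-id <main-edge-public-rt-id> \
    --destination-cidr-block 172.16.0.0/12 --transit-gateway-id <main-tgw-id>

# 2. default static route on the TGW route table pointing to the edge VPC attachment
aws ec2 create-transit-gateway-route \
    --transit-gateway-route-table-id <main-tgw-default-rt-id> \
    --destination-cidr-block 0.0.0.0/0 \
    --transit-gateway-attachment-id <main-edge-tgw-attachment-id>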


Now you have a WAN transit gateway with an edge VPC configured for public network access.

The transit gateway is ready to accept other VPC attachments.


2. Create OpenVPN server in edge VPC and configure VPN access to WAN


A. Launch a new EC2 instance in the public subnet of the edge VPC


AMI: Amazon Linux 2 AMI (HVM), SSD Volume Type

Instance Type: t3a.nano

Network: main-edge-vpc

Subnet: main-edge-subnet-public-a

Auto-assign Public IP: enable

Name: main-edge-vpn

Security group: open ports 22/tcp, 1194/tcp, and 1194/udp

Key pair: specify the SSH key that you will use to log in to the instance
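

The instance can also be launched with the AWS CLI; this is only a sketch, and it assumes that a security group with the ports above and the key pair already exist (all IDs are placeholders):


# launch the VPN server instance in the public edge subnet with a public IP
aws ec2 run-instances \
    --image-id <amazon-linux-2-ami-id> \
    --instance-type t3a.nano \
    --subnet-id <main-edge-subnet-public-a-id> \
    --security-group-ids <main-edge-vpn-sg-id> \
    --key-name <your-key-pair-name> \
    --associate-public-ip-address \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=main-edge-vpn}]'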




B. Install and configure OpenVPN


SSH into the instance, then install and configure the OpenVPN server as follows:


# change to root

sudo -i

# update and install required packages

amazon-linux-extras install epel
yum update -y
yum install -y openvpn easy-rsa

# generate CA

mkdir ~/easy-rsa
cd ~/easy-rsa/
/usr/share/easy-rsa/3/easyrsa init-pki

# the following command will ask you to enter a password (e.g. "password") and a common name (e.g. "openvpn")
/usr/share/easy-rsa/3/easyrsa build-ca

/usr/share/easy-rsa/3/easyrsa gen-dh

# the following command will ask you to enter a password (e.g. "password") and a common name (e.g. "openvpnserver")
/usr/share/easy-rsa/3/easyrsa gen-req server

# the following command will ask you to confirm ("yes") and enter the password ("password")
/usr/share/easy-rsa/3/easyrsa sign-req server server

# move key/cert files to /etc/openvpn/server
cd
mv ~/easy-rsa /etc/openvpn/server

# remove password from the private key
cd /etc/openvpn/server/easy-rsa/pki/private/
mv server.key server.key.orig
# the following command will ask you to enter the password ("password")
openssl rsa -in server.key.orig -out server.key

Configure OpenVPN:


cd /etc/openvpn/server
openvpn --genkey --secret ta.key
mv ta.key ./easy-rsa/pki/private/

cp /usr/share/doc/openvpn-*/sample/sample-config-files/server.conf /etc/openvpn/server/

# edit OpenVPN server configuration file
vim /etc/openvpn/server/server.conf


Set the following parameters in the configuration file:


ca /etc/openvpn/server/easy-rsa/pki/ca.crt
cert /etc/openvpn/server/easy-rsa/pki/issued/server.crt
key /etc/openvpn/server/easy-rsa/pki/private/server.key
dh /etc/openvpn/server/easy-rsa/pki/dh.pem
tls-auth /etc/openvpn/server/easy-rsa/pki/private/ta.key 0
# configure here routes to all the subnets that you want the server to push to the clients
push "route 172.16.0.0 255.255.0.0"
push "route 172.18.0.0 255.255.0.0"
push "route 172.20.0.0 255.255.0.0"


As a result the server config file will look as follows:


port 1194
proto udp
dev tun
ca /etc/openvpn/server/easy-rsa/pki/ca.crt
cert /etc/openvpn/server/easy-rsa/pki/issued/server.crt
key /etc/openvpn/server/easy-rsa/pki/private/server.key
dh /etc/openvpn/server/easy-rsa/pki/dh.pem
server 10.8.0.0 255.255.255.0
ifconfig-pool-persist ipp.txt
push "route 172.16.0.0 255.255.0.0"
push "route 172.18.0.0 255.255.0.0"
push "route 172.20.0.0 255.255.0.0"
keepalive 10 120
tls-auth /etc/openvpn/server/easy-rsa/pki/private/ta.key 0
cipher AES-256-CBC
persist-key
persist-tun
status openvpn-status.log
verb 3
explicit-exit-notify 1


Now you can start the OpenVPN server:


# start OpenVPN server
# ... from CLI
openvpn /etc/openvpn/server/server.conf
# ... as a service
systemctl start openvpn-server@server


C. Enable IP forwarding and routing for VPN clients to access WAN


1. Enable forwarding on the VPN server:


echo 1 > /proc/sys/net/ipv4/ip_forward
sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward=1" > /etc/sysctl.d/70-openvpn-forwarding.conf


2. Disable source/destination checking on the EC2 instance main-edge-vpn.


3. Add route 10.8.0.0/24 -> main-edge-vpn to both public and private routing tables main-edge-private-rt-a and main-edge-public-rt.


4. Add a static route 10.8.0.0/24 -> main-edge-tgw-attachment on the transit gateway main-tgw
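

Steps 2-4 can be performed with the AWS CLI roughly as follows (instance, route table, and attachment IDs are placeholders):


# 2. disable source/destination checking on the VPN server instance
aws ec2 modify-instance-attribute --instance-id <main-edge-vpn-instance-id> --no-source-dest-check

# 3. route the VPN client subnet to the VPN server instance in both edge route tables
aws ec2 create-route --route-table-id <main-edge-private-rt-a-id> \
    --destination-cidr-block 10.8.0.0/24 --instance-id <main-edge-vpn-instance-id>
aws ec2 create-route --route-table-id <main-edge-public-rt-id> \
    --destination-cidr-block 10.8.0.0/24 --instance-id <main-edge-vpn-instance-id>

# 4. static route for the VPN client subnet on the transit gateway
aws ec2 create-transit-gateway-route \
    --transit-gateway-route-table-id <main-tgw-default-rt-id> \
    --destination-cidr-block 10.8.0.0/24 \
    --transit-gateway-attachment-id <main-edge-tgw-attachment-id>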


D. Configure VPN Client


SSH into the VPN server and generate a client key and certificate:


# generate cert for client on the server

cd /etc/openvpn/server/easy-rsa

# the following command will ask you to enter the password ("password") and a name ("client1")
/usr/share/easy-rsa/3/easyrsa gen-req client1

# the following command will ask you to confirm the certificate parameters ("yes") and enter the password ("password")
/usr/share/easy-rsa/3/easyrsa sign-req client client1

# remove password from the key
cd /etc/openvpn/server/easy-rsa/pki/private/
mv client1.key client1.key.orig
openssl rsa -in client1.key.orig -out client1.key


Copy the following files to the client:

/etc/openvpn/server/easy-rsa/pki/ca.crt

/etc/openvpn/server/easy-rsa/pki/issued/client1.crt
/etc/openvpn/server/easy-rsa/pki/private/client1.key
/etc/openvpn/server/easy-rsa/pki/private/ta.key
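

One way to copy the files, assuming SSH access to the server as ec2-user (the files are owned by root on the server, so they are staged in the ec2-user home directory first):


# on the server: stage the client files in the ec2-user home directory
sudo cp /etc/openvpn/server/easy-rsa/pki/ca.crt \
        /etc/openvpn/server/easy-rsa/pki/issued/client1.crt \
        /etc/openvpn/server/easy-rsa/pki/private/client1.key \
        /etc/openvpn/server/easy-rsa/pki/private/ta.key /home/ec2-user/
sudo chown ec2-user /home/ec2-user/{ca.crt,client1.crt,client1.key,ta.key}

# on the client: download the files into the directory where config.ovpn will be created
scp ec2-user@<main-edge-vpn-public-IP-address>:"ca.crt client1.crt client1.key ta.key" .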

Create an OpenVPN client config file config.ovpn on the client:


# set here the public IP address of the instance main-edge-vpn
remote <main-edge-vpn-public-IP-address> 1194

# adjust mtu if necessary
tun-mtu 1000

dev tun
client
float
pull
proto udp
script-security 2
cipher AES-256-CBC
ca ca.crt
cert client1.crt
key client1.key
remote-cert-tls server
tls-auth ta.key 1


E. Start VPN client and test the connection


Start VPN client:


sudo openvpn --config config.ovpn

Test connection:


ping 10.8.0.1