Tags: aws, security, network
TABLE OF CONTENTS
- Overview
- Initial setup
- Create Kublr Kubernetes Cluster in the private WAN
- Reference and additional information
- Appendix I: Private WAN with Transit Gateway and OpenVPN
Overview
It is often a requirement for a global company to build and operate a distributed IT infrastructure to enable geographically distributed teams and IT components to stay connected and operate as a single entity across a Wide Area Network (WAN) in a secure, reliable and easy to manage way.
Even smaller and more localized businesses can benefit from IT infrastructure distributed across different regions, clouds and data centers for a multitude of reasons - from reliability, high availability, backup and disaster recovery, to security, and latency and performance optimization.
Cloud providers have offered technical solutions enabling such distributed network architectures for a long time, and AWS is no exception, with services ranging from AWS VPC Peering to AWS Transit Gateway.
AWS Transit Gateway is arguably one of the most flexible tools for building an enterprise WAN.
AWS Transit Gateway is essentially a software defined distributed routing solution that allows securely and reliably connecting Amazon VPCs, Site-to-Site VPN connections, and Direct Connect Gateways within a single region as well as across regions (with the recently introduced inter-region peering).
This article describes how Kublr and AWS Transit Gateway can be used to set up a distributed environment in which Kublr provisions and manages Kubernetes clusters that are automatically connected to a global private WAN implemented via AWS Transit Gateway.
Initial setup
For this article, as an example, we assume that the company is building a WAN architecture with AWS Transit Gateway using the centralized outbound routing to the internet approach.
We assume that the enterprise IT team manages the global WAN architecture and components and has already provided the following elements of an enterprise global WAN:
- AWS Transit Gateway, or multiple Transit Gateways in different regions with gateway peering established for inter-region connectivity
- Subnet address ranges for various purposes, from which address ranges are assigned for individual VPCs. For example, the enterprise IT team may pick 172.16.0.0/12 as their main private CIDR; then assign CIDRs 172.16.0.0/16 and 172.18.0.0/16 for VPCs in regions us-east-1 and us-west-2 respectively; and the CIDR 172.19.0.0/16 for the east coast data center.
- Outbound internet routing VPC in each region used to connect the WAN to the public network (if needed)
- Direct Connect Gateways or inbound VPN servers used to connect to the private WAN.
See Appendix I for instructions on configuring a test environment of this kind, which can be used to verify Kublr deploying and managing Kubernetes clusters connected to the enterprise WAN.
The following sections provide guidance on using Kublr with such a private WAN.
Create Kublr Kubernetes Cluster in the private WAN
Kublr always creates an AWS Kubernetes cluster in a VPC, but there are two options for the VPC ownership.
The cluster VPC can be created by Kublr together with the subnets, routing tables and other AWS resources that are required.
Alternatively, Kublr can create a cluster using an existing VPC and network infrastructure.
These two approaches may be used together or interchangeably depending on how ownership of different elements of infrastructure is distributed in the organization.
The following sections describe both of these options.
Option 1: Deploy cluster in an existing VPC
A. Create a VPC and subnets for the cluster connected to the TGW WAN
1. Create a VPC for the cluster using AWS CLI or Console
Name: main-cluster1-vpc
IPv4 CIDR block: 172.16.0.0/16
2. Create subnet(s):
Name: main-cluster1-subnet-a
Availability Zone: us-east-1a
IPv4 CIDR block: 172.16.0.0/20
Tag: "kubernetes.io/role/internal-elb": "1"
Name: main-cluster1-subnet-c
Availability Zone: us-east-1c
IPv4 CIDR block: 172.16.32.0/20
Tag: "kubernetes.io/role/internal-elb": "1"
3. Create a TGW Attachment
Name: main-cluster1-tgw-attachment
Transit Gateway ID: main-tgw
Attachment Type: VPC
DNS Support: on
IPv6 Support: off
VPC ID: main-cluster1-vpc
Subnet IDs: main-cluster1-subnet-a, main-cluster1-subnet-c
4. Add TGW route to the default routing table for main-cluster1-vpc
0.0.0.0/0 -> main-cluster1-tgw-attachment
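The same resources can also be created with the AWS CLI. The following is a minimal sketch under the assumptions that main-tgw already exists and that <main-tgw-id> is replaced with its actual ID; the shell variables capture the IDs returned by earlier commands:

# create the cluster VPC
VPC_ID=$(aws ec2 create-vpc --cidr-block 172.16.0.0/16 \
    --query 'Vpc.VpcId' --output text)
aws ec2 create-tags --resources "$VPC_ID" --tags Key=Name,Value=main-cluster1-vpc

# create the cluster subnets and tag them for internal load balancers
SUBNET_A=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --availability-zone us-east-1a \
    --cidr-block 172.16.0.0/20 --query 'Subnet.SubnetId' --output text)
SUBNET_C=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --availability-zone us-east-1c \
    --cidr-block 172.16.32.0/20 --query 'Subnet.SubnetId' --output text)
aws ec2 create-tags --resources "$SUBNET_A" --tags Key=Name,Value=main-cluster1-subnet-a Key=kubernetes.io/role/internal-elb,Value=1
aws ec2 create-tags --resources "$SUBNET_C" --tags Key=Name,Value=main-cluster1-subnet-c Key=kubernetes.io/role/internal-elb,Value=1

# attach the cluster VPC to the transit gateway
ATTACHMENT_ID=$(aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <main-tgw-id> --vpc-id "$VPC_ID" \
    --subnet-ids "$SUBNET_A" "$SUBNET_C" \
    --options DnsSupport=enable,Ipv6Support=disable \
    --query 'TransitGatewayVpcAttachment.TransitGatewayAttachmentId' --output text)

# route all outbound traffic from the VPC to the transit gateway
# (the route can be added once the attachment becomes available)
RT_ID=$(aws ec2 describe-route-tables \
    --filters Name=vpc-id,Values="$VPC_ID" Name=association.main,Values=true \
    --query 'RouteTables[0].RouteTableId' --output text)
aws ec2 create-route --route-table-id "$RT_ID" \
    --destination-cidr-block 0.0.0.0/0 --transit-gateway-id <main-tgw-id>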
B. Create Kublr Kubernetes cluster in main-cluster1-vpc
Note the following elements of the cluster specification:
- spec.locations[0].aws.vpcId - must be set to the ID of main-cluster1-vpc VPC
- spec.locations[0].aws.vpcCidrBlock - must be set to the same IPv4 range as the main-cluster1-vpc VPC
- spec.master.locations[0].aws.availabilityZones and spec.nodes[*].locations[0].aws.availabilityZones - must be set according to availability zones in which corresponding instance groups should create instances
- spec.master.locations[0].aws.subnetIds and spec.nodes[*].locations[0].aws.subnetIds - must be set to IDs of the subnets in the corresponding availability zones in which the instance groups should create the instances
Other parameters specified in the cluster specification snippet make sure that the EC2 instances are correctly configured to run in the private WAN environment - disabling elastic IP and public IP assignment, and setting master and ingress load balancer types to internal NLB.
spec:
  locations:
    - aws:
        vpcId: <main-cluster1-vpc-id>
        vpcCidrBlock: 172.16.0.0/16 # the cluster VPC IPv4 CIDR
        skipPublicSubnetsForPrivateGroups: true
        natMode: none
        skipInternetGateway: true

  # master node groups configurations
  master:
    locations:
      - aws:
          availabilityZones:
            - us-east-1a
          subnetIds:
            - <subnet-id-az-us-east-1a>
          masterNlbAllocationPolicy: private
          eipAllocationPolicy: none
          nodeIpAllocationPolicy: private

  # worker node groups configurations
  nodes:
    - name: group1
      locations:
        - aws:
            availabilityZones:
              - us-east-1a
            subnetIds:
              - <subnet-id-az-us-east-1a>
            eipAllocationPolicy: none
            nodeIpAllocationPolicy: private

  # if ingress controller is enabled, the following settings
  # change the ingress controller load balancer type to internal NLB,
  # which makes it available inside the WAN
  features:
    ingress:
      values:
        nginx-ingress:
          controller:
            service:
              annotations:
                service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
                service.beta.kubernetes.io/aws-load-balancer-internal: "true"
Option 2: Deploy cluster in Kublr-created VPC
With this option Kublr will create the VPC and other resources automatically, and will automatically attach the VPC to the TGW.
No manually created AWS resources are needed with this option, but the cluster specification is a bit more complex due to the additional AWS resources it includes.
Note the following elements of the cluster specification:
- spec.locations[0].aws.vpcCidrBlock - must be set to the IPv4 CIDR assigned for this cluster within the private WAN
- spec.master.locations[0].aws.availabilityZones and spec.nodes[*].locations[0].aws.availabilityZones - must be set according to availability zones in which corresponding instance groups should create instances
- spec.locations[0].aws.resourcesCloudFormationExtras.*.TransitGatewayId - must be set to the ID of the private WAN transit gateway
- spec.locations[0].aws.resourcesCloudFormationExtras.TransitGatewayVpcAttachment.Properties.SubnetIds - must be set to the list of references to the subnets created by Kublr for master and/or worker nodes; one subnet should be selected for each availability zone where the transit gateway attachment should have endpoints. Refer to Kublr reference documentation on auto-generated subnets for more information on Kublr cluster AWS network architecture and subnet naming.
- spec.locations[0].aws.resourcesCloudFormationExtras.SubnetRTAssoc* - one route table association must be created for each subnet generated by Kublr in this cluster.
One subnet will be created for each AZ used by the master instance group, and one for each AZ used by all worker groups combined.
Refer to Kublr reference documentation on auto-generated subnets for more information on Kublr cluster AWS network architecture and subnet naming.
Other parameters specified in the cluster specification snippet make sure that the EC2 instances are correctly configured to run in the private WAN environment - disabling elastic IP and public IP assignment, and setting master and ingress load balancer types to internal NLB.
spec:
  locations:
    - aws:
        vpcCidrBlock: 172.16.0.0/16 # the cluster VPC IPv4 CIDR
        skipPublicSubnetsForPrivateGroups: true
        natMode: none
        skipInternetGateway: true

        # this section includes additional AWS resources connecting the cluster VPC to the TGW WAN
        resourcesCloudFormationExtras:

          # VPC-TGW attachment
          TransitGatewayVpcAttachment:
            Type: AWS::EC2::TransitGatewayVpcAttachment
            Properties:
              TransitGatewayId: <main-tgw-id> # ID of the TGW
              VpcId: { Ref: NewVpc }
              SubnetIds:
                - { Ref: SubnetMasterPrivate0 }

          # default route from the cluster VPC to the TGW
          TransitGatewayRoute:
            Type: AWS::EC2::Route
            Properties:
              DestinationCidrBlock: 0.0.0.0/0
              RouteTableId: { Ref: RouteTablePublic }
              TransitGatewayId: <main-tgw-id> # ID of the TGW
            DependsOn: TransitGatewayVpcAttachment

          # For each master AZ
          SubnetRTAssocMasterPrivate0:
            Type: AWS::EC2::SubnetRouteTableAssociation
            Properties:
              RouteTableId: { Ref: RouteTablePublic }
              SubnetId: { Ref: SubnetMasterPrivate0 }

          # For each worker AZ
          SubnetRTAssocNodePrivate0:
            Type: AWS::EC2::SubnetRouteTableAssociation
            Properties:
              RouteTableId: { Ref: RouteTablePublic }
              SubnetId: { Ref: SubnetNodePrivate0 }

  # master node groups configurations
  master:
    locations:
      - aws:
          availabilityZones:
            - us-east-1a
          masterNlbAllocationPolicy: private
          eipAllocationPolicy: none
          nodeIpAllocationPolicy: private

  # worker node groups configurations
  nodes:
    - name: group1
      locations:
        - aws:
            availabilityZones:
              - us-east-1a
            eipAllocationPolicy: none
            nodeIpAllocationPolicy: private

  # if ingress controller is enabled, the following settings
  # change the ingress controller load balancer type to internal NLB,
  # which makes it available inside the WAN
  features:
    ingress:
      values:
        nginx-ingress:
          controller:
            service:
              annotations:
                service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
                service.beta.kubernetes.io/aws-load-balancer-internal: "true"
Reference and additional information
- AWS blog on building a global network using AWS Transit Gateway Inter-Region peering
- AWS Transit Gateway documentation
- AWS Transit Gateway scenarios
- AWS Transit Gateway example: Centralized outbound routing to the internet
Appendix I: Private WAN with Transit Gateway and OpenVPN
This section describes how to set up a test/PoC environment with Transit Gateway connecting private VPCs and routing all outbound traffic through one dedicated VPC, enabling protection at a single point with, for example, AWS Network Firewall.
We will configure the following architecture:
TODO: diagram
1. Setup Transit Gateway and dedicated edge VPC
A. Create an AWS Transit Gateway (TGW) via AWS CLI or Console
You can select any name for the TGW; we will use main-tgw in this example.
If you are creating the TGW via console, leave all other settings with their default values: no ASN, enabled DNS and VPN ECMP support, enabled default route table association and propagation, and disabled multicast support.
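If you are creating the TGW via the CLI instead, a command along the following lines will do (a sketch; the option values mirror the defaults mentioned above):

aws ec2 create-transit-gateway \
    --description "main WAN transit gateway" \
    --options DnsSupport=enable,VpnEcmpSupport=enable,DefaultRouteTableAssociation=enable,DefaultRouteTablePropagation=enable,MulticastSupport=disable \
    --tag-specifications 'ResourceType=transit-gateway,Tags=[{Key=Name,Value=main-tgw}]'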
B. Create a dedicated edge VPC
Any name can be used; we will use main-edge-vpc here.
IPv4 CIDR block: 172.20.0.0/20
B.1. Create a public subnet in the edge VPC
Name: main-edge-subnet-public-a
Availability zone: us-east-1a
IPv4 CIDR block: 172.20.0.0/24
B.2. Create a private subnet in the edge VPC
Name: main-edge-subnet-private-a
Availability zone: us-east-1a
IPv4 CIDR block: 172.20.8.0/24
B.3. Create a new Internet Gateway for the edge VPC and attach it
Name: main-edge-igw
Attach the Internet Gateway to main-edge-vpc VPC.
B.4. Create a new NAT Gateway for the edge VPC
Name: main-edge-natgw-a
Subnet: main-edge-subnet-public-a
Elastic IP allocation ID: allocate a new Elastic IP
B.5. Create a private route table in the edge VPC
Name: main-edge-private-rt-a
Associate with subnet: main-edge-subnet-private-a
Add route: 0.0.0.0/0 -> main-edge-natgw-a
B.6. Create a public route table in the edge VPC
Name: main-edge-public-rt
Associate with subnet: main-edge-subnet-public-a
Add route: 0.0.0.0/0 -> main-edge-igw
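Steps B through B.6 can also be scripted. A condensed AWS CLI sketch, assuming the resource names used above and capturing the returned IDs into shell variables for later steps, could look like this:

# edge VPC
VPC_ID=$(aws ec2 create-vpc --cidr-block 172.20.0.0/20 --query 'Vpc.VpcId' --output text)

# public and private subnets in us-east-1a
PUB_SUBNET=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --availability-zone us-east-1a \
    --cidr-block 172.20.0.0/24 --query 'Subnet.SubnetId' --output text)
PRIV_SUBNET=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --availability-zone us-east-1a \
    --cidr-block 172.20.8.0/24 --query 'Subnet.SubnetId' --output text)

# internet gateway attached to the edge VPC
IGW_ID=$(aws ec2 create-internet-gateway --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id "$IGW_ID" --vpc-id "$VPC_ID"

# NAT gateway with a new Elastic IP in the public subnet
EIP_ALLOC=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)
NATGW_ID=$(aws ec2 create-nat-gateway --subnet-id "$PUB_SUBNET" \
    --allocation-id "$EIP_ALLOC" --query 'NatGateway.NatGatewayId' --output text)

# private route table: default route via the NAT gateway
PRIV_RT=$(aws ec2 create-route-table --vpc-id "$VPC_ID" --query 'RouteTable.RouteTableId' --output text)
aws ec2 associate-route-table --route-table-id "$PRIV_RT" --subnet-id "$PRIV_SUBNET"
aws ec2 create-route --route-table-id "$PRIV_RT" --destination-cidr-block 0.0.0.0/0 --nat-gateway-id "$NATGW_ID"

# public route table: default route via the internet gateway
PUB_RT=$(aws ec2 create-route-table --vpc-id "$VPC_ID" --query 'RouteTable.RouteTableId' --output text)
aws ec2 associate-route-table --route-table-id "$PUB_RT" --subnet-id "$PUB_SUBNET"
aws ec2 create-route --route-table-id "$PUB_RT" --destination-cidr-block 0.0.0.0/0 --gateway-id "$IGW_ID"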
C. Attach the edge VPC to the transit gateway and configure WAN routing in the edge VPC
C.1. Create a new Transit Gateway Attachment
Name: main-edge-tgw-attachment
Transit Gateway ID: main-tgw
Attachment Type: VPC
DNS Support: on
IPv6 Support: off
VPC ID: main-edge-vpc
Subnet IDs: main-edge-subnet-private-a
C.2. Configure WAN routing in the edge VPC
1. Add route 172.16.0.0/12 -> main-edge-tgw-attachment in both edge routing tables main-edge-private-rt-a and main-edge-public-rt
2. Add a static route 0.0.0.0/0 -> main-edge-tgw-attachment on the transit gateway main-tgw
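A possible CLI equivalent of steps C.1 and C.2, reusing the variables from the previous sketch and a placeholder for the TGW ID:

# attach the edge VPC to the transit gateway through the private subnet
EDGE_ATTACHMENT=$(aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id <main-tgw-id> --vpc-id "$VPC_ID" --subnet-ids "$PRIV_SUBNET" \
    --options DnsSupport=enable,Ipv6Support=disable \
    --query 'TransitGatewayVpcAttachment.TransitGatewayAttachmentId' --output text)

# route the private WAN range to the TGW from both edge route tables
aws ec2 create-route --route-table-id "$PRIV_RT" --destination-cidr-block 172.16.0.0/12 --transit-gateway-id <main-tgw-id>
aws ec2 create-route --route-table-id "$PUB_RT"  --destination-cidr-block 172.16.0.0/12 --transit-gateway-id <main-tgw-id>

# add the default route 0.0.0.0/0 -> edge VPC attachment to the TGW default route table
TGW_RT=$(aws ec2 describe-transit-gateways --transit-gateway-ids <main-tgw-id> \
    --query 'TransitGateways[0].Options.AssociationDefaultRouteTableId' --output text)
aws ec2 create-transit-gateway-route --transit-gateway-route-table-id "$TGW_RT" \
    --destination-cidr-block 0.0.0.0/0 --transit-gateway-attachment-id "$EDGE_ATTACHMENT"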
Now you have a WAN transit gateway with an edge VPC configured for public network access.
The transit gateway is ready to accept other VPC attachments.
2. Create OpenVPN server in edge VPC and configure VPN access to WAN
A. Launch a new EC2 instance in the public subnet of the edge VPC
AMI: Amazon Linux 2 AMI (HVM), SSD Volume Type
Instance Type: t3a.nano
Network: main-edge-vpc
Subnet: main-edge-subnet-public-a
Auto-assign Public IP: enable
Name: main-edge-vpn
Security group: open ports 22/tcp, 1194/tcp, and 1194/udp
Key pair: specify the SSH key that you can use to login into the instance
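For reference, an equivalent CLI sketch; the AMI ID, key pair name and allowed source CIDRs are placeholders to replace with your own values:

# security group for the VPN server
SG_ID=$(aws ec2 create-security-group --group-name main-edge-vpn-sg \
    --description "OpenVPN server" --vpc-id "$VPC_ID" --query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" --protocol tcp --port 22   --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" --protocol tcp --port 1194 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" --protocol udp --port 1194 --cidr 0.0.0.0/0

# launch the instance in the public edge subnet with a public IP
aws ec2 run-instances --image-id <amazon-linux-2-ami-id> --instance-type t3a.nano \
    --key-name <your-ssh-key-name> --subnet-id "$PUB_SUBNET" \
    --security-group-ids "$SG_ID" --associate-public-ip-address \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=main-edge-vpn}]'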
B. Install and configure OpenVPN
SSH into the instance, then install and configure the OpenVPN server as follows:
# change to root
sudo -i

# update and install required packages
amazon-linux-extras install epel
yum update -y
yum install -y openvpn easy-rsa

# generate CA
mkdir ~/easy-rsa
cd ~/easy-rsa/
/usr/share/easy-rsa/3/easyrsa init-pki
# the following command will ask to enter password (e.g. "password") and common name (e.g. "openvpn")
/usr/share/easy-rsa/3/easyrsa build-ca
/usr/share/easy-rsa/3/easyrsa gen-dh
# the following command will ask to enter password (e.g. "password") and common name (e.g. "openvpnserver")
/usr/share/easy-rsa/3/easyrsa gen-req server
# the following command will ask to confirm ("yes"), enter password ("password")
/usr/share/easy-rsa/3/easyrsa sign-req server server

# move key/cert files to /etc/openvpn/server
cd
mv ~/easy-rsa /etc/openvpn/server

# remove password from the private key
cd /etc/openvpn/server/easy-rsa/pki/private/
mv server.key server.key.orig
# the following command will ask to enter password ("password")
openssl rsa -in server.key.orig -out server.key
Configure OpenVPN:
cd /etc/openvpn/server
openvpn --genkey --secret ta.key
mv ta.key ./easy-rsa/pki/private/
cp /usr/share/doc/openvpn-*/sample/sample-config-files/server.conf /etc/openvpn/server/

# edit OpenVPN server configuration file
vim /etc/openvpn/server/server.conf
Set the following parameters in the configuration file:
ca /etc/openvpn/server/easy-rsa/pki/ca.crt
cert /etc/openvpn/server/easy-rsa/pki/issued/server.crt
key /etc/openvpn/server/easy-rsa/pki/private/server.key
dh /etc/openvpn/server/easy-rsa/pki/dh.pem
tls-auth /etc/openvpn/server/easy-rsa/pki/private/ta.key 0

# configure routes to all the WAN subnets that the server should push to the clients
push "route 172.16.0.0 255.255.0.0"
push "route 172.18.0.0 255.255.0.0"
push "route 172.20.0.0 255.255.0.0"
As a result the server config file will look as follows:
port 1194
proto udp
dev tun
ca /etc/openvpn/server/easy-rsa/pki/ca.crt
cert /etc/openvpn/server/easy-rsa/pki/issued/server.crt
key /etc/openvpn/server/easy-rsa/pki/private/server.key
dh /etc/openvpn/server/easy-rsa/pki/dh.pem
server 10.8.0.0 255.255.255.0
ifconfig-pool-persist ipp.txt
push "route 172.16.0.0 255.255.0.0"
push "route 172.18.0.0 255.255.0.0"
push "route 172.20.0.0 255.255.0.0"
keepalive 10 120
tls-auth /etc/openvpn/server/easy-rsa/pki/private/ta.key 0
cipher AES-256-CBC
persist-key
persist-tun
status openvpn-status.log
verb 3
explicit-exit-notify 1
Now you can start the OpenVPN server:
# start OpenVPN server
# ... from CLI
openvpn /etc/openvpn/server/server.conf
# ... as a service
systemctl start openvpn-server@server
C. Enable IP forwarding and routing for VPN clients to access WAN
1. Enable forwarding on the VPN server:
echo 1 > /proc/sys/net/ipv4/ip_forward
sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward=1" > /etc/sysctl.d/70-openvpn-forwarding.conf
2. Disable source/destination checking on the EC2 instance main-edge-vpn.
3. Add route 10.8.0.0/24 -> main-edge-vpn to both public and private routing tables main-edge-private-rt-a and main-edge-public-rt.
4. Add a static route 10.8.0.0/24 -> main-edge-tgw-attachment on the transit gateway main-tgw
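Steps 2 through 4 can be performed with the AWS CLI roughly as follows (reusing the route table, route table ID and attachment variables from the earlier sketches; the instance ID is a placeholder):

# 2. disable source/destination checking on the VPN instance
aws ec2 modify-instance-attribute --instance-id <main-edge-vpn-instance-id> --no-source-dest-check

# 3. route the VPN client subnet to the VPN instance in both edge route tables
aws ec2 create-route --route-table-id "$PRIV_RT" --destination-cidr-block 10.8.0.0/24 \
    --instance-id <main-edge-vpn-instance-id>
aws ec2 create-route --route-table-id "$PUB_RT" --destination-cidr-block 10.8.0.0/24 \
    --instance-id <main-edge-vpn-instance-id>

# 4. add a static route for the VPN client subnet to the TGW default route table
aws ec2 create-transit-gateway-route --transit-gateway-route-table-id "$TGW_RT" \
    --destination-cidr-block 10.8.0.0/24 --transit-gateway-attachment-id "$EDGE_ATTACHMENT"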
D. Configure VPN Client
SSH into the VPN server and generate the client key and certificate:
# generate cert for client on the server
cd /etc/openvpn/server/easy-rsa
# the following command will ask to enter password ("password") and name ("client1")
/usr/share/easy-rsa/3/easyrsa gen-req client1
# the following command will ask to confirm the certificate parameters ("yes") and enter password ("password")
/usr/share/easy-rsa/3/easyrsa sign-req client client1

# remove password from the key
cd /etc/openvpn/server/easy-rsa/pki/private/
mv client1.key client1.key.orig
openssl rsa -in client1.key.orig -out client1.key
Copy the following files to the client:
/etc/openvpn/server/easy-rsa/pki/ca.crt
/etc/openvpn/server/easy-rsa/pki/issued/client1.crt
/etc/openvpn/server/easy-rsa/pki/private/client1.key
/etc/openvpn/server/easy-rsa/pki/private/ta.key
Create OpenVPN client config file config.ovpn on the client:
# set here the public IP address of the instance main-edge-vpn
remote <main-edge-vpn-public-IP-address> 1194

# adjust mtu if necessary
tun-mtu 1000

dev tun
client
float
pull
proto udp
script-security 2
cipher AES-256-CBC
ca ca.crt
cert client1.crt
key client1.key
remote-cert-tls server
tls-auth ta.key 1
E. Start VPN client and test the connection
Start VPN client:
sudo openvpn --config config.ovpn
Test connection:
ping 10.8.0.1