Kubernetes was designed for deploying containerized applications to cloud architecture. Another way of thinking about Kubernetes: it gets us out of the "install-the-binaries" business and focuses our efforts on the business value of a solution. We have documented the process we use to train our resources and partners. This process will help your team excel and gain confidence with cloud technologies.
One of the business challenges of Kubernetes in the cloud is the ongoing cost ($300-$600/month per resource) during the learning or development process. To lower this ongoing cost per resource, we focused on a method of using on-prem Kubernetes deployments.
We found examples online of using minikube and Oracle VirtualBox to keep costs low for an on-prem deployment, but we did not find many examples of using Vmware Workstation that met our needs. Our goal was to use a solution that we are very familiar with and that supports rollback via snapshots.
We have used Vmware Workstation for many years while working on service projects. We cannot overstate its usefulness as a "playground" and development environment independent of a client's environment. Snapshots allow for negative use-case or "what-if" testing that destroys or impacts the solution under test, with minimal consequence.
In this entry, we will discuss the use of Vmware Workstation and CentOS (or Ubuntu) as the primary Kubernetes nodes. Both CentOS and Ubuntu are used by the cloud providers for their Kubernetes nodes, so this on-prem process will translate well.
Some of our team members run the Kubernetes environment from their laptop, a collection of individual servers, or a larger server that scales to the vCPU/RAM required for the Kubernetes solution.

Decision 1: Choose an OS to be used.
Either CentOS or Ubuntu is acceptable for on-prem use. When we checked the OSes used by the cloud providers, we noted they used one of these two (2) Linux distributions. We decided on CentOS 7, since Kubernetes uses iptables for routing and iptables is the default on CentOS 7. You may find that other OSes work fine as well.
Decision 2: Build a reference image
Identify all expected binaries to be used within this image. This reference image will be cloned for the Kubernetes control-plane node (1) and the worker nodes (3-4). We will also use this image to build a supporting (non-Kubernetes) node for SiteMinder integration and a docker repository for the Kubernetes docker images, for a total of six (6) nodes.
Decision 3: DNS and Certificates
Recommendation: Please do not attempt to deploy a Kubernetes solution on-prem without first purchasing a DNS domain and using wildcard certificates tied to that domain.
Without these two (2) supporting components, it is a challenge to build a working Kubernetes solution that reflects what you will experience in a cloud deployment.
For example, we purchased a domain for $12/year and then created several "A" records that host the IP addresses we may use to redirect to cloud or on-prem. Using sub-domain "A" records, we can have as many cloud addresses as we wish.
DNS "A" Records Example:
aks.iam.anapartner.net (MS Azure),
eks.iam.anapartner.net (Amazon),
gke.iam.anapartner.net (Google).
DNS "CNAME" Records Example:
alertmanager.aks.iam.anapartner.net,
grafana.aks.iam.anapartner.net,
jaeger.aks.iam.anapartner.net,
kibana.aks.iam.anapartner.net,
mgmt-ssp.aks.iam.anapartner.net,
sm.aks.iam.anapartner.net,
ssp.aks.iam.anapartner.net.

Finally, we prefer to use wildcard certificates for these domains to avoid challenges within our Kubernetes deployment. There are several services offering free certificates.
We chose Let's Encrypt https://letsencrypt.org/. While Let's Encrypt has automated processes to renew certificates, we chose its DNS validation process with a Certbot solution. We can renew these certificates every 90 days for on-prem usage. The DNS validation process requires a unique string generated by the Let's Encrypt process to be populated in a DNS "TXT" record, for example: _acme-challenge.aks.iam.anapartner.net. See the example at the bottom of this blog entry on this process.
Decision 4: Supporting Components: Storage, Load-Balancing, DNS Resolution (Local)
The last decision required for on-prem deployment is where you will place persistent storage for your Kubernetes cluster. We chose to use an NFS share.
We first tested using the control-plane node, then decided to move the NFS share to a Synology NAS solution. Similarly for DNS resolution, we first used a DNS service on the control-plane node and then moved to the Synology NAS solution.
For load balancing, Kubernetes offers the service types NodePort and LoadBalancer. A LoadBalancer service, if not deployed in the cloud, falls back to NodePort behavior. To introduce load balancing for on-prem, we introduced the HAProxy service on the control-plane node, along with Kubernetes NodePort services, to meet this goal. A minimal NodePort sketch is shown below.
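As an illustration only (the service name, namespace, and label below are placeholders, not part of our solution), a NodePort service pinned to TCP 31888 is what the HAProxy configuration in Step 13b later targets for LDAP traffic:
### Example sketch: a NodePort service pinned to 31888 so an external HAProxy can reach it on every worker node.
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: dsa-ldap
  namespace: iam
spec:
  type: NodePort
  selector:
    app: dsa
  ports:
    - port: 389
      targetPort: 389
      nodePort: 31888
EOF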
After the decisions have been made, we can now walk through the steps to set up a Vmware environment for Kubernetes.
Reference Image
Step 1: Download the OS DVD ISO image for deployment on Vmware Workstation (Centos 7 / Ubuntu ).
Determine the specs for the future solution to be deployed on Kubernetes. Some solutions have pods that require minimal memory/disk space; for the solution we decided to deploy, we confirmed that we need 16 GB RAM and 4 vCPU at a minimum. We confirmed these specs were required by previously deploying the solution in a cloud environment.
Without these memory/CPU specs, the solution we chose would pause the deployment of Kubernetes pods to the nodes. You may or may not see error messages in the deployment of pods stating that the nodes did not have enough resources for all or some of the pods. The sketch below shows how to surface these messages.
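A quick way to confirm a resource constraint is to ask Kubernetes directly. A minimal sketch, run from the control-plane node once the cluster exists (output will vary):
### Show pods stuck in Pending and the scheduler's reason (e.g. "Insufficient cpu" / "Insufficient memory").
kubectl get pods -A --field-selector=status.phase=Pending
kubectl get events -A --field-selector=reason=FailedScheduling | tail -20
### Review what each node has already allocated versus its capacity.
kubectl describe nodes | grep -A 8 "Allocated resources"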
For disk size, we selected 100 GB to future-proof the solution during testing. For networking, please select BRIDGED mode to allow the Vmware images to have minimal network issues when routing within your local network. Please avoid double-NAT'ing the deployment to reduce your headaches.



Step 2: Install useful base packages and disable any UI tools. Please install an entropy daemon to avoid delays caused by low entropy when certificates use /dev/random.
### UI Update for CentOS7 was stopping yum deployment - not required for our solution to be tested (e.g. VIP Auth Hub)
# su to root to run the below commands. We will add sudo access later.
su -
systemctl disable packagekit; systemctl stop packagekit; systemctl status packagekit
### Installed base useful packages.
yum -y install dnf epel-release yum-utils nfs-utils
### Install useful 2nd tools.
yum -y install openldap-clients jq python3-pip tree
pip3 install yq
yum -y upgrade
### Install Entropy process (epel repo)
dnf -y install haveged
systemctl enable haveged --now
Step 3: Install docker and update the docker configuration for use with Kubernetes. Update the path & storage-driver for the docker images for initial deployment.
Ref: https://docs.docker.com/storage/storagedriver/overlayfs-driver/
### Install Docker repo & docker package
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf -y install docker-ce
docker version
systemctl enable docker --now
docker version
### Update docker image info after deployment and restart service
cat << EOF > /etc/docker/daemon.json
{
  "debug": false,
  "data-root": "/home/docker-images",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
### Restart docker to load updated image info.
systemctl restart docker; systemctl status docker; docker version
Step 4: Deploy the three (3) primary Kubernetes binaries (kubelet, kubeadm, kubectl) & the Helm binary.
Ensure you select a Kubernetes version that matches the solution you wish to deploy and work with. This can be a gotcha if the Kubernetes binaries are updated during a dnf/yum upgrade process and your solution has not been vetted for the newer release of Kubernetes. See the reference link below on how to upgrade Kubernetes binaries.
Ref: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
### Add k8s repo
cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
### When upgrading the OS, be sure to use the correct version of kubernetes (remove and add) - Example to force version 1.20.11 ###
dnf upgrade -y
dnf remove -y kubelet kubeadm kubectl
dnf install -y kubelet-1.20.11-0.x86_64 kubeadm-1.20.11-0.x86_64 kubectl-1.20.11-0.x86_64 --disableexcludes=kubernetes
### Start the k8s process.
systemctl enable kubelet --now; systemctl status kubelet
systemctl daemon-reload && systemctl enable kubelet --now
yum-config-manager --save --setopt=kubernetes.skip_if_unavailable=true
### Add HELM binary
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
Step 5: OS configurations required or useful for Kubernetes. The Kubernetes kubelet binary requires swap to be disabled.
Ref: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
### Stop FirewallD - May add ports later for security
systemctl stop firewalld;systemctl disable firewalld; iptables -F
### Update OS Parameters for kubernetes
setenforce 0
sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
modprobe br_netfilter
cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
### Note: IP forwarding is enabled by default.
sysctl -a | grep -i forward
### Note: Update /etc/fstab to comment out the swap line with a # character.
### Warning: kubeadm init will fail if swap is left on the CP or any worker node.
swapoff -a
sed -i 's|UUID\=\(.*\)-\(.*\)-\(.*\)-\(.*\)-\(.*\) swap|#UUID\=\1-\2-\3-\4-\5 swap|g' /etc/fstab
cat /etc/fstab
Step 6: Create an SSH key for root (or other service IDs) to allow remote script updates from the CP to the worker nodes
### Create SSH key for root to allow remote script updates from CP to Worker Nodes - Enter a Blank/Null PASSWORD.
su -
rm -rf ~/.ssh; echo y | ssh-keygen -b 4096 -C $USER -f ~/.ssh/id_rsa
### Copy the public rsa key to authorized keys to avoid password between cp/worker nodes for remote ssh commands.
cp -r -p ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys;chmod 600 ~/.ssh/authorized_keys;ls -lart .ssh
### Test for remote connection with no password:
ssh -i ~/.ssh/id_rsa root@localhost
### Copy the id_rsa key to your host system for ease of testing.
### Add your local non-root user to sudo wheel group. Change vip to your user ID.
LOCALUSER=vip
gpasswd -a $LOCALUSER wheel
### Update sudoers file to allow wheel group with no-password
sed -i 's|# %wheel|%wheel|g' /etc/sudoers
### View update wheel group.
grep "%wheel" /etc/sudoers
# Example of return query.
# %wheel ALL=(ALL) ALL
# %wheel ALL=(ALL) NOPASSWD: ALL
Step 7: Stop or adjust the OS NetworkManager, shut down the reference image, and create a Vmware snapshot
### Adjust or Disable the OS NetworkManager (to avoid overwriting /etc/resolv.conf)
### Important when using an internal DNS server.
systemctl disable NetworkManager;systemctl stop NetworkManager
### reboot CentOS7 Image and validate no issues upon reboot.
reboot
### Shutdown image and manually create snapshot called "base"
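If you prefer the vmrun CLI over the Workstation UI for this step, a short sketch (assuming the reference image path used in Step 8 below):
### Shut down the guest, then create the snapshot named "base" used by the clone commands in Step 8.
REF=/home/me/vmware/kub/CentOS7/CentOS7.vmx
vmrun -T ws stop $REF soft
vmrun -T ws snapshot $REF base
vmrun -T ws listSnapshots $REF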
Vmware Workstation Cloning
Step 8: Now that we have a reference image, we can make clone images for the control-plane node (1), the worker nodes (4), and the supporting node (1). This is a fairly quick process.
export BASE=/home/me/vmware/kub
export REF=/home/me/vmware/kub/CentOS7/CentOS7.vmx
VM=cp;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker01;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker02;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker03;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker04;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=sm;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
Step 9: Start the clone images and remotely assign new hostname/IP addresses to the images
# Start cloned images for CP and Worker Nodes - Update any files as needed.
export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01
### Start the cloned images for CP and Worker Nodes.
VM=cp;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker01;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker02;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker03;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker04;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=sm;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
vmrun -T ws list | sort -rn
### Update Hostnames for CP and Worker Nodes with Domain.
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
### Update IP Address and Domain for NIC (ifcfg-ens33)
export CP=192.168.2.60
export WK1=192.168.2.61
export WK2=192.168.2.62
export WK3=192.168.2.63
export WK4=192.168.2.64
export SM=192.168.2.65
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$CP\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK1\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK2\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK3\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK4\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$SM\nDOMAIN=$DOMAIN|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
Step 10: Enable the network gateway, disable DHCP, and reboot the images
export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01
### Update to create a new default GATEWAY HOST to address routing issues to external IP addresses.
GATEWAY=192.168.2.1
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
### Disable DHCP (to avoid overwriting /etc/resolv.conf)
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g' /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
### Reboot VIP Auth Hub CP and Nodes
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
Step 11: Update DNS on the clone images remotely using vmrun
### Update /etc/resolv.conf for correct DNS server.
### Ensure DHCP and NetworkManager are disabled to prevent these services from overwriting /etc/resolv.conf.
export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01
DNSNEW=192.168.2.20
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "echo 'nameserver $DNSNEW' >> /etc/resolv.conf" -noWait
### Reboot VIP Auth Hub CP and Nodes
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx /bin/bash "reboot" -noWait
Step 12: Copy the root SSH private key (id_rsa) to your main host, rename it to a useful name, and then test your newly deployed clone images for DNS resolution using ssh. Please confirm this step is successful prior to continuing with the configuration of the control plane and worker nodes.
### Copy the root id_rsa file to host system to allow ease of testing with ssh.
export CP=192.168.2.60
export WK1=192.168.2.61
export WK2=192.168.2.62
export WK3=192.168.2.63
export WK4=192.168.2.64
export SM=192.168.2.65
### Add the hosts for ssh pre-validation.
ssh-keyscan -p 22 $CP >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK1 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK2 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK3 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK4 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $SM >> ~/.ssh/known_hosts
### Rename from id_rsa to vip_kub_root_id_rsa
ssh -tt -i ~/vip_kub_root_id_rsa root@$CP 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK1 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK2 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK3 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK4 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$SM 'cat /etc/resolv.conf'
### Validate Access with ssh to CP and Worker Nodes new IP addresses.
FQDN=ssp.aks.iam.anapartner.net
ssh -tt -i ~/vip_kub_root_id_rsa root@$CP "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK1 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK2 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK3 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK4 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$SM "ping -c 2 $FQDN"
Update CP (control-plane) Node
Step 13a: Copy files to the CP node from the Vmware Workstation host and configure the CP node for dedicated CP usage. We recommend using two terminals/sessions to speed up the process. Install HAProxy for load balancing, copy the Let's Encrypt wildcard certificates, and copy the Kubernetes solution you will be deploying (scripts/yaml).
### Open Terminal 1 to CP host.
### Add bash completion to have better use of TAB to view parameters.
CP=192.168.2.60
ssh -tt -i ~/vip_kub_root_id_rsa root@$CP
dnf -y install bash-completion
echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >>~/.bashrc
kubectl completion bash >/etc/bash_completion.d/kubectl
echo "alias k=kubectl | complete -F __start_kubectl k" >>~/.bashrc
### Install HAProxy and replace the haproxy.cfg file.
dnf -y install haproxy
systemctl enable haproxy --now
netstat -anp | grep -i -e haproxy
### Open Terminal 2 to host and push files to CP node.
### Copy HAProxy configuration, certs, and scripts
scp -i ~/vip_kub_root_id_rsa haproxy.cfg root@$CP:/etc/haproxy/haproxy.cfg
scp -i ~/vip_kub_root_id_rsa cloud-certs-aks-eks-gke_exp-202X-01-12.tar root@$CP:
scp -i ~/vip_kub_root_id_rsa 202X-11-03_vip_auth_hub_working_centos7_v2.tar root@$CP:
### On Terminal 1 - on CP host - Restart to use new haproxy configuration file.
systemctl restart haproxy
netstat -anp | grep -i -e haproxy
### Extract CERTS to root home folder
tar -xvf cloud-certs-aks-eks-gke_exp-202X-01-12.tar
### Extract Working Scripts
tar -xvf 202X-11-03_vip_auth_hub_working_centos7_v2.tar
### Update env variables for unique environment within step00 file.
vi step00_kubernetes_env.sh
### Add the env variables to the .bashrc file
echo ". ./step00_kubernetes_env.sh"
Step 13b: Example of the /etc/haproxy/haproxy.cfg configuration for Kubernetes load-balancing functionality for on-prem worker nodes. HAProxy is deployed on the control-plane (CP) node. The example configuration file routes TCP 80/443/389 to one (1) of the four (4) worker nodes. If a Kubernetes NodePort service is enabled for TCP 389 (NodePort 31888), this load balancer will route the LDAP traffic correctly as well.
[root@cp ~]# cat /etc/haproxy/haproxy.cfg
global
    user haproxy
    group haproxy
    chroot /var/lib/haproxy
    log /dev/log local0
    log /dev/log local1 notice
defaults
    mode http
    log global
    retries 2
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 10m
    timeout server 10m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
frontend ingress
    bind *:80
    option tcplog
    mode http
    option forwardfor
    option http-server-close
    default_backend kubernetes-ingress-nodes
backend kubernetes-ingress-nodes
    mode http
    balance roundrobin
    server k8s-ingress-0 worker01.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-1 worker02.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-2 worker03.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-3 worker04.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
frontend ingress-https
    bind *:443
    option tcplog
    mode tcp
    option forwardfor
    option http-server-close
    default_backend kubernetes-ingress-nodes-https
backend kubernetes-ingress-nodes-https
    mode tcp
    balance roundrobin
    server k8s-ingress-0 worker01.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-1 worker02.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-2 worker03.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-3 worker04.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
frontend ldap
    bind *:389
    option tcplog
    mode tcp
    default_backend kubernetes-nodes-ldap
backend kubernetes-nodes-ldap
    mode tcp
    balance roundrobin
    server k8s-ldap-0 worker01.aks.iam.anapartner.net:31888 check fall 3 rise 2
    server k8s-ldap-1 worker02.aks.iam.anapartner.net:31888 check fall 3 rise 2
    server k8s-ldap-2 worker03.aks.iam.anapartner.net:31888 check fall 3 rise 2
    server k8s-ldap-3 worker04.aks.iam.anapartner.net:31888 check fall 3 rise 2
Deploy Solution on Kubernetes
Step 14: Validate that DNS and storage are ready before deploying any solution, or if you wish to have a base Kubernetes environment to use with the control-plane and four (4) worker nodes.
### Step: Setup NFS Share either on-prem remote server or Synology NFS
### Use version 4.x checkbox for Synology.
### Example of lines on remote Linux Host with NFS share.
yum -y install nfs-utils
systemctl enable --now nfs-server rpcbind
mkdir -p /export/nfsshare ; chown nobody /export/nfsshare ; chmod -R 777 /export/nfsshare
echo "/export/nfsshare *(rw,sync,no_root_squash,insecure)" >> /etc/exports
exportfs -rav; exportfs -v
firewall-cmd --add-service=nfs --permanent
firewall-cmd --add-service={nfs3,mountd,rpc-bind} --permanent
firewall-cmd --reload
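### Optional sanity-check sketch (adjust the NFS server IP/export path to your environment, e.g. the
### Synology export /volume1/nfs used later): from the CP or a worker node, confirm the export is
### visible and writable before pointing any Kubernetes persistent volumes at it.
showmount -e 192.168.2.30
mount -v -t nfs 192.168.2.30:/export/nfsshare /mnt && touch /mnt/nfs-write-test && rm -f /mnt/nfs-write-test
umount /mnt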
#### Setup DNS entries (A, NS, and CNAME) for the items below (may be on-prem DNS or Synology DNS)
ns.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.60)
aks.iam.anapartner.net NS ns.aks.iam.anapartner.net
cp.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.60)
worker01.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.61)
worker02.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.62)
worker03.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.63)
worker04.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.64)
sm.aks.iam.anapartner.net A IP_ADDRESS (192.168.2.65)
kibana CNAME cp.aks.iam.anapartner.net
grafana CNAME cp.aks.iam.anapartner.net
jaeger CNAME cp.aks.iam.anapartner.net
alertmanager CNAME cp.aks.iam.anapartner.net
ssp CNAME cp.aks.iam.anapartner.net
ssp-mgmt CNAME cp.aks.iam.anapartner.net
### Pre-Step: Enable DNS resolution for external IP addresses
### Enable forwarding to external h/w router and 8.8.8.8
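### Quick validation sketch (hostnames per the examples above): confirm local and external resolution from a node.
nslookup cp.aks.iam.anapartner.net
nslookup ssp.aks.iam.anapartner.net
nslookup www.google.com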
Step 15: Recommendation: Deploy your solution in steps using Kubernetes yaml or Helm charts to assist with debugging any deployment issues. Do not forget to use kubectl logs and kubectl describe to isolate startup or cert issues; see the sketch after the scripts below.
### Run scripts one-by-one. They will have a watch command in each that will
### provide feedback on the startup processes.
### Total startup from scratch to final with VIP Sample App is about 15-20 minutes.
### Note: Step04 has different chart variables for on-prem for Symantec Directory.
### Note: /step00_kubernetes_env.sh is called by each script.
./step01_kubernetes_cluster_init_with_worker_nodes.sh
./step02_kubernetes_cluster_with_ingress_and_other_charts.sh
./step03_kubernetes_cluster_with_vip_auth_hub_charts.sh
./step04_kubernetes_cluster_with_vip_auth_hub_sample_app.sh
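A few commands we lean on while these scripts run (a sketch; the namespace and pod names are placeholders, substitute those from your deployment):
### Watch pods come up across all namespaces.
kubectl get pods -A -o wide -w
### Inspect a pod stuck in Pending or CrashLoopBackOff (placeholder names).
kubectl -n <namespace> describe pod <pod-name>
kubectl -n <namespace> logs <pod-name> --previous
### Review recent cluster events, newest last.
kubectl get events -A --sort-by=.metadata.creationTimestamp | tail -25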
Docker Registry for On-Prem
There are two (2) types of docker registries we have found useful.
a. The standard mirror method captures all docker images pulled from the "docker.io" site into a local mirror. When Kubernetes or Helm deployments are used, the docker configuration file can be adjusted to check the local mirror without updating Kubernetes yaml files or Helm charts.
b. The second method is a full copy of all images after they have been deployed once, using the docker push process into a local registry. The challenge of the second method is that the Kubernetes yaml files and/or Helm charts do have to be updated to use this local registry.
Either method will help lower the bandwidth cost of re-downloading the same docker images, provided you use a docker prune method to keep the worker nodes' disk usage "clean". If the docker prune process is not used, you may notice that the worker nodes run out of disk space due to temporary docker images/containers that were not cleaned up properly.
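As a hedge against full disks, a minimal prune sketch we might schedule on each worker node (assumption: Docker runtime; the -a flag also removes unused images, so they will be re-pulled from the mirror on the next deployment):
### Remove stopped containers, dangling/unused images, unused networks, and unused volumes on a worker node.
docker system prune -a -f --volumes
### Optional: run it weekly via cron (Sunday 03:00).
echo "0 3 * * 0 root /usr/bin/docker system prune -a -f --volumes" > /etc/cron.d/docker-prune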
#!/bin/bash
#################################################################################
# Create a local docker mirror registry for docker-io
# and local docker non-mirror registry for all other images
# to minimize download impact
# during restart of the kubernetes solution
#
# All registry images will be placed on the NFS share
# mount -v -t nfs 192.168.2.30:/volume1/nfs /mnt &>/dev/null
#
# Certs will be provided by Let's Encrypt every 90 days
#
# For docker-io mirror registry, all clients must have the following line in
# /etc/docker/daemon.json {Note: Use commas as needed}
#
# "registry-mirrors":
# [
# "https://sm.aks.iam.anapartner.net:444"
# ],
#
#
#
# ANA 11/2021
#
#################################################################################
# To remove all containers - to allow restart of process
docker rm -f `docker ps -a | grep -v -e CONTAINER | awk '{print $1}'` ; docker image rm `docker image ls | grep -v -e REPOSITORY | grep -e minutes -e hour -e days -e '2 weeks'| awk '{print $3}'` &>/dev/null
#################################################################################
# Update HOST name for local server for docker image
HOST=sm.aks.iam.anapartner.net
NFS_SERVER=192.168.2.30
NFS_SHARE=/volume1/nfs
#################################################################################
function start_registry {
    local_port=$1
    remote_registry_name=$2
    if [ "$3" == "" ]; then
        remote_registry_url=$remote_registry_name
    else
        remote_registry_url=$3
    fi
    echo -e "$local_port $remote_registry_name $remote_registry_url"
    mount -v -t nfs $NFS_SERVER:$NFS_SHARE /mnt &>/dev/null
    mkdir -p /mnt/registry/${remote_registry_name} &>/dev/null
    docker run -d --name registry-${remote_registry_name}-mirror \
        -p $local_port:443 \
        --restart=always \
        -e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
        -e REGISTRY_PROXY_REMOTEURL="https://${remote_registry_url}/" \
        -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/fullchain.pem \
        -e REGISTRY_HTTP_TLS_KEY=/certs/privkey.pem \
        -e REGISTRY_COMPATIBILITY_SCHEMA1_ENABLED=true \
        -v /mnt/registry/certs:/certs \
        -v /mnt/registry/${remote_registry_name}:/var/lib/registry \
        registry:latest
    sleep 1
    echo "#################################################################################"
    curl -s -X GET https://$HOST:$local_port/v2/_catalog | jq
    echo "#################################################################################"
}
#################################################################################
# start_registry <local_port> <remote_registry_name> <remote_registry_url>
#################################################################################
start_registry 444 docker-io registry-1.docker.io
#################################################################################
# Non-Proxy configuration to allow 'docker tag & docker push' for all other images
#################################################################################
remote_registry_name=all
local_port=455
mkdir -p /var/lib/docker/registry/${remote_registry_name} &>/dev/null
docker run -d --name registry-${remote_registry_name}-mirror \
-p $local_port:443 \
--restart=always \
-e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/fullchain.pem \
-e REGISTRY_HTTP_TLS_KEY=/certs/privkey.pem \
-e REGISTRY_COMPATIBILITY_SCHEMA1_ENABLED=true \
-v /mnt/registry/certs:/certs \
-v /mnt/registry/${remote_registry_name}:/var/lib/registry \
registry:latest
sleep 1
echo "#################################################################################"
curl -s -X GET https://$HOST:$local_port/v2/_catalog | jq
echo "#################################################################################"
docker ps -a
echo "#################################################################################"
echo "##### To tail the log of the docker-io container - useful for monitoring helm deployments #####"
echo "docker logs `docker ps -a --no-trunc | grep -v NAMES | grep 'docker-io' | awk '{print $1}'` -f "
echo "#################################################################################"
echo "##### To tail the log of the ALL container - useful for monitoring helm deployments #####"
echo "docker logs `docker ps -a --no-trunc | grep -v NAMES | grep 'all' | awk '{print $1}'` -f "
echo "#################################################################################"
echo "##### Location of Registry Files on NFS share #####"
echo "ls -lart /mnt/registry/docker-io/docker/registry/v2/repositories"
echo "ls -lart /mnt/registry/all/docker/registry/v2/repositories"
echo "#################################################################################"
Example of the /etc/docker/daemon.json configuration file to use a local mirror for docker.io. See the "registry-mirrors" parameter. Unfortunately, we were unable to use this mirror process for the other docker registries.
{
  "debug": false,
  "data-root": "/home/docker-images",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2",
  "registry-mirrors": [
    "https://sm.aks.iam.anapartner.net:444"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
Let’s Encrypt Certbot and DNS validation
We use the Let's Encrypt Certbot and manual DNS validation to create our 90-day wildcard certificates. Manual DNS validation allows us to avoid setting up a public-facing component for our internal labs.
Ref: https://letsencrypt.org/docs/challenge-types/
# Step 1: Install SNAP service for Certbot usage on your host OS
cat /etc/redhat-release
Red Hat Enterprise Linux release 8.3 (Ootpa)
sudo yum install -y snapd
Updating Subscription Management repositories.
Package snapd-2.49-2.el8.x86_64 is already installed.
systemctl enable --now snapd.socket
### Wait 1 min
snap install core; sudo snap refresh core
# Step 2: Remove prior certbot (if installed by yum/dnf)
yum remove -y certbot
# Step 3: Install new "classic" Certbot
sudo snap install --classic certbot
certbot 1.17.0 from Certbot Project (certbot-eff✓) installed
sudo ln -s /snap/bin/certbot /usr/bin/certbot
# Step 4: Issue certbot command with wildcard cert & update your DNS TXT record with the string provided.
sudo certbot certonly --manual --preferred-challenges dns -d *.aks.iam.anapartner.org --register-unsafely-without-email
Saving debug log to /var/log/letsencrypt/letsencrypt.log
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please read the Terms of Service at
https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf. You must
agree in order to register with the ACME server. Do you agree?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: Y
Account registered.
Requesting a certificate for *.aks.iam.anapartner.org
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please deploy a DNS TXT record under the name:
_acme-challenge.iam.anapartner.org.
with the following value:
u2cXXXXXXXXXXXXXXXXXXXXc
Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with aid of online tools, such as the Google
Admin Toolbox: https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.iam.anapartner.org.
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Step 5: In a 2nd terminal, validate that the DNS record has been updated and can be seen by a standard DNS query. Keep the 2nd console window open to test the DNS record before pressing <ENTER> on the verification request.
# Example:
nslookup -type=txt _acme-challenge.aks.iam.anapartner.org
Non-authoritative answer:
_acme-challenge.aks.iam.anapartner.org text = "u2cXXXXXXXXXXXXXXXXXXXXc"
# Step 6: Press <ENTER> after you have validated the TXT record.
Press Enter to Continue
Waiting for verification...
Cleaning up challenges
Subscribe to the EFF mailing list (email: nala@baugher.us).
IMPORTANT NOTES:
- Congratulations! Your certificate and chain have been saved at:
/etc/letsencrypt/live/aks.iam.anapartner.org/fullchain.pem
Your key file has been saved at:
/etc/letsencrypt/live/aks.iam.anapartner.org/privkey.pem
# Step 7: View certs of fullchain.pem & privkey.pem
cat /etc/letsencrypt/live/aks.iam.anapartner.org/fullchain.pem
-----BEGIN CERTIFICATE-----
<REMOVED>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<REMOVED>
-----END CERTIFICATE-----
cat /etc/letsencrypt/live/aks.iam.anapartner.org/privkey.pem
-----BEGIN PRIVATE KEY-----
<REMOVED>
-----END PRIVATE KEY-----
# Step 8: Use the two files for your kubernetes solution
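# Example sketch (assumption - adjust the secret name and namespace to your charts): load the wildcard
# cert and key into a Kubernetes TLS secret that your ingress objects can reference.
kubectl create secret tls wildcard-aks-tls \
  --cert=/etc/letsencrypt/live/aks.iam.anapartner.org/fullchain.pem \
  --key=/etc/letsencrypt/live/aks.iam.anapartner.org/privkey.pem \
  -n default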
# Step 9: Ensure the domain in /etc/resolv.conf on the host OS, CP, and worker nodes is set correctly to aks.iam.anapartner.org so the certs resolve correctly.
# Step 10: Ensure the Synology NAS DNS service is configured with all aliases.
# Step 11: Optional: Validate certs with openssl
# Show the kubernetes self-signed cert
true | openssl s_client -connect kibana.aks.iam.anapartner.org:443 2>/dev/null | openssl x509 -inform pem -noout -text
# Show the new wildcard cert for same hostname & port
curl -vvI https://kibana.aks.iam.anapartner.org/app/home#/
curl -vvI https://kibana.aks.iam.anapartner.org/app/home#/ 2>&1 | awk 'BEGIN { cert=0 } /^\* SSL connection/ { cert=1 } /^\*/ { if (cert) print }'
nmap -p 443 --script ssl-cert kibana.aks.iam.anapartner.org
Kubernetes Side Note: Let's Encrypt certs do NOT show up within the Kubernetes cluster certs check process.
kubeadm certs check-expiration
View of the DNS TXT records to be updated with your DNS service provider. The Let's Encrypt Certbot will need to query these records before it assigns you wildcard certificates. Create the _acme-challenge hostname entry as a TXT type and paste in the string provided by the Let's Encrypt Certbot process. Wait 5 minutes or test the TXT record with nslookup, then upon positive validation, continue the Let's Encrypt Certbot process.

View your Kubernetes cluster / nodes for any constraints
After your cluster is created and you have worker nodes joined to the cluster, you may wish to monitor your on-prem deployment for any constraints. The kubectl command with the action verb describe or top is very useful for this goal.
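A short sketch (note: kubectl top requires the metrics-server add-on to be deployed; otherwise it returns an error):
### Per-node capacity versus what is already allocated.
kubectl describe nodes | grep -A 8 "Allocated resources"
### Live CPU/memory usage (needs metrics-server).
kubectl top nodes
kubectl top pods -A --sort-by=memory | head -20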


Kubernetes Training (Formal)
If you are new to Kubernetes, we recommend the following class. You may need to dedicate 4-8 weeks to complete the course and then take the CKA exam via the Linux Foundation.
https://www.udemy.com/course/certified-kubernetes-administrator-with-practice-tests/ .
Kubernetes.io site has most of the information you need to get started.