Secure Application Introspection

Locate “the good, the bad, and the ugly” data with a transparent proxy.

Have you been frustrated with an enterprise/cloud solution’s API implementation or documentation, where a single case-sensitive data field entry delays progress? Does the solution have undocumented features for older client tools? Do you wish to know what your mobile apps or laptop send to the internet?

Utilizing a proxy can help with all the above, and if the process is quick and straightforward, so much the better.

Setting up a proxy typically involves a fair amount of effort and several steps. You may need to modify a client host or mobile phone to redirect web traffic, either with the OS environment variables HTTP_PROXY and HTTPS_PROXY or by adjusting the underlying OS network/iptables rules. Previously, we set up the open-source JMeter proxy with these OS environment variables to capture secure traffic data, and this process works well for most applications.
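For reference, routing client traffic through a proxy with OS environment variables typically looks like the following (the proxy host/port are placeholder values):

export HTTP_PROXY=http://192.168.2.10:8080
export HTTPS_PROXY=http://192.168.2.10:8080
export NO_PROXY=localhost,127.0.0.1
# Tools that honor these variables (e.g. curl) will now route through the proxy:
curl -v https://www.example.com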

Additionally, the Firefox browser allows a manual proxy configuration independent of the OS environment settings, useful when we wish to capture the user experience and any data challenges. The example below shows modifying a Firefox browser to use a “manual proxy configuration” instead of system/auto configurations.

However, the above process may fail if client tools or mobile apps do not honor the OS environment variables, so a more thorough method is needed to ensure accurate capture of web traffic submissions.

We have found an excellent combination: the open-source tool MITMproxy, deployed with podman, together with its embedded WireGuard VPN feature.

The process in six (6) steps:

  1. Deploy the WireGuard VPN client on the client host (MS Win/Linux/Mobile).
  2. Deploy MITMproxy using podman (or docker) with WireGuard mode/configuration.
  3. Edit the wireguard.conf file to have the correct public IP address, import this file into the WireGuard VPN client, and establish the VPN connection.
  4. Copy mitmproxy-ca-cert.cer to the client component’s Java or OS keystore (if needed) as a trusted CA cert.
  5. Open the MITMproxy Web UI or monitor the command-line dashboard.
  6. Execute your test on the client host and view both request and response in the MITMproxy Web UI.

MITMproxy UI with WireGuard mode enabled.

The WireGuard client configuration is provided in three (3) places: the MITMproxy logs (podman logs mitmproxy), the text file wireguard.conf (if podman/docker volumes are enabled), and the MITMproxy UI. A QR code is provided for mobile phone use, but since the public IP address shown in this view is not correct, you will need to manually edit the configuration on your mobile phone for those use cases to set the correct endpoint IP address.

MITMproxy UI with standard proxy configuration mode.

Bash Script:

This script deploys MITMproxy with podman on a Linux OS with two (2) configurations: WireGuard mode, for any client applications that do not honor HTTP_PROXY/HTTPS_PROXY, and standard proxy mode. The script uses a shared volume so both containers use the SAME certs, avoiding the need to manage different certs upon restart of the containers.

#!/bin/bash
######################################################################################
#
#  Deploy MITMproxy with two (2) configurations:
#
#     MITMProxy with WireGuard mode enabled (UDP 51820) and Web UI (TCP 8081)
#     MITMProxy with standard proxy enabled (TCP 9080) and Web UI (TCP 9081)
#
#  Notes:  Use podman exec to check path and env variables
#    - Binaries:  dnf -y install podman 
#    - Use shared folder to avoid having two (2) different configuration files for both copies
#    - Do not forget the :z for -v volumes to avoid permissions issues
#    - Do not forget quotes around env -e variables
#    - Use --rm as needed
#    - Use this switch as needed, but do not leave it on:   --log-level debug \
#
#   Basic:  podman run -it -v /tmp/mitmproxy/:/home/mitmproxy/.mitmproxy:z -p 8080:8080 mitmproxy/mitmproxy
#   Logs:   podman logs mitmproxy-wireguard
#   Shell:  podman exec -it -u root mitmproxy bash
#
#  Options Ref.  https://docs.mitmproxy.org/stable/concepts-options/
#   - added stream_large_bodies=10m to lower impact to mitmproxy due
#       to possible large json/xml payloads 
#
#  ANA 07/2023
#
######################################################################################

MITMPROXY_HOMEPATH=/tmp/mitmproxy
echo ""
echo "You may delete the shared folder of ${MITMPROXY_HOMEPATH}"
echo "to remove prior configuration of mitmproxy certs & wireguard.conf files"
echo ""
#sudo rm -rf ${MITMPROXY_HOMEPATH}

mkdir -p ${MITMPROXY_HOMEPATH}
chmod -R 777 ${MITMPROXY_HOMEPATH}
ls -hlrt ${MITMPROXY_HOMEPATH}

echo ""
echo " Starting mitmproxy-wireguard proxy "
podman rm mitmproxy-wireguard -f  &>/dev/null
podman run -d -it --name mitmproxy-wireguard \
   -p 51820:51820/udp -p 8081:8081 \
   -l mitmproxy \
   -v ${MITMPROXY_HOMEPATH}:/home/mitmproxy/.mitmproxy:z  \
    docker.io/mitmproxy/mitmproxy \
    mitmweb --mode wireguard --ssl-insecure  --web-host 0.0.0.0 --web-port 8081 --set stream_large_bodies=10m


echo ""
echo " Starting mitmproxy-default proxy "
podman rm mitmproxy-default -f  &>/dev/null
podman run -d -it --name mitmproxy-default \
    -p 9080:9080 -p 9081:9081 \
    -l mitmproxy  \
    -v ${MITMPROXY_HOMEPATH}:/home/mitmproxy/.mitmproxy:z  \
     docker.io/mitmproxy/mitmproxy \
     mitmweb --set listen_port=9080 --web-host 0.0.0.0 --web-port 9081

echo ""
echo ""
echo "###############################################################################"
echo ""
echo " Running Podman Containers for MITMproxy"
sleep 5
podman ps -a --no-trunc | grep -i mitmproxy
echo ""
echo "###############################################################################"
podman logs  mitmproxy-default
echo ""
echo " Monitor the mitmproxy-default UI @ http://$(curl -s ifconfig.me):9081 "
echo "###############################################################################"
podman logs  mitmproxy-wireguard
echo ""
echo " Monitor the mitmproxy-wireguard UI @ http://$(curl -s ifconfig.me):8081 "
echo "###############################################################################"
echo ""
echo "Please update the mitmproxy wireguard client configuration endpoint address to:  $(curl -s ifconfig.me)"
echo ""
echo "###############################################################################"
echo ""

MITMproxy CERTS:

Add the mitmproxy-ca-cert to the trusted root certs folder of your client host OS keystore (MS Win: certlm.msc), and/or if there is a Java keystore for the client tool, add mitmproxy-ca-cert.cer as a trusted cert:

keytool -import -trustcacerts -file mitmproxy-ca-cert.cer -alias mitmproxy -keystore capam.keystore
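On a RHEL-family Linux client host, a sketch of the equivalent OS trust store update (assuming the default ca-trust layout):

sudo cp mitmproxy-ca-cert.cer /etc/pki/ca-trust/source/anchors/mitmproxy-ca-cert.crt
sudo update-ca-trust extract
# Confirm the CA now appears in the consolidated bundle:
grep -c "BEGIN CERTIFICATE" /etc/pki/tls/certs/ca-bundle.crt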

WireGuard client configuration:

To ensure that only selected web traffic is monitored through the WireGuard VPN to mitmproxy, make changes to the wireguard.conf file before importing it. Specifically, update the AllowedIPs address field to include a single IP address. Additionally, modify the Endpoint field to direct traffic to the public IP address of the mitmproxy host on UDP port 51820. If deploying mitmproxy on AWS or other cloud hosts, confirm that the firewall/security groups permit TCP 8080, 8081, 9080, 9081, and UDP 51820. Once you have activated the WireGuard client, test your processes on the host and monitor the MITMproxy UI for updates.
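A representative wireguard.conf after these edits; the keys shown are placeholders for the values mitmproxy generates, AllowedIPs is narrowed to one target web service IP, and Endpoint is the mitmproxy host’s public IP address:

[Interface]
PrivateKey = <client private key generated by mitmproxy>
Address = 10.0.0.1/32
DNS = 10.0.0.53

[Peer]
PublicKey = <server public key generated by mitmproxy>
AllowedIPs = 203.0.113.50/32
Endpoint = 198.51.100.25:51820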

An example of data captured between two (2) CLI tools. These CLI tools did not honor the OS environment variables HTTP_PROXY & HTTPS_PROXY. Using the MITMproxy with WireGuard process, we can now confirm the delta in submission behavior that was masked by the CLI tools. This process was useful to confirm that MS PowerShell was removing special characters from a password string, e.g. ! (exclamation mark).

Example of script deploying two (2) MITMproxy containers

Adding wildcard certificates to Virtual Appliance

While preparing to enable a feature within the Identity Suite Virtual Appliance for TLS encryption for the Provisioning Tier to send notification events, we noticed some challenges that we wish to clarify.

The Identity Suite Virtual Appliance has four (4) web services that use pre-built self-signed certificates when first deployed. Documentation is provided to change these certificates/keys using aliases or soft-links.

One of the challenges we discovered is the Provisioning Tier may be using an older version of libcurl & OpenSSL that have constraints that need to be managed. These libraries are used during the web submission to the IME ETACALLBACK webservice. We will review the processes to capture these error messages and how to address them.

We will introduce the use of Let’s Encrypt wildcard certificates into the four (4) web services and the Provisioning Server’s ETACALLBACK use of a valid public root certificate.

The Apache HTTPD service is used both as a reverse proxy (TCP 443) to the three (3) Wildfly services and as the service for the vApp Management Console (TCP 10443). The Apache HTTPD service SSL certs use the path /etc/pki/tls/certs/localhost.crt for a self-signed certificate. A soft-link redirects this to a location that the ‘config’ service ID has access to modify. The same is true for the private key.

/etc/pki/tls/certs/localhost.crt -> /opt/CA/VirtualAppliance/custom/apache-ssl-certificates/localhost.crt

/etc/pki/tls/private/localhost.key -> /opt/CA/VirtualAppliance/custom/apache-ssl-certificates/localhost.key

A view of the Apache HTTPD SSL self-signed certificate and key.

The three (3) Wildfly services are deployed for the Identity Manager, Identity Governance, and Identity Portal components. The configuration for TLS security is defined within the primary Wildfly configuration file, standalone.xml. The current configuration is already set up with the paths to these PKCS12 keystore files:

/opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caim-srv

/opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caig-srv

/opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caip-srv

A view of the three (3) Wildfly PKCS12 keystore files and view of the self-signed cert/key with the pseudo hostname of the vApp host.

Provisioning Server process for TLS enablement for IME ETACALLBACK process.

Step 1. Ensure that the Provisioning Server is enabled to send data/notification events to the IME.

Step 2. Within the IME Management Console, there is a baseURL parameter. This string is sent down to the Provisioning Server upon restart of the IME and appended to a list. This list is viewable and manageable within the Provisioning Manager UI under [System/Identity Manager Setup]. The URL string will be appended with the string ETACALLBACK/?env=identityEnv. Within the Provisioning Server, we can manage which URLs have priority in the list. This is a failover list, not load-balancing. We have the opportunity to introduce an F5 or similar load balancer URL, but we should enable TLS security first.

Step 3. Add the public root CA cert or CA chain certs to the following location: [System/Domain Configuration/Identity Manager Server/Trusted CA Bundle]. This PEM file may be placed in the Provisioning Server bin folder with no path, or a fully qualified path to the PEM file may be used. Note: The Provisioning Server uses a version of openssl/libcurl that will report errors, which can be managed with wildcard certificates. We will show the common errors in this blog entry.

Let’s Encrypt (https://letsencrypt.org/) Certificates

Let’s Encrypt offers a free service to build wildcard certificates. We are fond of using their DNS method to request a wildcard certificate.

sudo certbot certonly --manual  --preferred-challenges dns -d *.aks.iam.anapartner.dev --register-unsafely-without-email

Let’s Encrypt will provide four (4) files to be used. [certN.pem, privkeyN.pem, chainN.pem, fullchainN.pem]

cert1.pem   [The primary server side wildcard cert]

privkey1.pem   [The primary server side private key associated with the wildcard cert]

chain1.pem   [The intermediate chain certs that are needed to validate the cert1 cert]

fullchain1.pem    [The two files cert1.pem and chain1.pem concatenated in the correct order]

NOTE:  fullchain1.pem is the file you typically would use as the cert for a solution, so the solution will also have the intermediate CA chain certs for validation.

Important Note: One of the root public certs was cross-signed by another root public cert that has since expired. Most solutions are able to manage this challenge, but the provisioning service ETACALLBACK has a challenge with an expired certificate. There are replacements for this expired certificate, which we will walk through below. Ref: https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/

Create a new CA chain PEM file for LE (Let’s Encrypt) validation to use with the Provisioning Server.

CERT=lets-encrypt-r3.pem;curl -s -O -L https://letsencrypt.org/certs/$CERT ; openssl x509 -text -noout -in $CERT | grep -i -e issue -e not -e subject ; ls -lart $CERT

CERT=isrgrootx1.pem;curl -s -O -L https://letsencrypt.org/certs/$CERT ; openssl x509 -text -noout -in $CERT | grep -i -e issue -e not -e subject ; ls -lart $CERT

CERT=isrg-root-x2.pem;curl -s -O -L https://letsencrypt.org/certs/$CERT ; openssl x509 -text -noout -in $CERT | grep -i -e issue -e not -e subject ; ls -lart $CERT

cat lets-encrypt-r3.pem isrgrootx1.pem isrg-root-x2.pem > combine-chain-letsencrypt.pem
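Before handing the combined chain to the Provisioning Server, a quick sanity check with openssl verify (cert1.pem is the wildcard cert from the earlier certbot request) should return OK:

openssl verify -CAfile combine-chain-letsencrypt.pem cert1.pem
# Expected output:  cert1.pem: OK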

Replacing the certificates for the vApp Apache, Wildfly (3), and Provisioning Server (ETACALLBACK)

Apache HTTPD Service (TCP 443/10443) (May need to reboot vApp)

cp -r -p  /home/config/aks.iam.anapartner.dev/fullchain2.pem /opt/CA/VirtualAppliance/custom/apache-ssl-certificates/localhost.crt

cp -r -p  /home/config/aks.iam.anapartner.dev/privkey2.pem  /opt/CA/VirtualAppliance/custom/apache-ssl-certificates/localhost.key

Wildfly Services (TCP 8443/8444/8445) for IM, IG, and IP (restart services after update)

A view of the Wildfly (Java) services for IM, IG, and IP.
openssl pkcs12 -export -inkey /home/config/aks.iam.anapartner.dev/privkey2.pem -in /home/config/aks.iam.anapartner.dev/fullchain2.pem -out /opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caim-srv -password pass:changeit
restart_im

openssl pkcs12 -export -inkey /home/config/aks.iam.anapartner.dev/privkey2.pem -in /home/config/aks.iam.anapartner.dev/fullchain2.pem -out /opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caig-srv -password pass:changeit
restart_ig

openssl pkcs12 -export -inkey /home/config/aks.iam.anapartner.dev/privkey2.pem -in /home/config/aks.iam.anapartner.dev/fullchain2.pem -out /opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caip-srv -password pass:changeit
restart_ip
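To confirm each PKCS12 keystore was written with the expected wildcard cert, keytool can list its contents (the store password matches the -password value used above):

keytool -list -v -keystore /opt/CA/VirtualAppliance/custom/wildfly-ssl-certificates/caim-srv -storetype PKCS12 -storepass changeit | grep -i -e alias -e valid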

Provisioning Server ETACALLBACK public certificate location (restart imps service) [Place in bin folder]

su - imps
cp -r -p /home/config/aks.iam.anapartner.dev/combine-chain-letsencrypt.pem /opt/CA/IdentityManager/ProvisioningServer/bin/
imps stop; imps start

Validation of updated services.

Use openssl s_client to validate certificates being used. Examples below for TCP 443 and 8443

true | openssl s_client -connect vapp143.aks.iam.anapartner.dev:443 -CAfile combine-chain-letsencrypt.pem  | grep "Verify return code"

true | openssl s_client -connect vapp143.aks.iam.anapartner.dev:8443 -CAfile combine-chain-letsencrypt.pem  | grep "Verify return code"

To view all certs in the chain, use the below openssl s_client command with -showcerts switch:

true | openssl s_client -connect vapp143.aks.iam.anapartner.dev:443 -CAfile combine-chain-letsencrypt.pem  -showcerts

true | openssl s_client -connect vapp143.aks.iam.anapartner.dev:8443 -CAfile combine-chain-letsencrypt.pem  -showcerts

Validate with browsers and view the HTTPS lock symbol to view the certificate

Test with an update to a Provisioning Global User’s attribute. [Note: No need to sync to accounts.] Ensure that the Identity Manager Setup Log Level = DEBUG to monitor this submission in the Provisioning Server etanotifyXXXXXXX.log.

A view of the submission for updating the Global User’s Description via IMPS (IM Provisioning Server) etanotifyXXXXXXX.log. The configuration will be loaded for using the URLs defined. Then we can monitor for the submission of the update.

Finally, a view using the IME VST (View Submitted Tasks) for the ETACALLBACK process using the task Provisioning Modify User.

Common TLS errors seen with the Provisioning Server ETACALLBACK

Ensure that the configuration is enabled for debug log level, so we may view these errors to correct them:

[rc=77] occurs if the PEM file does not exist or is not in the correct path.

[rc=51] occurs if the URL defined does not match the exact server-side certificate (a good reason to use a wildcard certificate, or to adjust your URL FQDN to match the cert subject CN=XXXX value).

[rc=60] occurs if the remote web service is using a self-signed certificate, or if any certificate in the chain, or the public root CA cert, has expired.

Other Error messages (curl)

If you see an error message with Apache HTTPD (TCP 443) with curl about “curl: (60) Peer certificate cannot be authenticated with known CA certificates”, please ignore this, as the vApp does not have the “ca-bundle.crt” configuration enabled. See RedHat note: https://access.redhat.com/solutions/523823

References

https://knowledge.broadcom.com/external/article?articleId=54198
https://community.broadcom.com/HigherLogic/System/DownloadDocumentFile.ashx?DocumentFileKey=849ea21f-cc5a-4eac-9988-465a75165cf1
https://curl.se/libcurl/c/libcurl-env.html
https://knowledge.broadcom.com/external/article/204213/how-to-setup-inbound-notifications-to-us.html
https://knowledge.broadcom.com/external/article/213480/how-to-replace-the-vapp-wildfly-ssl-cert.html
https://www.stephenwagner.com/2021/09/30/sophos-dst-root-ca-x3-expiration-problems-fix/

OpenShift on VMware Workstation

Red Hat OpenShift is one of the container orchestration platforms that provide an enterprise-grade solution for deploying, running, and managing applications on public, on-premise, or hybrid cloud environments.

This blog entry outlines the high-level architecture of a LAB OpenShift on-prem cloud environment built on VMware Workstation infrastructure.

Red Hat OpenShift and the customized ISO image with Red Hat Core OS provide a straightforward process to build your lab and can help lower the training cost. You may watch the end-to-end process in the video below or follow this blog entry to understand the overall process.

Requirements:

  • Red Hat Developer Account w/ Red Hat Developer Subscription for Individuals
  • Local DNS to resolve a minimum of three (3) addresses for OpenShift. (api.[domain], api-int.[domain], *.apps.[domain])
  • DHCP Server (may use VMware Workstation NAT’s DHCP)
  • Storage (recommend using NFS for on-prem deployment/lab) for OpenShift logging/monitoring & any db/dir data to be retained.
  • SSH Terminal Program w/ SSH Key.
  • Browser(s)
  • Front Loader/Load Balancer (HAProxy)
  • VMware Workstation Pro 16.x
  • Specs: (We used more than the minimum recommended by OpenShift to prepare for other applications)
    • Three (3) Control Planes Nodes @ 8 vCPU/16 GB RAM/100 GB HDD with “Red Hat Enterprise Linux 8 x64 bit” Guest OS Type
    • Four (4) Worker Nodes @ 4 vCPU/16 GB RAM/100 GB HDD with “Red Hat Enterprise Linux 8 x64” Guest OS Type

Post-deployment efforts: Apply these to provide additional value. [Included as examples below.]

  • Add entropy service (haveged) to all nodes/pods to increase security & performance.
  • Let’s Encrypt wildcard certs for *.[DOMAIN] and *.apps.[DOMAIN] to avoid self-signed certs for external UIs and avoid using “thisisunsafe” within the Chrome browser to access the local OpenShift console.
  • Update OpenShift Ingress to be aware of more than two (2) worker nodes.
  • Update OpenShift to use NFS as the default storage class (see the sketch after this list).
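For the NFS default-storage item above, a minimal sketch, assuming a StorageClass named nfs-client created by an external NFS provisioner:

oc patch storageclass nfs-client -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
oc get storageclass    # the default class is flagged with (default)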

Below is a view of our footprint to deploy the OpenShift 4.x environment on a local data center hosted by VMware Workstation.

Red Hat OpenShift provides three (3) options to deploy: Cloud, Datacenter, and Local. Local is similar to minikube for your laptop/workstation with a few pods. The Red Hat OpenShift license for Cloud requires deployment on other vendors’ sites for the nodes (cpu/ram/disk) and load balancers. If you deploy OpenShift on AWS or GCP, plan a budget of $500/mo per resource for the assets.

After reviewing the open-source OKD solution and the various OpenShift deployment methods, we selected the “DataCenter” option within OpenShift. Two (2) points made this decision easy.

  • Red Hat OpenShift offers a sixty (60) day eval license.
    • This license can be restarted for another sixty (60) days if you delete/archive the last cluster.
  • Red Hat OpenShift provides a customized ISO image with Red Hat Core OS, ignition yaml files, and an embedded SSH public key that does a lot of the heavy lifting for setting up the cluster.

The below screen showcases the process that Red Hat uses to build a bootstrap ISO image using Red Hat Core OS, Ignition yaml files (to determine node type of control plane/worker node), and the embedded SSH Key. This process provides a lot of value to building a cluster and streamlines the effort.

DNS Requirement

The minimal DNS entries required for OpenShift are three (3) addresses:

api.[domain]

api-int.[domain]

*.apps.[domain]

https://docs.openshift.com/container-platform/4.11/installing/installing_platform_agnostic/installing-platform-agnostic.html
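A sketch of the corresponding zone records, assuming the lab domain okd.anapartner.dev with HAProxy listening at 192.168.2.101 (adjust names/IPs for your environment):

api.okd.anapartner.dev.        IN  A  192.168.2.101
api-int.okd.anapartner.dev.    IN  A  192.168.2.101
*.apps.okd.anapartner.dev.     IN  A  192.168.2.101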

Front Load Balancer (HAProxy)

Update haproxy.cfg as needed for IP addresses/ports. To avoid deploying HAProxy twice, we use the “bind” directive with two (2) host IP addresses in a single HAProxy configuration, preventing conflicts on the port 80/443 redirect for both OpenShift and another application deployed on OpenShift.

# Global settings
# Set $IP_RANGE as an OS ENV or Global variable before running HAPROXY
#   Important: If using VMworkstation NAT ensure this range is correctly defined to
#   avoid error message with x509 error on port 22623 upon startup on control planes
#
#   Ensure the 3XXXX PORTs are defined correctly from the ingress
#    - We have predefined these ports to 32080 and 32443 for helm deployment of ingress
#    oc -n ingress get svc
#
#---------------------------------------------------------------------
global
    setenv IP_RANGE 192.168.243
    setenv HA_BIND_IP1 192.168.2.101
    setenv HA_BIND_IP2 192.168.2.111
    maxconn     20000
    log         /dev/log local0 info
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    log                     global
    mode                    http
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option                  redispatch
    option forwardfor       except 127.0.0.0/8
    retries                 3
    maxconn                 20000
    timeout http-request    10000ms
    timeout http-keep-alive 10000ms
    timeout check           10000ms
    timeout connect         40000ms
    timeout client          300000ms
    timeout server          300000ms
    timeout queue           50000ms

# Enable HAProxy stats
# Important Note:  Patch OpenShift Ingress to allow internal RHEL CoreOS haproxy to run on additional worker nodes
#  oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 7}}' --type=merge
#
listen stats
    bind :9000
    stats uri /
    stats refresh 10000ms

# Kube API Server
frontend k8s_api_frontend
    bind :6443
    default_backend k8s_api_backend
    mode tcp
    option tcplog

backend k8s_api_backend
    mode tcp
    balance source
    server      ocp-cp-1_6443        "$IP_RANGE".128:6443 check
    server      ocp-cp-2_6443        "$IP_RANGE".129:6443 check
    server      ocp-cp-3_6443        "$IP_RANGE".130:6443 check

# OCP Machine Config Server
frontend ocp_machine_config_server_frontend
    mode tcp
    bind :22623
    default_backend ocp_machine_config_server_backend
    option tcplog

backend ocp_machine_config_server_backend
    mode tcp
    balance source
    server      ocp-cp-1_22623        "$IP_RANGE".128:22623 check
    server      ocp-cp-2_22623        "$IP_RANGE".129:22623 check
    server      ocp-cp-3_22623        "$IP_RANGE".130:22623 check

# OCP Machine Config Server #2
frontend ocp_machine_config_server_frontend2
    mode tcp
    bind :22624
    default_backend ocp_machine_config_server_backend2
    option tcplog

backend ocp_machine_config_server_backend2
    mode tcp
    balance source
    server      ocp-cp-1_22624        "$IP_RANGE".128:22624 check
    server      ocp-cp-2_22624        "$IP_RANGE".129:22624 check
    server      ocp-cp-3_22624        "$IP_RANGE".130:22624 check


# OCP Ingress - layer 4 tcp mode for each. Ingress Controller will handle layer 7.
frontend ocp_http_ingress_frontend
    bind "$HA_BIND_IP1":80
    default_backend ocp_http_ingress_backend
    mode tcp
    option tcplog

backend ocp_http_ingress_backend
    balance source
    mode tcp
    server      ocp-w-1_80     "$IP_RANGE".131:80 check
    server      ocp-w-2_80     "$IP_RANGE".132:80 check
    server      ocp-w-3_80     "$IP_RANGE".133:80 check
    server      ocp-w-4_80     "$IP_RANGE".134:80 check
    server      ocp-w-5_80     "$IP_RANGE".135:80 check
    server      ocp-w-6_80     "$IP_RANGE".136:80 check
    server      ocp-w-7_80     "$IP_RANGE".137:80 check

frontend ocp_https_ingress_frontend
    bind "$HA_BIND_IP1":443
    default_backend ocp_https_ingress_backend
    mode tcp
    option tcplog

backend ocp_https_ingress_backend
    mode tcp
    balance source
    server      ocp-w-1_443     "$IP_RANGE".131:443 check
    server      ocp-w-2_443     "$IP_RANGE".132:443 check
    server      ocp-w-3_443     "$IP_RANGE".133:443 check
    server      ocp-w-4_443     "$IP_RANGE".134:443 check
    server      ocp-w-5_443     "$IP_RANGE".135:443 check
    server      ocp-w-6_443     "$IP_RANGE".136:443 check
    server      ocp-w-7_443     "$IP_RANGE".137:443 check

######################################################################################

# VIPAUTHHUB Ingress
frontend vip_http_ingress_frontend
    bind "$HA_BIND_IP2":80
    mode tcp
    option forwardfor
    option http-server-close
    default_backend vip_http_ingress_backend

backend vip_http_ingress_backend
    mode tcp
    balance roundrobin
    server      vip-w-1_32080     "$IP_RANGE".131:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-2_32080     "$IP_RANGE".132:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-3_32080     "$IP_RANGE".133:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-4_32080     "$IP_RANGE".134:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-5_32080     "$IP_RANGE".135:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-6_32080     "$IP_RANGE".136:32080 check fall 3 rise 2 send-proxy-v2
    server      vip-w-7_32080     "$IP_RANGE".137:32080 check fall 3 rise 2 send-proxy-v2

frontend vip_https_ingress_frontend
    bind "$HA_BIND_IP2":443
    # mgmt-sspfqdn
    acl is_mgmt_ssp hdr_end(host) -i mgmt-ssp.okd.anapartner.dev
    use_backend vip_ingress-nodes_mgmt-nodeport if is_mgmt_ssp
    mode tcp
    #option forwardfor
    option http-server-close
    default_backend vip_https_ingress_backend

backend vip_https_ingress_backend
    mode tcp
    balance roundrobin
    server      vip-w-1_32443     "$IP_RANGE".131:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-2_32443     "$IP_RANGE".132:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-3_32443     "$IP_RANGE".133:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-4_32443     "$IP_RANGE".134:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-5_32443     "$IP_RANGE".135:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-6_32443     "$IP_RANGE".136:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-7_32443     "$IP_RANGE".137:32443 check fall 3 rise 2 send-proxy-v2

backend vip_ingress-nodes_mgmt-nodeport
    mode tcp
    balance roundrobin
    server      vip-w-1_32443     "$IP_RANGE".131:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-2_32443     "$IP_RANGE".132:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-3_32443     "$IP_RANGE".133:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-4_32443     "$IP_RANGE".134:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-5_32443     "$IP_RANGE".135:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-6_32443     "$IP_RANGE".136:32443 check fall 3 rise 2 send-proxy-v2
    server      vip-w-7_32443     "$IP_RANGE".137:32443 check fall 3 rise 2 send-proxy-v2

######################################################################################

Use the following commands to add a 2nd IP address to one NIC on the main VMware Workstation host, where NIC = eno1 and the 2nd IP address = 192.168.2.111.

nmcli dev show eno1
sudo nmcli dev mod eno1 +ipv4.address 192.168.2.111/24

VMware Workstation Hosts / Nodes

When building the VMware hosts, ensure that you use Guest Type “Red Hat Enterprise Linux 8 x64” to match the embedded Red Hat Core OS provided in an ISO image. Otherwise, DHCP services may not work correctly, and when the VMware host boots, it may not receive an IP address.

The VMware hosts for Control Plane Nodes are recommended to be 8 vCPU, 16 GB RAM, and 100 GB HDD. The VMware hosts for Worker Nodes are recommended to be 4 vCPU, 16 GB RAM, and 100 GB HDD.
OpenShift requires a minimum of three (3) Control Plane Nodes and two (2) Worker Nodes. Please check with any solution you may deploy and adjust the parameters as needed. We will deploy four (4) Worker Nodes for the Symantec VIP Auth Hub solution, and horizontally scale the solution with more worker nodes for Symantec API Manager and Siteminder.

Before starting any of these images, create a local snapshot as a “before” state. This will allow you to redeploy with minimal impact if there is any issue.

Before starting the deployment, you may wish to create a new NAT VMware Network, to avoid impacting any existing VMware images on the same address range. We will be adjusting the dhcpd.conf and dhcpd.leases files for this network.

To avoid an issue with reverse DNS lookup within pods and containers, remove a default value from dhcpd.conf: stop the VMware network, remove or comment out the line “option domain-name localdomain;” (scripted below), remove any dhcpd.leases information, then restart the VMware network.

ls -lart /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
sudo /usr/bin/vmware-networks --stop ; echo ""
sudo cp /dev/null /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
ls -lart /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
cat      /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
sudo /usr/bin/vmware-networks --start ; echo ""
ls -lart /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
cat      /etc/vmware/vmnet8/dhcpd/dhcpd.leases ; echo ""
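The dhcpd.conf edit mentioned above may also be scripted while the VMware network is stopped; a sketch that comments out only the localdomain line (vmnet8 path assumed):

sudo sed -i '/option domain-name .*localdomain/s/^/#/' /etc/vmware/vmnet8/dhcpd/dhcpd.conf
grep -n "domain-name" /etc/vmware/vmnet8/dhcpd/dhcpd.conf    # domain-name-servers should remain active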

OpenShift / Kubernetes / Helm Command Line Binaries

Download these two (2) client packages to obtain three (3) binaries for interfacing with the OpenShift/Kubernetes API Server.

Download Openshift Binaries for remote management (on main host)
#########################
sudo su -
cd /tmp/openshift
curl -skOL https://mirror.openshift.com/pub/openshift-v4/clients/helm/latest/helm-linux-amd64.tar.gz ; tar -zxvf helm-linux-amd64.tar.gz
curl -skOL https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/stable/openshift-client-linux.tar.gz ; tar -zxvf openshift-client-linux.tar.gz
mv -f oc /usr/bin/oc
mv -f kubectl /usr/bin/kubectl
mv -f helm-linux-amd64 /usr/local/bin/helm
oc version
helm version
kubectl version

Start an OpenShift Cluster Deployment

OpenID Configuration with OpenShift

Post-deployment step: After you have deployed the OpenShift cluster, you will be asked to create an IDP to authenticate other accounts. Below is an example with OpenShift and MS Azure. The image below showcases the parameters and values to be shared between the two solutions.
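A sketch of the resulting OpenShift OAuth resource for an OpenID IDP; the tenant ID, client ID, and secret name are placeholders taken from your MS Azure app registration:

apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: azuread                       # name shown on the OpenShift login page
    mappingMethod: claim
    type: OpenID
    openID:
      clientID: <azure-application-client-id>
      clientSecret:
        name: openid-client-secret      # secret created in the openshift-config namespace
      claims:
        preferredUsername:
        - email
        name:
        - name
        email:
        - email
      issuer: https://login.microsoftonline.com/<azure-tenant-id>/v2.0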

Entropy DaemonSet for OpenShift Nodes/Pods

We can validate the entropy on OpenShift nodes or pods via use of /dev/random. We prefer to emulate 1000 password changes to showcase how rapidly the 4K entropy pool is depleted when a security process accesses it. An example of the single-line bash code:

Validate Entropy in Openshift Nodes [Before/After use of Haveged Deployment]
#########################
(counter=1;MAX=1001;time while [ $counter -le $MAX ]; do echo "";echo "##########  $counter ##########" ; echo "Entropy = `cat /proc/sys/kernel/random/entropy_avail`  out of 4096"; echo "" ; time dd if=/dev/random bs=8 count=1 2>/dev/null | base64; counter=$(( $counter + 1 )); done;)

To deploy an entropy daemonset, we can leverage what is documented by Broadcom/Symantec in their VIP Auth Hub documentation. https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/vip-authentication-hub/2022-Oct/operating/troubleshooting/checking-entropy-level.html#concept.dita_d3303fde-e786-4fd4-b0b6-e3a28fd60a82

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  labels:
    run: haveged
  name: haveged
spec:
  selector:
    matchLabels:
      run: haveged
  template:
    metadata:
      labels:
        run: haveged
    spec:
      containers:
      - image: hortonworks/haveged:1.1.0
        name: haveged
        securityContext:
          privileged: true
      tolerations:
      - effect: NoSchedule
        operator: Exists
EOF

Patch OpenShift Workers

If the number of OpenShift Workers is greater than two (2), then you will need to patch the OpenShift Ingress controller to scale up to the number of worker nodes.

WORKERS=`oc get nodes | grep worker | wc -l`

echo ""
echo "######################################################################"
echo "# of Worker replicas in OpenShift Ingress Prior to update"
echo "oc get -n openshift-ingress-operator ingresscontroller -o yaml | grep -i replicas:"
#echo "######################################################################"
echo ""
oc patch -n openshift-ingress-operator ingresscontroller/default --patch "{\"spec\":{\"replicas\": ${WORKERS}}}" --type=merge

Let’s Encrypt Certs for OpenShift Ingress and API Server

The certs with OpenShift are self-signed. This is not an issue until you attempt to access the local OpenShift console with a browser and are stopped by newer security enforcement in the browsers. To avoid this challenge, we recommend switching the certs to Let’s Encrypt. There are many examples of how to rotate the certs; we used the link below. https://docs.openshift.com/container-platform/4.12/security/certificates/replacing-default-ingress-certificate.html
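The script below assumes a few variables are already exported; representative values, using certbot’s default live path (your domain and paths will differ):

export DOMAIN=aks.iam.anapartner.dev
export LE_API=api.${DOMAIN}
export CHAINFILE=/etc/letsencrypt/live/${DOMAIN}/fullchain.pem
export KEYFILE=/etc/letsencrypt/live/${DOMAIN}/privkey.pem
export DATE=$(date +%Y%m%d)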

echo "Installing ConfigMap for the Default Ingress Controllers"
oc delete configmap letsencrypt-fullchain-ca -n  openshift-config &>/dev/null
oc create configmap letsencrypt-fullchain-ca \
     --from-file=ca-bundle.crt=${CHAINFILE} \
     -n openshift-config

oc patch proxy/cluster \
     --type=merge \
     --patch='{"spec":{"trustedCA":{"name":"letsencrypt-fullchain-ca"}}}'

echo "Installing Certificates for the Default Ingress Controllers"
oc delete secret letsencrypt-certs -n openshift-ingress &>/dev/null
oc create secret tls letsencrypt-certs \
  --cert=${CHAINFILE} \
  --key=${KEYFILE} \
  -n openshift-ingress


echo "Backup prior version of ingresscontroller"
oc get ingresscontroller default -n openshift-ingress-operator -o yaml > /tmp/ingresscontroller.$DATE.yaml
oc patch ingresscontroller.operator default -n openshift-ingress-operator --type=merge --patch='{"spec": { "defaultCertificate": { "name": "letsencrypt-certs" }}}'


echo "Installing Certificates for the API Endpoint"
oc delete secret letsencrypt-certs  -n openshift-config  &>/dev/null
oc create secret tls letsencrypt-certs \
  --cert=${CHAINFILE} \
  --key=${KEYFILE} \
  -n openshift-config

echo "Backup prior version of apiserver"
oc get apiserver cluster -o yaml > /tmp/apiserver_cluster.$DATE.yaml
oc patch apiserver cluster --type merge --patch="{\"spec\": {\"servingCerts\": {\"namedCertificates\": [ { \"names\": [  \"$LE_API\"  ], \"servingCertificate\": {\"name\": \"letsencrypt-certs\" }}]}}}"

echo "#####################################################################################"
echo "true | openssl s_client -connect api.${DOMAIN}:443 --showcerts --servername api.${DOMAIN}"
echo ""


echo "It may take 5-10 minutes for the OpenShift Ingress/API Pods to cycle with the new certs"
echo "You may monitor with:  watch -n 2 'oc get pod -A | grep -i -v -e running -e complete'  "
echo ""
echo "Per Openshift documentation use the below command to monitor the state of the API server"
echo "ensure PROGRESSING column states False as the status before continuing with deployment"
echo ""
echo "oc get clusteroperators kube-apiserver "

Please reach out if you wish to learn more or have ANA assist with Kubernetes / OpenShift opportunities.

Kubernetes and VMware Workstation

Kubernetes was designed for the deployment of applications to cloud architecture with containers. Another way of thinking about Kubernetes: it gets us out of the “install-binaries” business and focuses our efforts on the business value of a solution. We have documented the process we use to train our resources and partners. This process will help your team excel and gain confidence with cloud technologies.

One of the business challenges of Kubernetes in the cloud architecture is the ongoing cost ($300-$600/month per resource) during the learning or development process. To lower this ongoing cost per resource, we focused on a method to use on-prem Kubernetes deployments.


We found examples online of using minikube and Oracle VirtualBox to keep costs low for an on-prem deployment, but did not find many examples using VMware Workstation to our satisfaction. Our goal was to utilize a solution that we are very familiar with and that has supporting capabilities for rollback via snapshots.

We have used VMware Workstation for many years while working on service projects. We cannot overstate its usefulness in offering a “playground” and development environment independent of a client’s environment. The snapshot feature allows for negative use-case testing or “what-if” scenarios that destroy or impact solutions being tested, with minimal impact.

In this entry, we will discuss the use of VMware Workstation and CentOS (or Ubuntu) as the primary Kubernetes nodes. Both CentOS and Ubuntu are used by the cloud providers as their Kubernetes nodes, so this on-prem process will translate well.

Some of our team members run the Kubernetes environment from their laptop, a collection of individual servers, or a larger server that may scale to the number of vCPU/RAM required for the Kubernetes solution.

Decision 1: Choose an OS to be used.

Either CentOS or Ubuntu is acceptable for on-prem use. When we checked the OSes used by the cloud providers, we noted they used one of these two (2) for their Linux OS. We decided on CentOS 7, as iptables are used for routing within Kubernetes, and iptables are the default in CentOS 7. You may find that other OSes work fine as well.

Decision 2: Build a reference image

Identify all expected binaries to be used within this image. This reference image will be cloned for the Kubernetes control plane node (1) and the worker nodes (3-4). We will also use this image to build a supporting node (non-Kubernetes) for SiteMinder integration and a docker repository for the Kubernetes docker images, for a total of six (6) nodes.

Decision 3: DNS and Certificates

Recommendation: Please do not attempt to deploy a Kubernetes solution on-prem without having purchased a DNS domain/site and wildcard certificates tied to the DNS domain.

Without these two (2) supporting components, it is a challenge to have a working Kubernetes solution that reflects what you will experience in a cloud deployment.

For example, we purchased a domain for $12/year and then created several “A” records to host the IP addresses we may use to redirect to cloud or on-prem. Using sub-domain “A” records, we can have as many cloud addresses as we wish.

DNS "A" Records Example:    
aks.iam.anapartner.net (MS Azure),  
eks.iam.anapartner.net (Amazon),  
gke.iam.anapartner.net (Google).      

DNS "CNAME" Records Example:  
alertmanager.aks.iam.anapartner.net, 
grafana.aks.iam.anapartner.net, 
jaeger.aks.iam.anapartner.net,
kibana.aks.iam.anapartner.net, 
mgmt-ssp.aks.iam.anapartner.net, 
sm.aks.iam.anapartner.net, 
ssp.aks.iam.anapartner.net.       
Example of using the Synology DNS Server for the Kubernetes cluster’s applications, with “A” and “CNAME” records.

Finally, we prefer to use wildcard certificates for these domains to avoid challenges within our Kubernetes deployment. There are several services out there offering free certificates.

We chose Let’s Encrypt (https://letsencrypt.org/). While Let’s Encrypt has automated processes to renew its certs, we chose to use its DNS validation process with a CertBot solution. We can renew these certificates every 90 days for on-prem usage. The DNS validation process requires a unique string generated by the Let’s Encrypt process to be populated in a DNS “TXT” record, like so: _acme-challenge.aks.iam.anapartner.net. See the example at the bottom of this blog entry on this process.
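During the DNS validation step, you can confirm the TXT record has propagated before allowing certbot to continue:

dig +short TXT _acme-challenge.aks.iam.anapartner.net
# Expect the unique string certbot displayed; continue in certbot only after it resolves.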

Decision 4: Supporting Components: Storage, Load-Balancing, DNS Resolution (Local)

The last decision required for on-prem deployment is where you will place persistent storage for your Kubernetes cluster. We chose to use an NFS share.

We first tested using the control-plane node, then decided to move the NFS share to a Synology NAS solution. Similarly for the DNS resolution option: at first we used a DNS service on the control-plane node and then moved it to the Synology NAS solution.

For load-balancing, Kubernetes offers the service types NodePort and LoadBalancer. A LoadBalancer service, if not deployed in the cloud, will default to NodePort behavior. To introduce load balancing on-prem, we introduced the HAProxy service on the control-plane node, along with Kubernetes NodePort services, to meet this goal.

After the decisions have been made, we can now walk through the steps to set up a Vmware environment for Kubernetes.

Reference Image

Step 1: Download the OS DVD ISO image for deployment on VMware Workstation (CentOS 7 / Ubuntu).

Determine the specs for the future solution to be deployed on Kubernetes. Some solutions have pods that may require minimal memory/disk space. For the solution we decided to deploy, we confirmed that we need a minimum of 16 GB RAM and 4 vCPU; we had confirmed these specs by previously deploying the solution in a cloud environment.

Without these memory/cpu specs, the solution that we chose would pause the deployment of Kubernetes pods to the nodes. You may or may not see error messages in the deployment of pods stating that the nodes did not have enough resources for all or some of the pods.

For disk size, we selected 100 GB to future-proof the solution during testing. For networking, please select BRIDGED mode to allow the VMware images to have minimal network issues when routing within your local network. Please avoid double-NAT’ing the deployment to reduce your headaches.

Step 2: Install useful base packages and disable any UI tools. Please install an entropy daemon to avoid delays due to certificate processes’ use of /dev/random and low entropy.

### UI Update for CentOS7 was stopping yum deployment - not required for our solution to be tested (e.g. VIP Auth Hub)
# su to root to run the below commands.   We will add sudo access later.

su - 
systemctl disable packagekit; systemctl stop packagekit; systemctl status packagekit

### Installed base useful packages.

yum -y install dnf epel-release yum-utils nfs-utils 

### Install useful 2nd tools.

yum -y install openldap-clients jq python3-pip tree

pip3 install yq
yum -y upgrade


### Install Entropy process (epel repo)

dnf -y install haveged
systemctl enable haveged --now

Step 3: Install docker and update the docker configuration for use with Kubernetes. Update the path & storage-driver for the docker images for initial deployment.

Ref: https://docs.docker.com/storage/storagedriver/overlayfs-driver/

### Install Docker repo & docker package

yum-config-manager --add-repo  https://download.docker.com/linux/centos/docker-ce.repo
dnf -y install docker-ce
docker version
systemctl enable docker --now
docker version

### Update docker image info after deployment and restart service

cat << EOF > /etc/docker/daemon.json
{
"debug": false,
"data-root": "/home/docker-images",
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF

### Restart docker to load updated image info.
systemctl restart docker; systemctl status docker; docker version

Step 4: Deploy the three (3) primary Kubernetes binaries (kubelet, kubeadm, kubectl) & the Helm binary.

Ensure you select a Kubernetes version that matches what solution you wish to deploy and work with. This can be a gotcha if the Kubernetes binaries update during a dnf / yum upgrade process and your solution has not been vetted for the newer release of Kubernetes. See the reference link below on how to upgrade Kubernetes binaries.

Ref: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

### Add k8s repo

cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF

### When upgrading the OS, be sure to use the correct version of kubernetes (remove and add) - Example to force version 1.20.11 ###

dnf upgrade -y
dnf remove -y kubelet kubeadm kubectl
dnf install -y kubelet-1.20.11-0.x86_64 kubeadm-1.20.11-0.x86_64 kubectl-1.20.11-0.x86_64 --disableexcludes=kubernetes


### Start the k8s process.

systemctl enable kubelet --now;  systemctl status kubelet
systemctl daemon-reload && systemctl enable kubelet --now
yum-config-manager --save --setopt=kubernetes.skip_if_unavailable=true

### Add HELM binary 

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

Step 5: OS configurations required or useful for Kubernetes. The Kubernetes kubelet binary requires SWAP to be disabled.

Ref: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

### Stop FirewallD - May add ports later for security

systemctl stop firewalld;systemctl disable firewalld; iptables -F

### Update OS Parameters for kubernetes

setenforce 0
sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
modprobe br_netfilter

cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

### Note:  IP forwarding is enabled by default.

sysctl -a | grep -i forward

### Note: Update /etc/fstab to comment out the swap line with a # character
### Warning:  kubeadm init will fail if swap is left on the CP or any worker node.

swapoff -a
sed -i 's|UUID\=\(.*\)-\(.*\)-\(.*\)-\(.*\)-\(.*\) swap|#UUID\=\1-\2-\3-\4-\5 swap|g' /etc/fstab
cat /etc/fstab

Step 6: Create an SSH key for root or other service IDs to allow remote script updates from the CP to Worker Nodes

### Create SSH key for root to allow remote script updates from CP to Worker Nodes - Enter a Blank/Null PASSWORD.

su - 
rm -rf ~/.ssh; echo y | ssh-keygen -b 4096  -C $USER -f ~/.ssh/id_rsa

### Copy the public rsa key to authorized keys to avoid password between cp/worker nodes for remote ssh commands.

cp -r -p ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys;chmod 600 ~/.ssh/authorized_keys;ls -lart .ssh

### Test for remote connection with no password:   
  
ssh -i ~/.ssh/id_rsa  root@localhost    

### Copy the id_rsa key to your host system for ease of testing.

### Add your local non-root user to sudo wheel group.  Change vip to your user ID.

LOCALUSER=vip
gpasswd -a $LOCALUSER wheel

### Update sudoers file to allow wheel group with no-password

sed -i 's|# %wheel|%wheel|g' /etc/sudoers

###  View update wheel group.

grep "%wheel" /etc/sudoers

# Example of return query.
# %wheel  ALL=(ALL)       ALL
# %wheel  ALL=(ALL)       NOPASSWD: ALL

Step 7: Stop or adjust the OS network manager, shut down the reference image, and create a VMware snapshot

### Adjust or Disable the OS NetworkManager (to avoid overwriting /etc/resolv.conf)
### Important when using an internal DNS server.

systemctl disable NetworkManager;systemctl stop NetworkManager

### reboot CentOS7 Image and validate no issues upon reboot.
reboot

### Shutdown image and manually create snapshot called  "base"

Vmware Workstation Cloning

Step 8: Now that we have a reference image, we can make clone images for the control-plane (1), the worker nodes (4), and the supporting node (1). This is a fairly quick process.

export BASE=/home/me/vmware/kub
export REF=/home/me/vmware/kub/CentOS7/CentOS7.vmx

VM=cp;mkdir       -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker01;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker02;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker03;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=worker04;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full
VM=sm;mkdir -p $BASE/$VM; time vmrun -T ws clone $REF $BASE/$VM/$VM.vmx -cloneName=$VM -snapshot=base full

Step 9: Start the clone images and remotely assign new hostnames/IP addresses to the images

# Start cloned images for CP and Worker Nodes - Update any files as needed. 
 
export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01

### Start the cloned images for CP and Worker Nodes.

VM=cp;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker01;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker02;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker03;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=worker04;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
VM=sm;vmrun -T ws start $BASE/$VM/$VM.vmx nogui
vmrun -T ws list | sort -rn


### Update Hostnames for CP and Worker Nodes with Domain.

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash "hostnamectl set-hostname $VM.$DOMAIN" -noWait


### Update IP Address and Domain for NIC (ifcfg-ens33)

export CP=192.168.2.60
export WK1=192.168.2.61
export WK2=192.168.2.62
export WK3=192.168.2.63
export WK4=192.168.2.64
export SM=192.168.2.65

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$CP\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK1\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK2\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK3\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$WK4\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|TYPE=\"Ethernet\"|TYPE=\"Ethernet\"\nIPADDR=$SM\nDOMAIN=$DOMAIN|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait

Step 10: Enable the network gateway, disable DHCP, and reboot the images

export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01

### Update to create a new default GATEWAY HOST to address routing issues to external IP addresses.
GATEWAY=192.168.2.1

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|# Created by anaconda|# Created by anaconda\nGATEWAY=$GATEWAY|g' /etc/sysconfig/network" -noWait

### Disable DHCP (to avoid overwriting /etc/resolv.conf)

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "sed -i 's|BOOTPROTO=\"dhcp\"|BOOTPROTO=\"none\"|g'   /etc/sysconfig/network-scripts/ifcfg-ens33" -noWait

 
### Reboot VIP Auth Hub CP and Nodes 

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait

Step 11: Update DNS on the clone images remotely using vmrun

### Update /etc/resolv.conf for correct DNS server.
### Ensure DHCP and Network Manager are disabled to prevent these services from overwriting /etc/resolv.conf.

export DOMAIN=aks.iam.anapartner.net
export PASSWORD_VM=Password01
DNSNEW=192.168.2.20

VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "echo 'nameserver $DNSNEW' >>  /etc/resolv.conf" -noWait
 
 
### Reboot VIP Auth Hub CP and Nodes
 
VM=cp;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker01;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker02;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker03;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=worker04;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait
VM=sm;vmrun -T ws -gu root -gp $PASSWORD_VM runScriptInGuest $BASE/$VM/$VM.vmx  /bin/bash  "reboot" -noWait

Step 12: Copy the root .ssh private key (id_rsa) to your main host, rename it to a useful name, and then test your newly deployed clone images for DNS resolution using ssh. Please confirm this step is successful prior to continuing with the configuration of the control plane and worker nodes.

### Copy the root id_rsa file to host system to allow ease of testing with ssh.

export CP=192.168.2.60
export WK1=192.168.2.61
export WK2=192.168.2.62
export WK3=192.168.2.63
export WK4=192.168.2.64
export SM=192.168.2.65

### Add the hosts for ssh pre-validation. 

ssh-keyscan -p 22 $CP >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK1 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK2 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK3 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $WK4 >> ~/.ssh/known_hosts
ssh-keyscan -p 22 $SM >> ~/.ssh/known_hosts


### Rename from id_rsa to vip_kub_root_id_rsa

ssh -tt -i ~/vip_kub_root_id_rsa root@$CP 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK1 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK2 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK3 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK4 'cat /etc/resolv.conf'
ssh -tt -i ~/vip_kub_root_id_rsa root@$SM 'cat /etc/resolv.conf'


### Validate Access with ssh to CP and Worker Nodes new IP addresses.

FQDN=ssp.aks.iam.anapartner.net
ssh -tt -i ~/vip_kub_root_id_rsa root@$CP  "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK1 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK2 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK3 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$WK4 "ping -c 2 $FQDN"
ssh -tt -i ~/vip_kub_root_id_rsa root@$SM "ping -c 2 $FQDN"

Update CP (controlplane) Node

Step 13a: Copy files to the CP node from the VMware Workstation host and configure the CP node for dedicated CP usage. Recommend using two terminals/sessions to speed up the process. Install HAproxy for load balancing, copy the Let’s Encrypt wildcard certificates, and copy the Kubernetes solution you will be deploying (scripts/yaml).

### Open Terminal 1 to CP host.
### Add bash completion to have better use of TAB to view parameters.

CP=192.168.2.60
ssh -tt -i ~/vip_kub_root_id_rsa root@$CP
dnf -y install bash-completion
echo 'export KUBECONFIG=/etc/kubernetes/admin.conf'  >>~/.bashrc
kubectl completion bash >/etc/bash_completion.d/kubectl
echo "alias k=kubectl | complete -F __start_kubectl k" >>~/.bashrc

### Install HAProxy and replace the haproxy.cfg file.
dnf -y install haproxy
systemctl enable haproxy --now
netstat -anp | grep -i -e haproxy

### Open Terminal 2 to host and push files to CP node.
### Copy HAProxy configuration, certs, and scripts
scp -i ~/vip_kub_root_id_rsa  haproxy.cfg   root@$CP:/etc/haproxy/haproxy.cfg
scp -i ~/vip_kub_root_id_rsa  cloud-certs-aks-eks-gke_exp-202X-01-12.tar  root@$CP:
scp -i ~/vip_kub_root_id_rsa  202X-11-03_vip_auth_hub_working_centos7_v2.tar   root@$CP:

### On Terminal 1 - on CP host - Restart to use new haproxy configuration file.
systemctl restart haproxy
netstat -anp | grep -i -e haproxy

### Extract CERTS to root home folder
tar -xvf cloud-certs-aks-eks-gke_exp-202X-01-12.tar

### Extract Working Scripts 
tar -xvf 202X-11-03_vip_auth_hub_working_centos7_v2.tar

### Update env variables for unique environment within step00 file.
vi step00_kubernetes_env.sh

### Add the env variables to the .bashrc file
echo ". ./step00_kubernetes_env.sh"

Step 13b: Example of the /etc/haproxy/haproxy.cfg configuration for Kubernetes load-balancing functionality across the on-prem worker nodes. HAproxy is deployed on the control plane (CP) node. The example configuration file routes TCP 80/443/389 to one (1) of the four (4) worker nodes. If a Kubernetes NodePort service is enabled for TCP 389 (NodePort 31888), then this load balancer will route the LDAP traffic correctly as well.

[root@cp ~]# cat /etc/haproxy/haproxy.cfg
global
    user haproxy
    group haproxy
    chroot /var/lib/haproxy
    log /dev/log    local0
    log /dev/log    local1 notice
defaults
    mode http
    log global
    retries 2
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 10m
    timeout server 10m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
frontend ingress
    bind *:80
    option tcplog
    mode http
    option forwardfor
    option http-server-close
    default_backend kubernetes-ingress-nodes
backend kubernetes-ingress-nodes
    mode http
    balance roundrobin
    server k8s-ingress-0 worker01.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-1 worker02.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-2 worker03.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-3 worker04.aks.iam.anapartner.net:80 check fall 3 rise 2 send-proxy-v2
frontend ingress-https
    bind *:443
    option tcplog
    mode tcp
    default_backend kubernetes-ingress-nodes-https
backend kubernetes-ingress-nodes-https
    mode tcp
    balance roundrobin
    server k8s-ingress-0 worker01.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-1 worker02.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-2 worker03.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
    server k8s-ingress-3 worker04.aks.iam.anapartner.net:443 check fall 3 rise 2 send-proxy-v2
frontend ldap
    bind *:389
    option tcplog
    mode tcp
    default_backend kubernetes-nodes-ldap
backend kubernetes-nodes-ldap
    mode tcp
    balance roundrobin
    server k8s-ldap-0 worker01.aks.iam.anapartner.net:31888  check fall 3 rise 2
    server k8s-ldap-1 worker02.aks.iam.anapartner.net:31888  check fall 3 rise 2
    server k8s-ldap-2 worker03.aks.iam.anapartner.net:31888  check fall 3 rise 2
    server k8s-ldap-3 worker04.aks.iam.anapartner.net:31888  check fall 3 rise 2

Deploy Solution on Kubernetes

Step 14: Validate that DNS and storage are ready before deploying any solution, or if you wish to have a base Kubernetes environment to use with the control plane and four (4) worker nodes.

### Step:  Setup NFS Share either on-prem remote server or Synology NFS
### Use version 4.x checkbox for Synology.

### Example of lines on remote Linux Host with NFS share.

yum -y install nfs-utils
systemctl enable --now nfs-server rpcbind
mkdir -p /export/nfsshare ; chown nobody /export/nfsshare ; chmod -R 777 /export/nfsshare
echo "/export/nfsshare *(rw,sync,no_root_squash,insecure)" >> /etc/exports
exportfs -rav; exportfs -v

firewall-cmd --add-service=nfs --permanent
firewall-cmd --add-service={nfs3,mountd,rpc-bind} --permanent 
firewall-cmd --reload 
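
### Optional sketch: validate the NFS export from a worker node before continuing.
### (The server IP and export path are illustrative; match them to your NFS host.)
mount -v -t nfs 192.168.2.30:/export/nfsshare /mnt && df -h /mnt && umount /mnt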



#### Setup DNS entries (A, NS, and CNAME) for the following items (may be on-prem DNS or Synology DNS)

ns.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.60)
aks.iam.anapartner.net  NS ns.aks.iam.anapartner.net
cp.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.60)
worker01.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.61)
worker02.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.62)
worker03.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.63)
worker04.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.64)
sm.aks.iam.anapartner.net  A  IP_ADDRESS (192.168.2.65)
kibana CNAME cp.aks.iam.anapartner.net 
grafana CNAME cp.aks.iam.anapartner.net 
jaeger CNAME cp.aks.iam.anapartner.net 
alertmanager CNAME cp.aks.iam.anapartner.net 
ssp CNAME cp.aks.iam.anapartner.net 
ssp-mgmt CNAME cp.aks.iam.anapartner.net 

### Pre-Step:  Enable DNS resolution for external IP addresses
### Enable forwarding to external h/w router and 8.8.8.8
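
A quick validation sketch for the DNS entries above, assuming the DNS server from Step 11 (192.168.2.20):

for h in ns cp worker01 worker02 worker03 worker04 sm kibana grafana jaeger alertmanager ssp ssp-mgmt; do
  nslookup $h.aks.iam.anapartner.net 192.168.2.20
done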

Step 15: Recommendation: Deploy your solution in steps using Kubernetes yaml or Helm charts to assist with debugging any deployment issues. Do not forget to use kubectl logs and kubectl describe to isolate startup or cert issues; see the probes after the script list below.

### Run scripts one-by-one.  They will have a watch command in each that will 
### provide feedback on the startup processes.
### Total startup from scratch to final with VIP Sample App is about 15-20 minutes.
### Note:  Step04 has different chart variables for on-prem for Symantec Directory.
### Note:  ./step00_kubernetes_env.sh is called by each script.


./step01_kubernetes_cluster_init_with_worker_nodes.sh
./step02_kubernetes_cluster_with_ingress_and_other_charts.sh
./step03_kubernetes_cluster_with_vip_auth_hub_charts.sh
./step04_kubernetes_cluster_with_vip_auth_hub_sample_app.sh
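
If a step stalls, a few generic kubectl probes help isolate startup or cert issues (pod/namespace names are placeholders):

kubectl get pods -A -o wide
kubectl get events -A --sort-by='.lastTimestamp'
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous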

Docker Registry for On-Prem

There are two (2) types of docker registries we have found useful.

a. The standard Mirror method will capture all docker images from “docker.io” site to a local mirror. When Kubernetes or Helm deployments are used, the docker configuration file can be adjusted to check the local mirror without updating Kubernetes yaml files or Helm charts.

b. The second method is a full capture of all images after they have been deployed once, followed by a docker push of each image into a local registry. The challenge of the second method is that the Kubernetes yaml files and/or Helm charts do have to be updated to use this local registry.

Either method will help lower the bandwidth cost of re-downloading the same docker images, provided you use a docker prune process to keep your worker nodes’ disk usage “clean”. If the docker prune process is not used, you may notice that the worker nodes run out of disk space due to temporary docker images/containers that were not cleaned up properly.
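
A minimal prune sketch (the schedule and filters are site choices, not part of the original scripts) that removes stopped containers, images unused for 24 hours, and dangling volumes on a worker node:

docker system prune -af --filter "until=24h"
docker volume prune -f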

#!/bin/bash
#################################################################################
#  Create a local docker mirror registry for docker-io
#  and local docker non-mirror registry for all other images
#  to minimize download impact
#  during restart of the kubernetes solution
#
#  All registry images will be placed on NFS share
#  mount -v -t nfs 192.168.2.30:/volume1/nfs /mnt  &>/dev/null
#
# Certs will be provided by Let's Encrypt every 90 days
#
#  For docker-io mirror registry, all clients must have the following line in
#  /etc/docker/daemon.json     {Note:  Use commas as needed}
#
#    "registry-mirrors":
#     [
#      "https://sm.aks.iam.anapartner.net:444"
#     ],
#
#
#
# ANA 11/2021
#
#################################################################################
# To remove all containers - to allow restart of process
docker rm -f `docker ps -a | grep -v -e CONTAINER | awk '{print $1}'` ; docker image rm `docker image ls | grep -v -e REPOSITORY | grep -e minutes -e hour -e days -e '2 weeks'|  awk '{print $3}'` &>/dev/null


#################################################################################
# Update HOST name for local server for docker image
HOST=sm.aks.iam.anapartner.net
NFS_SERVER=192.168.2.30
NFS_SHARE=/volume1/nfs


#################################################################################
function start_registry {

    local_port=$1
    remote_registry_name=$2

    if [ "$3" == "" ]; then
        remote_registry_url=$remote_registry_name
    else
        remote_registry_url=$3
    fi

    echo -e "$local_port $remote_registry_name $remote_registry_url"


mount -v -t nfs $NFS_SERVER:$NFS_SHARE /mnt  &>/dev/null
mkdir -p /mnt/registry/${remote_registry_name}  &>/dev/null

docker run -d --name registry-${remote_registry_name}-mirror  \
-p $local_port:443 \
--restart=always \
-e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
-e REGISTRY_PROXY_REMOTEURL="https://${remote_registry_url}/" \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/fullchain.pem \
-e REGISTRY_HTTP_TLS_KEY=/certs/privkey.pem \
-e REGISTRY_COMPATIBILITY_SCHEMA1_ENABLED=true \
-v /mnt/registry/certs:/certs \
-v /mnt/registry/${remote_registry_name}:/var/lib/registry \
registry:latest

sleep 1
echo "#################################################################################"
curl -s -X GET  https://$HOST:$local_port/v2/_catalog | jq
echo "#################################################################################"

}

#################################################################################
# start_registry <local_port>    <remote_registry_name>  <remote_registry_url>
#################################################################################

start_registry   444             docker-io               registry-1.docker.io

#################################################################################
# Non-Proxy configuration to allow 'docker tag & docker push' for all other images
#################################################################################

remote_registry_name=all
local_port=455
mkdir -p /mnt/registry/${remote_registry_name}  &>/dev/null
docker run -d --name registry-${remote_registry_name}-mirror  \
-p $local_port:443 \
--restart=always \
-e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/fullchain.pem \
-e REGISTRY_HTTP_TLS_KEY=/certs/privkey.pem \
-e REGISTRY_COMPATIBILITY_SCHEMA1_ENABLED=true \
-v /mnt/registry/certs:/certs \
-v /mnt/registry/${remote_registry_name}:/var/lib/registry \
registry:latest

sleep 1
echo "#################################################################################"
curl -s -X GET  https://$HOST:$local_port/v2/_catalog | jq
echo "#################################################################################"
docker ps -a
echo "#################################################################################"

echo "##### To tail the log of the docker-io container - useful for monitoring helm deployments  #####"
echo "docker logs `docker ps -a  --no-trunc | grep -v NAMES | grep 'docker-io' | awk '{print $1}'` -f "
echo "#################################################################################"
echo "##### To tail the log of the ALL container - useful for monitoring helm deployments  #####"
echo "docker logs `docker ps -a  --no-trunc | grep -v NAMES | grep 'all' | awk '{print $1}'` -f  "
echo "#################################################################################"
echo "##### Location of Registry Files on NFS share #####"
echo "ls -lart /mnt/registry/docker-io/docker/registry/v2/repositories"
echo "ls -lart /mnt/registry/all/docker/registry/v2/repositories"
echo "#################################################################################"

Example of the /etc/docker/daemon.json configuration file to use a local mirror for docker.io. See the parameter of “registry-mirrors”. Unfortunately, we were unable to use this process for the other docker registries.

{
  "debug": false,
  "data-root": "/home/docker-images",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2",
  "registry-mirrors": [
    "https://sm.aks.iam.anapartner.net:444"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}

Let’s Encrypt Certbot and DNS validation

Use Let’s Encrypt Certbot and manual DNS validation to create our 90-day wildcard certificates. Manual DNS validation allows us to avoid setting up a public-facing component for our internal labs.

Ref: https://letsencrypt.org/docs/challenge-types/

# Step 1:  Install SNAP service for Certbot usage on your host OS

cat /etc/redhat-release
Red Hat Enterprise Linux release 8.3 (Ootpa)

sudo yum install -y  snapd
Updating Subscription Management repositories.
Package snapd-2.49-2.el8.x86_64 is already installed.

systemctl enable --now snapd.socket

### Wait 1 min

snap install core; sudo snap refresh core



# Step 2: Remove prior certbot (if installed by yum/dnf)

yum remove -y certbot


# Step 3:  Install new "classic" Certbot

sudo snap install --classic certbot
certbot 1.17.0 from Certbot Project (certbot-eff✓) installed

sudo ln -s /snap/bin/certbot /usr/bin/certbot



# Step 4: Issue certbot command with wildcard cert & update your DNS TXT record with the string provided.


sudo certbot certonly --manual  --preferred-challenges dns -d '*.aks.iam.anapartner.org' --register-unsafely-without-email

Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please read the Terms of Service at
https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf. You must
agree in order to register with the ACME server. Do you agree?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: Y
Account registered.
Requesting a certificate for *.aks.iam.anapartner.org

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please deploy a DNS TXT record under the name:

_acme-challenge.aks.iam.anapartner.org.

with the following value:

u2cXXXXXXXXXXXXXXXXXXXXc

Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with aid of online tools, such as the Google
Admin Toolbox: https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.aks.iam.anapartner.org.
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# Step 5:  In a 2nd terminal, validate that the DNS record has been updated and can be seen by a standard DNS query.  Keep this 2nd console window open to test the DNS record prior to pressing the <ENTER> key on the verification request.

# Example:
nslookup -type=txt _acme-challenge.aks.iam.anapartner.org
Non-authoritative answer:
_acme-challenge.aks.iam.anapartner.org  text = "u2cXXXXXXXXXXXXXXXXXXXXc"


# Step 6:  Press <ENTER> after you have validated the TXT record.

Press Enter to Continue
Waiting for verification...
Cleaning up challenges
Subscribe to the EFF mailing list (email: nala@baugher.us).

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/aks.iam.anapartner.org/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/aks.iam.anapartner.org/privkey.pem
  


# Step 7: View certs of fullchain.pem & privkey.pem  

cat /etc/letsencrypt/live/aks.iam.anapartner.org/fullchain.pem
-----BEGIN CERTIFICATE-----

<REMOVED>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<REMOVED>
-----END CERTIFICATE-----

cat /etc/letsencrypt/live/aks.iam.anapartner.org/privkey.pem
-----BEGIN PRIVATE KEY-----

<REMOVED>
-----END PRIVATE KEY-----




# Step 8:  Use the two files for your kubernetes solution 

# Step 9:  Ensure the domain in /etc/resolv.conf on the host OS, CP, and worker nodes is set to aks.iam.anapartner.org so the cert hostnames resolve correctly.

# Step 10:  Ensure the Synology NAS DNS service is configured with all aliases


# Step 11:  Optional: Validate certs with openssl


# Show the kubernetes self-signed cert

true | openssl s_client -connect kibana.aks.iam.anapartner.org:443 2>/dev/null | openssl x509 -inform pem -noout -text

# Show the new wildcard cert for same hostname &  port

curl -vvI  https://kibana.aks.iam.anapartner.org/app/home#/

curl -vvI  https://kibana.aks.iam.anapartner.org/app/home#/   2>&1 | awk 'BEGIN { cert=0 } /^\* SSL connection/ { cert=1 } /^\*/ { if (cert) print }'

nmap -p 443 --script ssl-cert kibana.aks.iam.anapartner.org


Kubernetes Side Note:   Let's Encrypt certs do NOT show up within the Kubernetes cluster certs check process.

kubeadm certs check-expiration

View of the DNS TXT records to be updated with your DNS service provider. The Let’s Encrypt Certbot will need to be able to query these records for it to assign you wildcard certificates. Create the _acme-challenge hostname entry as a TXT type, and paste in the string provided by the Let’s Encrypt Certbot process. Wait 5 minutes or test the TXT record with nslookup, then upon positive validation, continue the Let’s Encrypt Certbot process.

View your kubernetes cluster / nodes for any constraints

After your cluster is created and you have worker nodes joined to the cluster, you may wish to monitor for any constraints of your on-prem deployment. The kubectl command with the action verb describe or top is very useful for this goal.

kubectl describe nodes worker01
kubectl top node / kubectl top pod

Kubernetes Training (Formal)

If you are new to Kubernetes, we recommend the following class. You may need to dedicate 4-8 weeks to complete the course and then take the CKA exam via the Linux Foundation.

https://www.udemy.com/course/certified-kubernetes-administrator-with-practice-tests/

Kubernetes.io site has most of the information you need to get started.

https://kubernetes.io/docs/reference/kubectl/cheatsheet/

Global Password Reset

The recent DNS challenges for a large organization that impacted their worldwide customers bring to mind a project we completed this year, a global password reset redundancy solution.

We worked with a client who desired to manage unplanned WAN outages to their five (5) data centers for three (3) independent MS Active Directory Domains with integration to various on-prem applications/endpoints. The business requirement was for self-service password sync, where the users’ password change process is initiated/managed by the two (2) different MS Active Directory Password Policies.

Without the WAN outage requirement, any IAM/IAG solution may manage this request within a single data center. A reverse password sync agent process is enabled on all writable MS Active Directory domain controllers (DC). All the world-wide MS ADS domain controllers would communicate to the single data center to validate and resend this password change to all of the users’ managed endpoint/application accounts, e.g. SAP, Mainframe (ACF2/RACF/TSS), AS/400, Unix, SaaS, Database, LDAP, Certs, etc.

With the WAN outage requirement, however, a queue or components must be deployed/enabled at each global data center, so that password changes are allowed to sync locally to avoid work stoppage, and are async-queued to avoid out-of-sync passwords on the endpoints/applications that may be in other data centers.

We were able to work with the client to determine that their current IAM/IAG solution would have the means to meet this requirement, but we wished to confirm no issues with WAN latency and the async process. The WAN latency was measured at less than 300 msec between remote data centers that were globally opposite. The measured WAN latency reflects the global distance and any intermediate devices that the network traffic may pass through.
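
A simple way to sample this latency between data centers (hostname is illustrative):

ping -c 20 vapp-apac1.example.local | tail -2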

To review the solution’s ability to meet the latency requirements, we introduced a test environment to emulate the global latency for deployment use-cases, change-password use-cases, and standard CrUD use-cases. There is a feature within VMware Workstation that allows emulation of degraded network traffic. This process was a very useful planning/validation tool to lower rollback risk during production deployment.

VMWare Workstation Network Adapter Advance Settings for WAN latency emulation

The solution used for the Global Password Reset was Symantec Identity Suite Virtual Appliance r14.3cp2. This solution has many tiers, where select components may be globally deployed and others may not.

We avoided any changes to the J2EE tier (Wildfly) or Database for our architecture, as these components are not supported for WAN latency by the vendor. Note: We have worked with other clients that have deployments at two (2) remote data centers within 1000 km, who have reported minimal challenges for these tiers.

We focused our efforts on the Provisioning Tier and Connector Tier. The Provisioning Tier consists of the Provisioning Server and Provisioning Directory.

The Provisioning Server has no shared knowledge with other Provisioning Servers. The Provisioning Directory (Symantec Directory) is where the provisioning data may be set up in a multi-write peer model. Symantec Directory is a proper X.500 directory with high redundancy and is designed to manage WAN latency between remote data centers and recovery after an outage. See example provided below.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-replication.html

The Connector Tier consists of the Java Connector Server and C++ Connector Server, which may be deployed on MS Windows as an independent component. There is no shared knowledge between Connector Servers, which works in our favor.

Requirement:

Three (3) independent MS Active Directory domains in five (5) remote data centers need to allow self-service password change & allow local password sync during a WAN outage. Password changes are driven by MS ADS Password Policies (every N days). The IME Password Policy for the IAG/IAM solution is not enabled, IME authentication is redirected to an ADS domain, and the IMPS IM Callback Feature is disabled.

Below is an image that outlines the topology for five (5) global data centers in AMER, EMEA, and APAC.

The flow diagram below captures the password change use-case (self-service or delegated), the expected data flow to the user’s managed endpoints/applications, and the eventual peer sync of the MS Active Directory domain local to the user.

Observation(s):

The standalone solution of Symantec IAG/IAM has no expected challenges with configurations, but the Virtual Appliance offers pre-canned configurations that may impact a WAN deployment.

During this project, we identified three (3) challenges using the virtual appliance.

Two (2) items needed the assistance of the Broadcom Support and Engineering teams. They were able to work with us to address deployment configuration challenges with the “check_cluster_clock_sync -v” process, which incorrectly increments time delays between servers instead of resetting the value to zero between each server test.

Why is this important? The “check_cluster_clock_sync” alias is used during auto-deployment of vApp nodes. If the time reported between servers is > 15 seconds, then replication may fail. This time-check issue was addressed with a hotfix; after the hotfix was deployed, all clock differences were resolved.

The second challenge was a deployment issue with the IMPS component for its embedded “registry files/folders”. The prior embedded copy process was observed to be using standard “scp”. With WAN latency, the scp copy operation may take more than 30 seconds. Our testing with the Virtual Appliance showed that a simple copy would take over two (2) minutes for multiple small files. After reviewing with CA support/engineering, they provided an updated copy process using “rsync” that speeds up copy performance by >100x. Before this update, the impact was that the provisioning tier deployment would fail and a partial rollback would occur.

The last challenge we identified was using the Symantec Directory’s embedded features to manage WAN latency via multi-write HUB groups. The Virtual Appliance cannot automatically manage this feature when enabled in the knowledge files of the provisioning data DSAs. Symantec Directory will fail to start after auto-deployment.

Fortunately, on the Virtual appliance, we have full access to the ‘dsa’ service ID and can modify these knowledge files before/after deployment. Suppose we wish to roll back or add a new Provisioning Server Virtual Appliance. In that case, we must disable the multi-write HUB group configuration temporarily, e.g. comment out the configuration parameter and re-init the DATA DSAs.

Six (6) Steps for Global Password Reset Solution Deployment

We were able to refine our list of steps for deployment using pre-built knowledge files and deployment of the vApp nodes as blank slates with the base components of Provisioning Server (PS) and Provisioning Directory (PD), with a remote MS Windows server for the Connector Server (JCS/CCS).

Step 1: Update the Symantec Directory DATA DSAs’ knowledge configuration files to use the multiple-group HUB model. Note that the multi-write group configuration is enabled within the DATA DSAs’ *.dxc files. One Directory server in each data center will be defined as a “HUB”.

Ref: https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-groups-hubs/topology-sample-and-disaster-recovery.html

To assist this configuration effort, we leveraged a series of bash shell scripts that could be pasted into multiple putty/ssh sessions on each vApp to replace the “HUB” string with a “sed” command, as sketched below.
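
A hypothetical version of such a helper; the knowledge file names, hub-group name, and placeholder string are illustrative, not the client’s actual files:

su - dsa
cd $DXHOME/config/knowledge
# Replace the placeholder HUB string with this data center's hub group name
sed -i 's|HUB|hub-group-amer1|g' *.dxc
# Stop/start the DATA DSAs to load the updated knowledge files
dxserver stop all
dxserver start all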

After the HUB model is enabled (stop/start the DATA DSAs), confirm that the added WAN latency poses no challenge to the Symantec Directory sync processes. By monitoring the Symantec Directory logs during replication, we can see sync operations with the WAN latency captured as delays > 1 msec between data centers AMER1 and APAC1.

Step 2: Update IMPS configurations to avoid delays with Global Password Reset solution.

Note: For this architecture, we do not use external IME Password Policies. We ensure that each AD endpoint has the checkbox enabled for “Password synchronization agent is installed” & each Global User (GU) has the “Enable Password Synchronization Agent” checkbox enabled to prevent data looping. To ensure this GU attribute is always enabled, we updated an attribute under “Create Users Default Attributes”.

Step 3a: Update the Connector Tier (CCS Component)

Ensure that the MS Windows environment variables for the CCS connector are defined for Failover (ADS_FAILOVER) and Retry (ADS_RETRY).

Step 3b: Update the CCS DNS knowledge file of ADS DCs hostnames.

Important Note: Avoid using the refresh feature “Refresh DC List” within the IMPS GUI for the ADS Endpoint. If this feature is used, then a “merge” will be processed from the local CCS DNS file contents and what is defined within the IMPS GUI refresh process. If we wish to manage the redirection to local MS ADS Domain Controllers, we need to control this behavior. If this refresh is done, we can clean the extra entries out of the Symantec Directory. The only negative aspect is that a local password change may attempt to communicate to one of the remote MS ADS Domain Controllers that is not within the local data center. During a WAN outage, a user would notice a delay during the password change event while the CCS connector timed out the connection until it connected to the local MS ADS DC.

Step 3c: CCS ADS Failover

If using SSL over TCP 636, confirm the ADS Domain Root Certificate is deployed to the MS Windows Server where the CCS service is deployed. If using SASL over TCP 389 (if available), then no additional effort is required.

If using SSL over TCP 636, use the MS tool certlm.msc to export the public root CA Certificate for this ADS Domain. Export to base64 format for import to the MS Windows host (if not already part of the ADS Domain) with the same MS tool certlm.msc.

Step 4a: Update the Connector Tier for the JCS component.

Add the stabilization parameter “maxWait” to the JCS/CCS configuration file. Recommend 10-30 seconds.

Step 4b: Update JCS registration to the IMPS Tier

You may use the Virtual Appliance Console, but this has a delay when pulling the list of any JCS connector that may be down at the time of the check/submission. If we use the Connector Xpress UI, we can accomplish the same process much faster, with additional flexibility for routing rules to the exact MS ADS Endpoints in the local data center.

Step 4c: Observe the IMPS routing to JCS via etatrans log during any transaction.

If any JCS service is unavailable (TCP 20411), then the routing rules process will report a value of 999.00, instead of a low value of 0.00-1.00.
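
A hypothetical watch one-liner for these scores; the etatrans log path/pattern is illustrative (on the vApp, an alias/function is typically used, as noted in Step 6c):

tail -F /opt/CA/IdentityManager/ProvisioningServer/logs/etatrans*.log | grep "999.00"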

Step 5: Update the Remote Password Change Agent (DLL) on MS ADS Domain Controllers (writable)

Step 6a: Validation of Self-Service Password Change to selected MS ADS Domain Controller.

Using various MS Active Directory processes, we can emulate a delegated or self-service password change early during the configuration cycle, to confirm deployment is correct. The below example uses MS Powershell to select a writable MS ADS Domain Controller to update a user’s password. We can then monitor the logs at all tiers for completion of this password change event.

A view of the password change event from the Reverse Password Sync Agent log file on the exact MS Domain Controller.

Step 6b: Validation of password change event via CCS ADS Log.

Step 6c: Validation of password change event via IMPS etatrans log

Note: The below screenshot showcases an alias/function to assist with monitoring the etatrans logs on the Virtual Appliance.

The below screenshot showcases using ldapsearch to check the before/after timestamps of a password change event within the MS Active Directory Domain.

We hope these notes are of some value to your business and projects.

Appendix

Using the MS Windows Server for CCS Server 

Get current status of AD account on select DC server before Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with current password.)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password05" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Change AD account's password via Powershell:
PowerShell Example:

Set-ADAccountPassword -Identity "idmpwtest" -Reset -NewPassword (ConvertTo-SecureString -AsPlainText "Password06" -Force) -Server dc2016.exchange.lab

Get current status of AD account on select DC server after Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with NEW password)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password06" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Using the Provisioning Server for password change event

Get current status of AD account on select DC server before Password Change:
LDAPSearch Example:   (From IMPS server - as user with current password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab'  -w  Password05   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged


Change AD account's password via ldapmodify & base64 conversion process:
LDAPModify Example:

BASE64PWD=`echo -n '"Password06"' | iconv -f utf8 -t utf16le | base64 -w 0`
ADSHOST='192.168.242.154'
ADSUSERDN='CN=Administrator,CN=Users,DC=exchange2012,DC=lab'
ADSPWD='Password01!'

ldapmodify -v -a -H ldaps://$ADSHOST:636 -D "$ADSUSERDN" -w "$ADSPWD" << EOF
dn: CN=idmpwtest,OU=People,dc=exchange2012,dc=lab 
changetype: modify
replace: unicodePwd
unicodePwd::$BASE64PWD
EOF

Get current status of AD account on select DC server after Password Change:
LDAPSearch Example:   (From IMPS server - with user's account and new password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab' -w  Password06   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged

ADS Endpoint Configuration Challenges and Hints

  1. Ensure the hostname entry is an FQDN or alias. It cannot be an IP address if MS Exchange is to be managed through this connector, due to a conflict between Kerberos authentication and IP addresses. If the object was created with an IP address, it may be changed via Jxplorer for two (2) attributes: eTADSprimaryServer and eTADSServerName, as sketched below.
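
A dxmodify sketch of that repair, mirroring the pool-size example later in this article; the endpoint name, FQDN, and credentials are illustrative:

LDAPTLS_REQCERT=never dxmodify -H ldap://192.168.242.135:20389 -x -D "eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta" -w Password01 << EOF
dn: eTADSDirectoryName=exchange2016,eTNamespaceName=ActiveDirectory,dc=im,dc=eta
changetype: modify
replace: eTADSprimaryServer
eTADSprimaryServer: dc2016.exchange.lab
-
replace: eTADSServerName
eTADSServerName: dc2016.exchange.lab
EOF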

2. General information on the ADS Endpoint Logging Tab and where this information is stored. Only two (2) of the Destinations have value with the current deployment, e.g. Text File & System Log (MS Windows Event viewer) for Active Directory (ADS). The “Text File” destination will output data to two (2) files: jcs\logs\ADS\<endpoint-name>.log and ccs\logs\ADS\<endpoint-name>.log

3. Use the MS Event Viewer on the ADS Domain Controller, or use it to remotely view the transactions on the remote ADS DC. Select the event codes of 627, 628, 4723, 4724, 4738 to start with; other useful codes may be added. Ref: https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/plan/appendix-l–events-to-monitor

4. Additionally, the User ID may be in one of three (3) formats: UPN (serviceid@exchange.lab), NT ( domain\serviceid ), LDAP DN ( cn=serviceid,ou=people,dc=exchange,dc=lab). We recommend UPN or NT format to allow the embedded API features for MS Exchange powershell management to correctly function. If the ID is to be changed, a password update must be done as well, since the User ID is part of the seed for the encrypted password for the service ID to be stored in CA Directory on the ADS endpoint object.

5. SASL versus TLS authentication checkboxes. We can test the ADS authentication availability using the ldapsearch binary. Ports used by Active Directory for authentication by client tools: https://docs.microsoft.com/en-us/troubleshoot/windows-server/identity/config-firewall-for-ad-domains-and-trusts

Note: SASL is encrypted traffic. If wireshark is used to intercept the traffic, the service ID may be seen during initial authentication, but NOT the password nor the payload data.

Notes on SASL validation for Active Directory. {Pro: No need to worry about TLS certificates rotation on client connections – all TLS is managed by the server}

:: Search the ADS/LDAP store for what is offered for SASL (use -x for a simple connection)
ldapsearch -x -h dc2016.exchange.lab -p 389 -b "" -LLL -s base supportedSASLMechanisms

EXAMPLE OUTPUT

[root@oracle ~]# ldapsearch -x -h dc2016.exchange.lab -p 389 -b "" -LLL -s base supportedSASLMechanisms
dn:
supportedSASLMechanisms: GSSAPI
supportedSASLMechanisms: GSS-SPNEGO
supportedSASLMechanisms: EXTERNAL
supportedSASLMechanisms: DIGEST-MD5

:: On Linux OS, execute rpm -qa to search for SASL installed modules/libraries.
rpm -qa | grep cyrus

EXAMPLE OUTPUT

[root@oracle ~]# rpm -qa | grep cyrus
cyrus-sasl-gssapi-2.1.26-23.el7.x86_64
cyrus-sasl-lib-2.1.26-23.el7.x86_64
cyrus-sasl-md5-2.1.26-23.el7.x86_64

:: On Linux OS, install missing SASL libraries & ldapsearch (ldap-client)
yum -y install cyrus-sasl-md5 cyrus-sasl-gssapi openldap-clients

:: Testing differing authentication mechanisms (may remove the -d9 debug switch to view cleaner results)

TLS

LDAPTLS_REQCERT=never ldapsearch -d9 -LLL -H ldaps://dc2016.exchange.lab:636 -w CAdemo123 -D "CN=Administrator,CN=Users,DC=exchange,DC=lab" -b "CN=Administrator,CN=Users,DC=exchange,DC=lab" -s base userAccountControl

Start TLS

LDAPTLS_REQCERT=never ldapsearch -d9 -Z -LLL -H ldap://dc2016.exchange.lab:389 -w CAdemo123 -D "CN=Administrator,CN=Users,DC=exchange,DC=lab" -b "CN=Administrator,CN=Users,DC=exchange,DC=lab" -s base userAccountControl

Digest-MD5

ldapsearch -d9 -LLL -H ldap://dc2016.exchange.lab -w CAdemo123 -Y DIGEST-MD5 -U Administrator -b "CN=Administrator,CN=Users,DC=exchange,DC=lab" -s base userAccountControl

Kerberos (GSS)

ldapsearch -d9 -LLL -H ldap://dc2016.exchange.lab -w CAdemo123 -Y GSSAPI -U Administrator -b "CN=Administrator,CN=Users,DC=exchange,DC=lab" -s base userAccountControl

6. TCP/UDP Ports required for Active Directory Endpoint management per CA Documentation https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/identity-manager/14-4/reference/default-ports-for-ca-identity-manager-and-associated-components.html

SASL appears to connect on TCP 636 briefly, then use TCP 389 extensively. Other ports are 80 (Service), 135 (lsass.exe for home folders), and 6405 (lsass.exe). If Kerberos authentication is defined for the service ID, then other ports will be used, e.g. 3268/3269. TCP 4104/4105 are for the legacy CAM/CAFT agents (typically not used anymore).

Recommendation: Add these TCP Ports to any Firewall between the IM JCS/CCS Server and the Active Directory Domain Controllers to improve performance and avoid time-out delays.
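
If the intermediate firewall happens to be a Linux firewalld host, a minimal sketch using the ports observed above (adjust the zone and port list to your environment):

firewall-cmd --permanent --add-port={80,135,389,636,3268,3269,6405}/tcp
firewall-cmd --reload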

MS Active Directory References on SASL.

https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-adts/989e0748-0953-455d-9d37-d08dfbf3998b

https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-adts/a98c1f56-8246-4212-8c4e-d92da1a9563b

Parallel provisioning for Active Directory and MS Exchange mailboxes – Improve Birthright/DayOne Access

One of the challenges that IAM/IAG solutions may have is using single-threaded processing for select endpoints. For the CA/Symantec Identity Management solution, before IM r14.3cp2, we lived with a single-threaded connector to manage MS Active Directory endpoints.

To address this challenge, we deployed multiple connector servers. We allowed the IM Provisioning Server (IMPS) to use a built-in round-robin approach of load-balancing separate transactions to different connector servers, which would service the same Active Directory endpoints.

The IME may be running as fast as it can with its clustered deployment, but as soon as a task involves MS Active Directory, there is a bottleneck with the CCS Service. We begin to see the IME JMS queue reporting that it is stuck, and the IME View Submitted Task reporting “In Progress” for all tasks. If the CCS service is restarted, all IME tasks are then reported as “Failed.”

This is/was the bottleneck for the solution for sites that have MS Active Directory for Birthright/DayOne Access.

We can now avoid this bottleneck. [*** (5/24/2021) – There is an enhancement to CP2 to address im_ccs.exe crashes during peak loads discovered using this testing process. ]

Via the newly delivered enhancement https://community.broadcom.com/participate/ideation-home/viewidea?IdeationKey=7154e15b-085d-469e-bff0-ac588ff6bd5b, we now have full parallel provisioning to MS Active Directory from a single connector server (JCS/CCS).

The new attribute that regulates this behavior is eTADSMaxConnectionsInPool. This attribute will be applied to every existing ADS endpoint that is currently being managed by the IM Provisioning Server after CP2 is deployed. Note: The default value is 10, but after much testing we recommend matching the IMPS->JCS and JCS->CCS pool values at 200.

During testing within the IME using Bulk Tasks or the IM BLC, we can see that the CCS-> ADS traffic will reach 20-30 connections if allowed. You may set this attribute to a value of 200 via Jxplorer and/or an ldapmodify/dxmodify script.

echo "############### SET ADS MAX CONNECTIONS IN POOL SIZE ##################"
IMPS_HOST=192.168.242.135
IMPS_PORT=20389
IMPS_USER='eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta'
IMPS_PWD="Password01"
NAMESPACE=exchange2016
LDAPTLS_REQCERT=never dxmodify -H ldap://$IMPS_HOST:$IMPS_PORT -c -x -D "$IMPS_USER" -w "$IMPS_PWD" << EOF
dn: eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=eta
changetype: modify
eTADSMaxConnectionsInPool: 200
EOF
LDAPTLS_REQCERT=never dxsearch -LLL -H ldap://$IMPS_HOST:$IMPS_PORT -x -D "$IMPS_USER" -w "$IMPS_PWD" -b "eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=eta" -s base eTADSMaxConnectionsInPool | perl -p00e 's/\r?\n //g'

To confirm the number of open connections is greater than one (1), we can issue a Bulk IM Task or use a performance tool like CA Directory dxsoak.

In this example, we will showcase using CA Directory dxsoak to execute 100 parallel threads to create 100 ADS accounts with MS Exchange mailboxes. We will also enclose this script for download for others to review and use.

Performance Lab:

Pre-Steps:

  1. Leverage CA Directory samples’ dxsoak binary (performance testing). You may wish to use CA Directory on an existing IM Provisioning Server (Linux OS) or you may deploy CA Directory (MS Windows version) to the JCS/CCS connector. Examples are provided for both OSes.
  2. Create LDIF files for the IM Provisioning Server and/or IM Connector Tier. These files are needed to ‘push’ the solution to failure. The use of the IME Bulk Task and/or etautil scripts to the IM Provisioning Tier will not provide the transaction speed we need to break the CCS service, if possible.
  3. Within the IM Provisioning Manager enable the ADS Endpoint TXT Logs on the Logging TAB, for all checkboxes.
  4. Monitor the IMPS etatrans* logs, monitor the JCS ADS logs, monitor the CCS ADS logs, monitor the number of CCS-> ADS (LDAP/S – TCP 389/636) threads. [Suggest using MS Sysinternals Process Explorer and select im_ccs.exe & then TCP/IP TAB]
  5. Monitor the MS ADS Domain via MS ADUC (AD Users & Computers UI) and MS Exchange Mailbox (Mailbox UI via Browser)

Execution:

6. Perform a UNIT TEST with dxmodify/ldapmodify to confirm the LDIF file input is correct with the correct suffix.

time dxmodify -H ldap://192.168.242.135:20389 -c -x -D "eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta" -w Password01 -f ads_user_with_exchange_dc_eta.ldif

7. Perform the PERFORMANCE TEST with dxsoak binary with the same LDIF file & correct suffix. Rate observed = 23 K ids/hr

./dxsoak -c -l 60 -t 100 -h 192.168.242.135:20389 -D "eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta" -w Password01 -f ads_user_with_exchange_dc_eta.ldif

Observations:

8. IMPS etatrans*.log – Count the number of operations per second. Note any RACE conditions and/or data collisions, e.g. ADS accounts deleted prior to an add across 100 threads, or ADS account creation attempted multiple times in different threads.

9. IM CCS ADS <endpoint>.log – Will only have useful data if the ADS Endpoint Logging TAB has been checked for TXT logs.

10. Finally, validate directly in the MS Active Directory domain with ADUC (or a similar tool) & confirm the MS Exchange mailboxes are being created/deleted.

11. Count the number of threads from im_ccs.exe to ADS – Suggest using MS Sysinternals Process Explorer tool and/or Powershell to count the number of connections.

MS Powershell script to count the number of LDAP (TCP 389) connections from im_ccs.exe. [Note: TCP 389 is used more if the ADS Endpoint is set up to use SASL authentication. TCP 636 is used more if the ADS Endpoint is using the older TLS authentication]

$i=1
Do {
cls
(Get-NetTCPConnection -State Established -OwningProcess (Get-Process -name im_ccs).id -RemotePort 389).count
Start-Sleep -s 1
$i++
}
while ($i -le 5)

Direct Performance Testing to JCS/CCS Service

While this testing has limited value, it can assist with troubleshooting any challenges. We can use the prior LDIF files with a slightly different suffix, dc=etasa (instead of dc=eta), to use dxsoak to push the connector tier to failure. This step helped provide memory dumps back to the CA/Symantec Engineering teams to help isolate challenges within the parallel processing. The CCS Service is only exposed via localhost; if you wish to test the CCS Service remotely, then update the MS Registry key for the CCS service to use the external IP address of the JCS/CCS Server. Rate observed = 25 K ids/hr
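
A quick reachability sketch for a remotely exposed CCS service, reusing the CCS values from the script below (non-TLS port; suffix dc=etasa):

ldapsearch -x -h 192.168.242.80 -p 20402 -D "cn=root,dc=etasa" -w Password01 -b "dc=etasa" -s base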

Script to generate 100 ADS Accounts with MS Exchange Mailbox Creation

You may wish to review this script and adjust it for your ADS / MS Exchange domains for testing. You can also create a simple LDIF file with password resets or ADS group membership adds. Just remember that the IMPS Service (TCP 20389/20390) uses the suffix dc=eta, and the IM JCS/CCS Services (TCP 20410/20411) & (TCP 20402/20403) use the suffix dc=etasa. Additionally, if using CA Directory dxsoak, only use the non-TLS ports, as this binary is not equipped for using TLS certs.

#!/bin/bash
#######################################################################################################################
# Name:  Generate ADS Feed Files for IM Solution Provisioning/Connector Tiers
#
# Goal:  Validate the new parallel processes from the IM Connector Tier to Active Directory with MS Exchange
#
#
# Generate ADS User LDIF file(s) for use with unit (dxmodify) and performance testing (dxsoak) to:
#  - {Note: dxsoak will only work with non-TLS ports}
#
# IM JCS (20410)  "dc=etasa"    {Ensure MS Windows Firewall allows this port to be exposed}
# IM CCS (20402)  "dc=etasa"    {This port is localhost only, may open to network traffic via registry update}
# IMPS (20389)    "dc=eta"
#
#
# Monitor:  
#
# The IMPS etatrans*.log  {exclude searches}
# The JCS daily log
# The JCS ADS log {Enable the ADS Endpoint TXT logging for all checkboxes}
# The CCS ADS log {Enable the ADS Endpoint TXT logging for all checkboxes}
#
# Execute per the examples provided during run of this file
#
#
# ANA 05/2021
#######################################################################################################################

# Unique Variables for an ADS Domain
NAMESPACE=exchange2016
ADSDOMAIN=exchange.lab
DCDOMAIN="DC=exchange,DC=lab"
OU=People

#######################################################################################################################


MAX=100
start=00001
counter=$start
echo "###############################################################"
echo "###############################################################"
START=`/bin/date --utc +%Y%m%d%H%M%S,%3N.0Z`
echo `/bin/date --utc +%Y%m%d%H%M%S,%3N.0Z`" = Current OS UTC time stamp"
echo "###############################################################"
FILE1=ads_user_with_exchange_dc_etasa.ldif
FILE2=ads_user_with_exchange_dc_eta.ldif
echo "" > $FILE1
while [ $counter -le $MAX ]
do
    n=$((10000+counter)); n=${n#1}
    tz=`/bin/date --utc +%Y%m%d%H%M%S,%3N.0Z`
   echo "Counter with leading zeros = $n   at time:  $tz"


cat << EOF >> $FILE1
dn:  eTADSAccountName=firstname$n aaalastname$n,eTADSOrgUnitName=$OU,eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=etasa
changetype: add
objectClass:  eTADSAccount
eTADSobjectClass:  user
eTADSAccountName:  firstname$n aaalastname$n
eTADSgivenName:  firstname$n
eTADSsn:  aaalastname$n
eTADSdisplayName:  firstname$n aaalastname$n
eTADSuserPrincipalName:  aaatestuser$n@$ADSDOMAIN
eTADSsAMAccountName:  aaatestuser$n
eTPassword:  Password01
eTADSpwdLastSet:  -1
eTSuspended:  0
eTADSuserAccountControl:  0000000512
eTADSDescription:  description $tz
eTADSphysicalDeliveryOfficeName:  office
eTADStelephoneNumber:  111-222-3333
eTADSmail:  aaatestuser$n@$ADSDOMAIN
eTADSwwwHomePage:  web.page.lab
eTADSotherTelephone:  111-222-3333
eTADSurl:  other.web.page.lab
eTADSstreetAddress:  street address line01
eTADSpostOfficeBox:  pobox 111
eTADSl:  city
eTADSst:  state
eTADSpostalCode:  11111
eTADSco:  UNITED STATES
eTADSc:  US
eTADScountryCode:  840
eTADSscriptPath:  loginscript.cmd
eTADSprofilePath:  \profile\path\here
eTADShomePhone:  111-222-3333
eTADSpager:  111-222-3333
eTADSmobile:  111-222-3333
eTADSfacsimileTelephoneNumber:  111-222-3333
eTADSipPhone:  111-222-3333
eTADSinfo:  Notes Here
eTADSotherHomePhone:  111-222-3333
eTADSotherPager:  111-222-3333
eTADSotherMobile:  111-222-3333
eTADSotherFacsimileTelephoneNumber:  111-222-3333
eTADSotherIpPhone:  111-222-3333
eTADStitle:  title
eTADSdepartment:  department
eTADScompany:  company
eTADSmanager:  CN=manager_fn manager_ln,OU=$OU,$DCDOMAIN
eTADSmemberOf:  CN=Backup Operators,CN=Builtin,$DCDOMAIN
eTADSlyncSIPAddressOption: 0000000000
eTADSdisplayNamePrintable: aaatestuser$n
eTADSmailNickname: aaatestuser$n
eTADShomeMDB: (Automatic Mailbox Distribution)
eTADShomeMTA: CN=DC001,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=First Organization,CN=Microsoft Exchange,CN=Services,CN=Configuration,$DCDOMAIN
eTAccountStatus: A
eTADSmsExchRecipientTypeDetails: 0000000001
eTADSmDBUseDefaults: TRUE
eTADSinitials: A
eTADSaccountExpires: 9223372036854775807

EOF
 counter=$(( $counter + 00001 ))
done


#  Create the delete ADS Process
start=00001
counter=$start
while [ $counter -le $MAX ]
do
    n=$((10000+counter)); n=${n#1}
    tz=`/bin/date --utc +%Y%m%d%H%M%S,%3N.0Z`
   echo "Counter with leading zeros = $n   at time:  $tz"


cat << EOF >> $FILE1
dn:  eTADSAccountName=firstname$n aaalastname$n,eTADSOrgUnitName=$OU,eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=etasa
changetype: delete

EOF
 counter=$(( $counter + 00001 ))
done

echo ""
echo "################################### ADS USER OBJECT STATS ################################################################"
echo "Number of add objects: `grep "changetype: add" $FILE1 | wc -l`"
echo "Number of delete objects: `grep "changetype: delete" $FILE1 | wc -l`"
rm -rf $FILE2
cp -r -p $FILE1 $FILE2
sed -i 's|,dc=im,dc=etasa|,dc=im,dc=eta|g' $FILE2
ls -lart $FILE1
ls -lart $FILE2

echo ""
echo "################################### SET ADS MAX CONNECTIONS IN POOL SIZE ################################################################"
IMPS_HOST=192.168.242.135
IMPS_PORT=20389
IMPS_USER='eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta'
IMPS_PWD="Password01"
LDAPTLS_REQCERT=never dxmodify  -H ldap://$IMPS_HOST:$IMPS_PORT -c -x -D "$IMPS_USER" -w "$IMPS_PWD"  << EOF
dn: eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=eta
changetype: modify
eTADSMaxConnectionsInPool: 200
EOF
LDAPTLS_REQCERT=never dxsearch -LLL  -H ldap://$IMPS_HOST:$IMPS_PORT -x -D "$IMPS_USER" -w "$IMPS_PWD" -b "eTADSDirectoryName=$NAMESPACE,eTNamespaceName=ActiveDirectory,dc=im,dc=eta" -s base eTADSMaxConnectionsInPool | perl -p00e 's/\r?\n //g'

echo ""
echo "################################### CCS UNIT & PERF TEST ################################################################"
CCS_HOST=192.168.242.80
CCS_PORT=20402
CCS_USER="cn=root,dc=etasa"
CCS_PWD="Password01"
echo "Execute this command to the CCS Service to test single thread with dxmodify or ldapmodify"
echo "dxmodify  -H ldap://$CCS_HOST:$CCS_PORT -c -x -D $CCS_USER -w $CCS_PWD -f $FILE1 "
echo "Execute this command to the CCS Service to test 100 threads with dxsoak "
echo "./dxsoak -c -l 60 -t 100 -h $CCS_HOST:$CCS_PORT -D $CCS_USER -w $CCS_PWD -f $FILE1 "

echo ""
echo "################################### JCS UNIT & PERF TEST ################################################################"
CCS_HOST=192.168.242.80
CCS_PORT=20410
CCS_USER="cn=root,dc=etasa"
CCS_PWD="Password01"
echo "Execute this command to the JCS Service to test single thread with dxmodify or ldapmodify "
echo "dxmodify  -H ldap://$CCS_HOST:$CCS_PORT -c -x -D $CCS_USER -w $CCS_PWD -f $FILE1 "
echo "Execute this command to the JCS Service to test 100 threads with dxsoak "
echo "./dxsoak -c -l 60 -t 100 -h $CCS_HOST:$CCS_PORT -D $CCS_USER -w $CCS_PWD -f $FILE1 "


echo ""
echo "################################### IMPS UNIT & PERF TEST ################################################################"
IMPS_HOST=192.168.242.135
IMPS_PORT=20389
IMPS_USER='eTGlobalUserName=etaadmin,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta'
IMPS_PWD="Password01"
echo "Execute this command to the IMPS Service to test single thread with dxmodify or ldapmodify "
echo "dxmodify  -H ldap://$IMPS_HOST:$IMPS_PORT -c -x -D \"$IMPS_USER\" -w $IMPS_PWD -f $FILE2 "
echo "Execute this command to the IMPS Service to test 100 threads with dxsoak "
echo "./dxsoak -c -l 60 -t 100 -h $IMPS_HOST:$IMPS_PORT -D \"$IMPS_USER\" -w $IMPS_PWD -f $FILE2 "



Address the new bottleneck of MS Exchange / O365 Provisioning.

After parallel provisioning is introduced with the new im_ccs.exe service, you may notice that the number of transactions is still being throttled during performance testing.

Out of the box, the MS Exchange global throttling policy has the parameter PowerShellMaxConcurrency set to a default of 18 connections. Any provisioning that uses MS PowerShell for MS Exchange and/or MS O365 will be impacted by this default parameter.

To address this bottleneck, we can create a new throttling policy and assign it only to the service ID that will be managing identities, avoiding a global change.

Example:

New-ThrottlingPolicy MaxPowershell -PowerShellMaxConcurrency 100
Set-Mailbox "User Name" -ThrottlingPolicy MaxPowershell
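To confirm the new policy value took effect, a hedged check from the Exchange Management Shell (standard Exchange cmdlets; the policy name matches the example above):

Get-ThrottlingPolicy MaxPowershell | Format-List Name,PowerShellMaxConcurrency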

After this change has been made, restart the IM JCS/CCS services and retest with your performance tools. Review the CCS ADS log for the number of creations in 60 seconds, and you will be pleasantly surprised at the rate. The logs are the strong confirmation we are looking for.

Performance test: 947 ADS accounts with Exchange mailboxes in 60 seconds (08:59:54 to 09:00:53) => a rate of ~15 ids/second (947/60 ≈ 15.8, or ~54 K ids/hr) with the updated MaxPowershell = 100 throttling policy.

The last bottleneck appears to be CPU availability for the MS Exchange supporting service w3wp.exe, the MS IIS worker process, which appears to manage the MS PowerShell connections per its startup string:

" c:\windows\system32\inetsrv\w3wp.exe -ap "MSExchangePowerShellAppPool" -v "v4.0" -c "C:\Program Files\Microsoft\Exchange Server\V15\bin\GenericAppPoolConfigWithGCServerEnabledFalse.config" -a \.\pipe\iisipme304c50e-6b42-4b26-83a4-229ee037be5d -h "C:\inetpub\temp\apppools\MSExchangePowerShellAppPool\MSExchangePowerShellAppPool.config" -w "" -m 0"

LDAP MITM Methodology to isolate data challenges

The Symantec (CA/Broadcom) Directory solution provides a mechanism for routing LDAPv3 traffic to other solutions. This routing mechanism allows Symantec Directory to act as a virtual directory service for other directories, e.g., MS Active Directory, SunOne, Novell eDirectory, etc.


The Symantec Identity Suite solution uses the LDAP protocol for its mid-tier and connector-tier components. The Provisioning Server is exposed on TCP 20389/20390, the JCS (Java Connector Server) is exposed on TCP 20410/20411, and the CCS (C++ Connector Server) is exposed on TCP 20402/20403.


We wished to isolate provisioning data challenges within the Symantec Identity Management solution that were not fully viewable using the existing debugging logs & features of the provisioning and connector tiers. Using Symantec Directory, we can leverage the routing mechanism to build a MITM (man-in-the-middle) methodology to track all LDAP traffic through the Symantec Identity Manager connector tier.


We focused on the final leg of provisioning and created a process to track the JCS -> CCS LDAP traffic. We wanted to understand what and how the data was being sent from the JCS to the CCS to isolate issues to the CCS service and MS Active Directory. Using the trace level of Symantec Directory, we can capture all LDAP traffic, including binds/queries/add/modify actions.

The steps below showcase how to use Symantec Directory as an approved MITM process for troubleshooting exercises. We found this process more valuable than deploying Wireshark on the JCS/CCS server and decoding the encrypted LDAP traffic.

Background:

Symantec Directory documentation on routing. Please note the concept/feature of "set transparent-routing = true;" to avoid schema challenges when routing to other directory/LDAP solutions.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-distribution-and-routing.html

MITM Methodology for JCS->CCS Service:

The Symantec Identity Management connector tier may be deployed on MS Windows or Linux OS. If the CCS service is being used, then MS Windows OS is required for this MS Visual C++ component/service. As we are focused on the CCS service, we will introduce the Symantec Directory solution on the same MS Windows OS.

NOTE: We will keep the MITM process contained on a single host, and will not redirect the network traffic beyond the host.

Step 1: Deploy the latest Symantec Directory solution on MS Windows OS. This deployment is a blank slate for the next steps to follow.

Step 2: Copy the folders of schema, limits, and ssld from an existing Symantec Directory deployment of the Symantec Identity Manager solution. Using the existing schema files, references, and certificates will allow us to avoid any challenges during startup of the Router DSA due to the pre-defined provisioning/connector tier configurations. Please note that when copying from a Linux OS version of Symantec Directory, we will need to update the paths from Linux format to MS Windows format in the ssld impd.dxc file for the "cert-dir" and "ca-file" parameters.
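A hedged sketch of this copy with MS Windows robocopy (the UNC source is a placeholder for wherever the existing deployment's DXserver config folders were staged):

robocopy "\\STAGING_HOST\dxshare\config\schema" "C:\Program Files\CA\Directory\dxserver\config\schema" /E
robocopy "\\STAGING_HOST\dxshare\config\limits" "C:\Program Files\CA\Directory\dxserver\config\limits" /E
robocopy "\\STAGING_HOST\dxshare\config\ssld" "C:\Program Files\CA\Directory\dxserver\config\ssld" /E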

# DXserver/config/ssld/impd.dxc

set ssl = {
    cert-dir = "C:\Program Files\CA\Directory\dxserver\config\ssld\personalities"
    ca-file = "C:\Program Files\CA\Directory\dxserver\config\ssld\impd_trusted.pem"
    cipher = "HIGH:!SSLv2:!EXP:!aNULL:!eNULL"
    #protocol = tlsv12
    fips = false
};

Step 3: Create a new Router DSA DXI configuration file. This is the primary configuration file for the Symantec Directory DSA; it references the schema, knowledge, limits, and certificates for the DSA. Note the "transparent-routing" parameter, which avoids schema challenges with other solutions, and the trace level used to capture LDAP traffic in the Symantec Directory Router DSA trace log.

# DXserver/config/servers/admin_router_ccs_30402.dxi

# logging and tracing 
close summary-log; 
close trace-log; 
source "../logging/default.dxc"; 
 
# schema 
clear schema; 
source "../schema/impd.dxg";
 
# access controls 
clear access; 
# source "../access/"; 
 
# ssld
source "../ssld/impd.dxc";

# knowledge 
clear dsas; 
source "../knowledge/admin_router_ccs_group.dxg"; 
 
# operational settings 
source "../settings/default.dxc"; 
 
# service limits 
source "../limits/impd.dxc"; 

# database  - none - transparent router
set transparent-routing=TRUE;

# tunnel through eAdmin server error code and  messages
set route-non-compliant-ldap-error-codes = true;

set trace=ldap,time,stats;
#set trace=dsa,time;

Step 4: Create the three (3) knowledge files. The "group" knowledge file sources the other two (2) knowledge files: one for the Router DSA itself and one for the redirect DSA that points to the CCS service.

# DXserver/config/knowledge/admin_router_ccs_group.dxg 
# The admin_router_ccs_30402.dxc PORT 30402 
# will be used for the IAMCS (JCS) CCS port override configuration file
# server_ccs.properties via proxyConnectionConfig.proxyServerPort=30402

source "admin_router_ccs_30402.dxc";
source "admin_ccs_server_01.dxc";
# DXserver/config/knowledge/admin_router_ccs_30402.dxc 
# This file is sourced by admin_router_ccs_group.dxg.
 
set dsa admin_router_ccs_30402 =  
{ 
    prefix        = <> 
    dsa-name      = <dc etasa><cn admin_router_ccs_30402> 
    dsa-password  = "secret"
    address       = ipv4 localhost port 30402
    snmp-port     = 22500
    console-port  = 22501
    auth-levels   = clear-password
    dsp-idle-time = 100000 
    trust-flags = allow-check-password, trust-conveyed-originator
    link-flags    = ssl-encryption-remote
};

# DXserver/config/knowledge/admin_ccs_server_01.dxc
# This file is sourced by admin_router_ccs_group.dxg.

set dsa admin_ccs_server_01 =  
{ 
     prefix        = <dc etasa> 
     dsa-name      = <dc etasa><cn admin_ccs_server_01> 
     dsa-password  = "secret"
     address       = ipv4 localhost port 20402
     auth-levels   = clear-password
     dsp-idle-time = 100000
     dsa-flags     = load-share
     trust-flags   = allow-check-password, no-server-credentials, trust-conveyed-originator
     link-flags    = dsp-ldap
     #link-flags    = dsp-ldap, ssl-encryption
     # Note:  ssl will require update to /etc/hosts with:  <IP_Address>  eta_server

};

Step 5: Update the JCS configuration file that contains the TCP port we will redirect to. In this example, we declare TCP 30402 as the new port.

#C:\Program Files (x86)\CA\Identity Manager\Connector Server\jcs\conf\override\server_ccs.properties

ccsWindowsController.ccsScriptPath=C:\\Program Files (x86)\\CA\\Identity Manager\\Connector Server\\ccs\\bin
proxyCCSManager.enabled=true
proxyCCSManager.startupWait=30
proxyConnectionConfig.proxyServerHostname=localhost
#proxyConnectionConfig.proxyServerPort=20402
proxyConnectionConfig.proxyServerPort=30402
proxyConnectionConfig.proxyServerUser=cn=root,dc=etasa
proxyConnectionConfig.proxyServerPassword={AES}pbj27RvWGakDKCr+DhRH4Q==
proxyConnectionConfig.proxyServerUseSsl=false
proxyCCSManager.controller.ref=ccsWindowsController

Overview of all files updated and their relationship to each other.

Validation

Start up the solution in the following order. Ensure that the new Symantec Directory Router DSA starts with no issues. If there are any syntax issues, isolate them with the debug command: dxserver -d start DSA_NAME.

Start the Router DSA first, then restart the im_jcs (JCS) service. The im_ccs (CCS) service will be auto-started by the JCS service. Wait one (1) minute, then check that both TCP ports 20402 (CCS) and 30402 (Router DSA) are in the LISTEN state. If you do not see both ports, stop and restart these services. A hedged sketch of this sequence follows below.
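From an elevated command prompt on the connector host (the Windows service name im_jcs is taken from the text above and is an assumption for your install):

dxserver start admin_router_ccs_30402
net stop im_jcs
net start im_jcs
rem wait ~60 seconds for the JCS to auto-start the im_ccs service
netstat -ano | findstr ":30402 :20402"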

You may use MS Sysinternals Process Explorer to monitor both services and, via the TCP/IP tab, view which ports are in use.

A view of the im_ccs.exe and dxserver.exe services and which TCP ports they are listening on.

Use a 3rd-party LDAP client tool, such as Jxplorer, to authenticate to both the CCS and Router DSA ports with the embedded service ID of "cn=root,dc=etasa". We should see exactly the SAME data on both. A hedged command-line equivalent follows below.
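A command-line spot-check with the dxsearch client from the new Directory install, querying the root DSE on each port (the password mirrors the CCS_PWD value used earlier in this post and is an assumption for your environment):

dxsearch -LLL -H ldap://localhost:20402 -x -D "cn=root,dc=etasa" -w Password01 -b "" -s base "(objectClass=*)"
dxsearch -LLL -H ldap://localhost:30402 -x -D "cn=root,dc=etasa" -w Password01 -b "" -s base "(objectClass=*)"

Both commands should return identical results.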

Use the IME or IMPS to perform a query to MS Active Directory (or any other endpoint that uses the CCS connector tier). We should now see the "cache" on the CCS service populated with the endpoint information and the base DN structure. We can now track all LDAP traffic through the Router DSA MITM process.

View of trace logs

We can monitor when the JCS first binds to the CCS service.

We can monitor when the IMPS, via the JCS, queries whether the CCS is aware of the ADS endpoint.

Finally, we can view when the IMPS service decrypts its stored information on the Active Directory endpoint and pushes this information to the CCS cache to allow communication to MS Active Directory. Using Notepad++, we can tail the trace log.

Please note, this is a secure LDAP/S tunnel from the IMPS -> JCS -> CCS -> MS ADS.

We can now view how this data is pushed via this secure tunnel with the MITM process.

> [88] 
> [88] <-- #1 LDAP MESSAGE messageID 5
> [88] AddRequest
> [88]  entry: eTADSDirectoryName=ads2016,eTNamespaceName=ActiveDirectory,dc=im,dc=etasa
> [88]  attributes:
> [88]   type: eTADSobjectCategory
> [88]   value: CN=Domain-DNS,CN=Schema,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSdomainFunctionality
> [88]   value: 7
> [88]   type: eTADSUseSSL
> [88]   value: 3
> [88]   type: eTADSexchangeGroups
> [88]   value: CN=Mailbox Database 0840997559,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=im,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTLogWindowsEventSeverity
> [88]   value: FE
> [88]   type: eTAccountResumable
> [88]   value: 1
> [88]   type: eTADSnetBIOS
> [88]   value: EXCHANGE
> [88]   type: eTLogStdoutSeverity
> [88]   value: FE
> [88]   type: eTLog
> [88]   value: 0
> [88]   type: eTLogUnicenterSeverity
> [88]   value: FE
> [88]   type: eTADSlockoutDuration
> [88]   value: -18000000000
> [88]   type: objectClass
> [88]   value: eTADSDirectory
> [88]   type: eTLogETSeverity
> [88]   value: FE
> [88]   type: eTADSmsExchSystemObjectsObjectVersion
> [88]   value: 13240
> [88]   type: eTADSsettings
> [88]   value: 3
> [88]   type: eTADSconfig
> [88]   value: ExpirePwd=0
> [88]   value: HomeDirInheritPermission=0
> [88]   type: eTLogDestination
> [88]   value: F
> [88]   type: eTADSUserContainer
> [88]   value: CN=BuiltIn;CN=Users
> [88]   type: eTADSbackupDirs
> [88]   value: 000;DEFAULT;192.168.242.156;0
> [88]   value: 001;DEFAULT;dc2016.exchange.lab;0
> [88]   value: 002;site1;server1.domain.com;0
> [88]   value: 003;site1;server2.domain.com;0
> [88]   value: 004;site2;server3.domain.com;0
> [88]   value: 005;site2;server4.domain.com;0
> [88]   type: eTADSuseFailover
> [88]   value: 1
> [88]   type: eTLogAuditSeverity
> [88]   value: FE
> [88]   type: eTADS-DefaultContext
> [88]   value: exchange.lab
> [88]   type: eTADSforestFunctionality
> [88]   value: 7
> [88]   type: eTADSAuthDN
> [88]   value: Administrator
> [88]   type: eTADSlyncMaxConnection
> [88]   value: 5
> [88]   type: eTADShomeMTA
> [88]   value: CN=Microsoft MTA,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSAuthPWD
> [88]   value: CAdemo123
> [88]   type: eTADSexchangelegacyDN
> [88]   value: /o=ExchangeLab/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=EXCHANGE2016/cn=Microsoft Private MDB
> [88]   type: eTLogFileSeverity
> [88]   value: F
> [88]   type: eTADSprimaryServer
> [88]   value: dc2016.exchange.lab
> [88]   type: eTADScontainers
> [88]   value: CN=Builtin,DC=exchange,DC=lab
> [88]   value: CN=Computers,DC=exchange,DC=lab
> [88]   value: OU=Domain Controllers,DC=exchange,DC=lab
> [88]   value: OU=Explore,DC=exchange,DC=lab
> [88]   value: CN=ForeignSecurityPrincipals,DC=exchange,DC=lab
> [88]   value: CN=Keys,DC=exchange,DC=lab
> [88]   value: CN=Managed Service Accounts,DC=exchange,DC=lab
> [88]   value: OU=Microsoft Exchange Security Groups,DC=exchange,DC=lab
> [88]   value: OU=o365,DC=exchange,DC=lab
> [88]   value: OU=People,DC=exchange,DC=lab
> [88]   value: CN=Program Data,DC=exchange,DC=lab
> [88]   value: CN=Users,DC=exchange,DC=lab
> [88]   value: DC=ForestDnsZones,DC=exchange,DC=lab
> [88]   value: DC=DomainDnsZones,DC=exchange,DC=lab
> [88]   type: eTADSTimeBoundMembershipsEnabled
> [88]   value: 0
> [88]   type: eTADSexchange
> [88]   value: 1
> [88]   type: eTADSdomainControllerFunctionality
> [88]   value: 7
> [88]   type: eTADSexchangeStores
> [88]   value: CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=Mailbox,CN=Transport Configuration,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=Frontend,CN=Transport Configuration,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSKeepCamCaftFiles
> [88]   value: 0
> [88]   type: eTADSmsExchSchemaVersion
> [88]   value: 15333
> [88]   type: eTADSCamCaftTimeout
> [88]   value: 0000001800
> [88]   type: eTADSMaxConnectionsInPool
> [88]   value: 0000000101
> [88]   type: eTADSPortNum
> [88]   value: 389
> [88]   type: eTADSDCDomain
> [88]   value: DC=exchange,DC=lab
> [88]   type: eTADSServerName
> [88]   value: 192.168.242.156
> [88]   type: eTADSDirectoryName
> [88]   value: ads2016
> [88]   type: eTAccountDeletable
> [88]   value: 1
> [88] controls:
> [88]   controlType: 2.16.840.1.113730.3.4.2
> [88]   non-critical

We can now monitor all traffic and assist with troubleshooting any CCS/MS-ADS challenges.
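If Notepad++ is not at hand, a hedged command-line filter over the Router DSA trace log (the logs directory and *.trc file pattern below are assumptions; the actual trace-log name is set by the logging configuration sourced in the DXI file):

findstr /C:"BindRequest" /C:"SearchRequest" /C:"AddRequest" /C:"ModifyRequest" "C:\Program Files\CA\Directory\dxserver\logs\*.trc"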

This same MITM methodology/process may also be used for the IMPS (TCP 20389/20390) and the JCS (TCP 20410/20411) services. We have used this process to capture the IME (JIAM) LDAP traffic to the IMPS service and isolate multiple queries for child Provisioning Roles; the product team has used these findings to enhance the solution and lower the startup duration of the IME in the latest releases.

Binds/queries/adds/modifications all work with this approach, but we do see an issue with an OID during the IMPS ADS endpoint "explore process" on the ADS OU object. We are reviewing how to address this last challenge, which reports "critical extension is unavailable" for an LDAP control property of the OU object. The OIDs captured appear to be related to SunOne/iPlanet.

Authenticate to vApp ‘dsa’ user ID via ssh private key

The Symantec (CA) Identity Suite includes the Symantec (CA) Directory. This component is installed under the ‘dsa’ service ID. On the virtual appliance, this ‘dsa’ service ID does not have a password defined, and therefore no login is allowed.

As an enhancement, we would like to add an SSH private key to allow authentication to the 'dsa' service ID from other virtual appliances and from desktops with various tools, e.g. Putty, MobaXterm, WinSCP, etc. This enhancement will allow for a streamlined process to address out-of-sync Directory DATA DSAs with scp/rsync copies, without intermediate file shares or the use of other service IDs.

Challenge:

The virtual appliance of Symantec (CA) Identity Suite r14.3 is built on CentOS 6.4. The OpenSSH version on this OS apparently does not produce a private key format that can be used by desktop tools or by PuttyGen (the key-conversion tool). However, the private key may be used between vApp servers if using the FQDN (fully qualified domain name). We noted during testing that localhost is not allowed, because localhost is not defined in the SSHD configuration's "AllowUsers" property.

On newer virtual appliances vApp r14.4 with CentOS 8 Stream, this challenge does not exist, and we can use the OpenSSH private key, id_rsa, with the desktop tools as-is.

To assist with this challenge and streamline the process, we have the following three (3) options:

Option 1: On newer OS, use OpenSSH process

After creating the private key, .ssh/id_rsa, cat this file out to Notepad and save it for use with the desktop tools.

Generate the OpenSSH private/public key pair. The final command in the one-liner below will help validate that this private key may be used for server-to-server communication.

echo y | ssh-keygen -t rsa -b 4096 -N Password02 -C "$USER@$HOSTNAME" -f .ssh/id_rsa ; ls -lart .ssh ; cat .ssh/id_rsa ; cat .ssh/id_rsa.pub >> .ssh/authorized_keys ; chmod 600 .ssh/authorized_keys ; ssh -v -i .ssh/id_rsa $USER@`hostname`

Option 2: Skip the OpenSSH process, use PuttyGen

On any OS (new/old), just use the PuttyGen tool to generate the private key, and update the key comment/passphrase. After the private key is created, copy the TEXT labeled "Public key for pasting into OpenSSH authorized_keys file" and do just that; then you may use the associated private key, id_rsa.ppk, with the desktop tools for the 'dsa' service ID.

Option 3: Combination of processes/tools

Important: .ssh/authorized_keys is updated (appended to) and not overwritten; see the sketch below.
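A minimal bash sketch of that append-not-overwrite rule for the 'dsa' account (the public-key string is a placeholder for the text copied from PuttyGen):

PUBKEY='ssh-rsa AAAA...== dsa@vapp'   # placeholder: paste the PuttyGen public-key text here
mkdir -p ~/.ssh && chmod 700 ~/.ssh
grep -qxF "$PUBKEY" ~/.ssh/authorized_keys 2>/dev/null || echo "$PUBKEY" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys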

Be kind to your auditors – Streamline Ad Hoc Reports

One of the challenges that IAM/IAG teams may have every few months is delivery or access for internal/external auditors to validate access within the IAM/IAG system and their managed endpoints.

Usually, auditors may directly access the 100's of systems/endpoints/applications and randomly select a few, or export the entire directory structure to review access. This effort takes time and possibly 100's of entitlements to grant temporary/expiring access to view. Auditors also prefer Excel or CSV files to review, rather than fixed documents (PDF), so they can filter and isolate what interests them.

One process that may have value for your team is to use a tool with export functionality to CSV/XLS and the ability to query the 100's-1000's of systems from a single entry point.

A tool that we have found valuable over the years is SoftTerra LDAP Browser.

https://www.ldapadministrator.com/softerra-ldap-browser.htm

The multiple benefits from this tool for IAM/IAG are:

  1. It is a read-only tool, so no mistakes can be made by granting too much access.
  2. It has the ability to save queries that are popular and can be copied from other tools.
  3. It has the ability to export the queries to CSV/XLS formats (plus others).
  4. It can be used to pull reports from an IAM/IAG solution via its directory ports.
  5. It can be used to pull reports from the managed applications (on-prem or SaaS) via the IAM provisioning directory ports.
  6. The tool is free from SoftTerra; it is a limited version of their LDAP Administrator tool.

Example of the SoftTerra LDAP Browser tool used to query Active Directory, LDAP user stores, and the Provisioning User Store & managed endpoints/applications.

A view to export Service Now (SNOW) accounts via the CA/Symantec Identity Manager Provisioning Server/Service (TCP 20390) via the LDAP/S protocol.

Why? The provisioning server may be viewed as a virtual directory/pass-through directory to the managed endpoints via its connector tier.

The image below shows SoftTerra LDAP Browser used to connect to the Provisioning Server (TCP 20390), then navigate to a Service Now (SNOW) managed endpoint to query all accounts and their respective profiles & entitlements. This same report/extract process may be done for mainframe/AS400 and client-server applications, e.g., Active Directory, Unix, databases, etc.

Enhance this process with defense-in-depth

We will not use the primary default administration account of the provisioning tier, "etaadmin", since this account has full access to change data.

Within the IAM/IAG solution, create an auditor account.

In the example below, we create a new Global User with the name "auditor", a description, a password, and a local "read-only admin profile" with an expiration date. This will allow the auditors to use the account as they wish (or you may grant this "read-only admin profile" directly to their existing Global User ID). The account may still follow the same password reset/expiration processes. If the account is marked as "restricted" in the CA/Symantec IM solution, then this account is limited in how it may be changed, to avoid any unexpected sync challenges to managed endpoints (if it was correlated to other accounts).

After the new Global User is created (or an existing ID is added to the Admin Profile "ReadAdministrator"), update the SoftTerra credentials for the Provisioning Service. Below, the new DN with "auditor" is shown in the credentials for the login ID, e.g. "eTGlobalUserName=auditor,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta". A command-line spot-check is sketched below.
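A hedged spot-check of the new read-only credential, modeled on the dxsearch usage earlier in this post (the host and password values are placeholders):

LDAPTLS_REQCERT=never dxsearch -LLL -H ldaps://IMPS_HOST:20390 -x -D "eTGlobalUserName=auditor,eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta" -w AUDITOR_PWD -b "eTGlobalUserContainerName=Global Users,eTNamespaceName=CommonObjects,dc=im,dc=eta" "(eTGlobalUserName=auditor)"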

Now, the auditors may run as many reports as they would like, and export to spreadsheets or PDF files using a read-only account with a read-only tool.

Honorable mentions for other query tools.

Jxplorer is a useful & free Java-based tool for reports, but it is a full edit tool & only exports to LDIF format. http://jxplorer.org/

Apache Directory Studio is another very useful & free Java-based tool for reports. This is a full edit tool, and it has the ability to export to many different formats. Since this tool does NOT need an MS Windows installer, if the desktop prevents installation, this is typically our 2nd choice to use. Extract and use the current Java on the MS Windows OS, or download AdoptOpenJDK and extract it to use with Apache Directory Studio. https://directory.apache.org/studio/ & https://adoptopenjdk.net/

SoftTerra LDAP Administrator is a paid and full edit tool. It has the same look-n-feel of the SoftTerra LDAP Browser tool. It is typically used by administrators of various LDAP solutions. We recommend this tool for your larger sites or if you would like a fast responsive tool on MS Windows OS. https://www.ldapadministrator.com/

If you have other recommendations, please leave a response.

Bonus Feature – SoftTerra AD Authentication

Both of the SoftTerra tools allow binding with your existing authentication (on your desktop/laptop) into Active Directory. There is no need to create an additional user ID for the auditors or yourself.

Perhaps the O365 or Outlook contacts process is not robust or is too slow, or perhaps you wish you had a more detailed view of your internal Active Directory, e.g., to view a manager's direct reports. You can use this feature to view the non-privacy attributes of all accounts in your domain with a read-only tool.

Step 01: Open a command-line prompt on your desktop/workstation after you have authenticated to your Active Directory domain & type: set | findstr LOGONSERVER
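Illustrative output (DC2016 is a placeholder for your nearest domain controller):

LOGONSERVER=\\DC2016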

Step 02: Install SoftTerra LDAP Browser Tool & Create a new profile

Step 03: Type the name of the Active Directory LOGONSERVER (aka Domain Controller) into the following fields & ensure “Use Secure Connection (SSL)” is selected (to avoid query issues).

Step 04: Click Next until you see "User Authentication Information", then select the radio button for "Currently logged on user (Active Directory)", then click the Finish button.

Step 05: After the profile is built, click on the profile and watch it expand into a tree display of Active Directory. Select the branch that you believe has the list of users you would like to view, then select an individual user account to see the populated values.

Step 06: If you wish to export this data to a spreadsheet (CSV/XLS), right-click on the left object and select the export option.

Step 07: You will have a series of options to export to & the file name it will write to.

Step 08: Advanced search and export process. Select the branch that holds all the users you wish to view and export. Note: If the branch has 10,000 objects, this process may take minutes to complete, depending on the query.

Step 09: The following search window will appear to help you create, save, and export your queries. Note that if you start to type in the field name, a list of matching fields will appear.

Step 10: Ensure the FILTER is properly formed (use Google to assist) and that the attributes you wish to view or export are defined, then click Search. If you are satisfied with your search, use "Save Results" to export to a spreadsheet (CSV/XLS) or another format.