Leveling Up: The Imperative of Upgrading Your Symantec Identity Suite Virtual Appliance to 14.5 (Centos Stream 9) for Robust Randomness, Enhanced Jitterentropy, and Bouncy Castle Entropy Insights

In the intricate world of cybersecurity and identity management, evolving threats and vulnerabilities demand our undivided attention. When considering upgrading your Symantec Identity Suite Virtual Appliance, understanding the nuanced technological landscape, including the perks of Jitterentropy and the challenges associated with Java’s Bouncy Castle entropy, can make a world of difference.

The Technological Need:

  1. Robust Randomness with Jitterentropy: Relying on the natural timing jitter of CPUs, Jitterentropy has emerged as a game-changing hardware random number generator (RNG). The latest renditions of the Symantec Identity Suite Virtual Appliance leverage this RNG, ensuring unparalleled randomness, making decoding by potential threats a herculean task.
  2. Operational Efficiency: Upgrades tuned with contemporary features promise optimized performance. Coupled with Jitterentropy, the RNG processes are turbocharged, promising minimal downtime and an elevated user experience.
  3. Challenges with Bouncy Castle Entropy in Java: Bouncy Castle, despite its vast utility in cryptographic operations in Java, has had its share of entropy-related issues. Some known problems include:
  • Predictability: Certain RNG implementations in Bouncy Castle have been found to be somewhat predictable, which could compromise security.
  • Seed Reuse: There have been instances where seeds were reused, which again poses security concerns.
  • Slow Entropy Accumulation: At times, the entropy collection is slower than expected, leading to potential operational delays. With security solutions the lack of entropy impacts scale and usability.

Business Justification for Rapid Response:

With the business landscape in perpetual flux, the right tech decisions can spell the difference between stagnation and growth:

  1. Enhanced Security: Incorporating Linux OS with Jitterentropy is synonymous with state-of-the-art security. Such forward-thinking measures drastically curtail potential security breaches.
  2. Cost Savings: Forward-looking upgrades, especially those that incorporate cutting-edge features like Jitterentropy, offer tangible long-term financial advantages. Fewer breaches, reduced system errors, and saved manual efforts contribute positively to the bottom line.
  3. Staying Competitive: In an era of rapid technological advancements, integrating elements like Jitterentropy ensures you’re leading from the front.
  4. Compliance and Regulatory Adherence: With cybersecurity standards constantly on the move, staying updated is non-negotiable. Evade potential legal issues and hefty fines by staying on top of these norms.
  5. Customer Trust: By showcasing a commitment to data safety through advanced systems (and by addressing known entropy issues like those in Bouncy Castle), businesses can strengthen customer trust and foster long-term loyalty.

Validating Jitterentropy Integration in the Linux Kernel: A Comprehensive Guide

As the world of Linux continues to evolve, one exciting development is the incorporation of jitterentropy into the kernel. This robust hardware random number generator (RNG) enhances the quality of randomness, making our systems even more secure. If you’re keen on understanding, implementing, or validating this feature in your Linux setup, this guide is tailored just for you.

What is Jitterentropy?

Jitterentropy is an RNG based on the natural timing jitter that occurs in CPUs. In the realm of cybersecurity, RNGs are of paramount importance; they generate the random numbers pivotal for cryptographic operations. The less predictable these numbers are, the tougher it becomes for malicious actors to crack them.

Why is Jitterentropy Essential?

For systems relying on cryptographic functions, such as encryption, the RNG’s caliber can’t be overstated. Jitterentropy guarantees first-rate randomness, upping your system’s security game. https://www.chronox.de/jent.html

How to Validate Jitterentropy Integration:

  1. Identify Your Kernel Version:
    Kick things off by determining your kernel version using the uname -r or uname -acommand.
   uname -r

This will provide insights into your system’s hostname, kernel version, build date, and architecture. You can deterermine if your Linux kernel is greater than 5.6, when entropy functionality was added directly to the kernel. https://github.com/torvalds/linux/commit/3f2dc2798b81531fd93a3b9b7c39da47ec689e55

  1. Is Jitterentropy Part of Your Kernel Configuration?:
    Deploy this simple grep command to figure out if jitterentropy is enabled in your kernel:
   grep -HRin jitter /boot/config*

An output showing CONFIG_CRYPTO_JITTERENTROPY=y confirms that jitterentropy is enabled. The “y” here indicates that the feature is in-built in the kernel.

  1. Time-Driven Testing for Jitterentropy:
    By simulating multiple pulls from the entropy source, you can gauge how efficient jitterentropy is:
   time for i in {1..1000}; do time dd if=/dev/random bs=1 count=16 2>/dev/null | base64; done

This command performs two functions:

  • It times each of the 1000 pulls from /dev/random, allowing you to measure the average time taken, basically emulating 1000 rapid password changes of 16 characters.
  • It provides an overall timing for 1000 pulls, letting you know the total duration for the entire operation. If your system remains responsive and completes the pulls swiftly, it’s a strong indication that your entropy source is in prime working condition. Which implies that any solution on the appliance has adequate entropy to service users and processes to scale.

Another command that add counters to see that 1000 iteration have passed. Note, if there is no entropy pump, this process will NOT succeed. The Linux OS entropy will be rapidly depleted and any solution on the host will be delayed. Ensure there is an entropy pump to keep the performance you need.

counter=1;MAX=1000;time while [ $counter -le $MAX ]; do echo "##########  $counter ##########" ; time dd if=/dev/random bs=16 count=1 2> /dev/null | base64; counter=$(( $counter + 1 )); done;

Wrapping Up:

The integration of Jitterentropy in the Linux kernel underscores the open-source community’s relentless dedication to fortifying security. By understanding, testing, and leveraging it, you ensure that your system is bolstered against potential threats, always staying a step ahead in the cybersecurity arena. Keep exploring, stay updated, and most importantly, remain secure!

Review upgrade your Symantec Identity Suite to improve your performance for users and scale to millions of transactions.

For non-appliances or older Linux OS (Kernel release < 5.6):

Review adding the haveged or jitterentropy packages to your Linux OS, to avoid delays to any business processes. See prior blog discussing entropy, of how adding an entropy pump to your Linux OSes has value. https://anapartner.com/2021/06/25/the-hidden-cost-of-entropy-to-your-business/

Global Password Reset

The recent DNS challenges for a large organization that impacted their worldwide customers bring to mind a project we completed this year, a global password reset redundancy solution.

We worked with a client who desired to manage unplanned WAN outages to their five (5) data centers for three (3) independent MS Active Directory Domains with integration to various on-prem applications/ endpoints. The business requirement was for self-service password sync, where the users’ password change process is initialed/managed by the two (2) different MS Active Directory Password Policies.

Without the WAN outage requirement, any IAM/IAG solution may manage this request within a single data center. A reverse password sync agent process is enabled on all writable MS Active Directory domain controllers (DC). All the world-wide MS ADS domain controllers would communicate to the single data center to validate and resend this password change to all of the users’ managed endpoint/application accounts, e.g. SAP, Mainframe (ACF2/RACF/TSS), AS/400, Unix, SaaS, Database, LDAP, Certs, etc.

With the WAN outage requirement, however, a queue or components must be deployed/enabled at each global data center, so that password changes are allowed to sync locally to avoid work-stoppage and async-queued to avoid out-of-sync password to the other endpoint/applications that may be in other data centers.

We were able to work with the client to determine that their current IAM/IAG solution would have the means to meet this requirement, but we wished to confirm no issues with WAN latency and the async process. The WAN latency was measured at less than 300 msec between remote data centers that were opposite globally. The WAN latency measured is the global distance and any intermediate devices that the network traffic may pass through.

To review the solution’s ability to meet the latency issues, we introduced a test environment to emulate the global latency for deployment use-cases, change password use-cases, and standard CrUD use-cases. There is a feature within VMWare Workstation, that allows emulation of degraded network traffic. This process was a very useful planning/validation tool to lower rollback risk during production deployment.

VMWare Workstation Network Adapter Advance Settings for WAN latency emulation

The solution used for the Global Password Rest solution was Symantec Identity Suite Virtual Appliance r14.3cp2. This solution has many tiers, where select components may be globally deployed and others may not.

We avoided any changes to the J2EE tier (Wildfly) or Database for our architecture as these components are not supported for WAN latency by the Vendor. Note: We have worked with other clients that have deployment at two (2) remote data centers within 1000 km, that have reported minimal challenges for these tiers.

We focused our efforts on the Provisioning Tier and Connector Tier. The Provisioning Tier consists of the Provisioning Server and Provisioning Directory.

The Provisioning Server has no shared knowledge with other Provisioning Servers. The Provisioning Directory (Symantec Directory) is where the provisioning data may be set up in a multi-write peer model. Symantec Directory is a proper X.500 directory with high redundancy and is designed to manage WAN latency between remote data centers and recovery after an outage. See example provided below.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-replication.html

The Connector Tier consists of the Java Connector Server and C++ Connector Server, which may be deployed on MS Windows as an independent component. There is no shared knowledge between Connector Servers, which works in our favor.

Requirement:

Three (3) independent MS Active Directory domain in five (5) remote data centers need to allow self-service password change & allow local password sync during a WAN outage. Passwords changes are driven by MS ADS Password Policies (every N days). The IME Password Policy for IAG/IAM solution is not enabled, IME authentication is redirected to an ADS domain, and the IMPS IM Callback Feature is disabled.

Below is an image that outlines the topology for five (5) global data centers in AMER, EMEA, and APAC.

The flow diagram below captures the password change use-case (self-service or delegated), the expected data flow to the user’s managed endpoints/applications, and the eventual peer sync of the MS Active Directory domain local to the user.

Observation(s):

The standalone solution of Symantec IAG/IAM has no expected challenges with configurations, but the Virtual Appliance offers pre-canned configurations that may impact a WAN deployment.

During this project, we identified three (3) challenges using the virtual appliance.

Two (2) items needed the assistance of the Broadcom Support and Engineering teams. They were able to work with us to address deployment configuration challenges with the “check_cluster_clock_sync -v ” process that incorrectly increments time delays between servers instead of resetting a value of zero between testing between servers.

Why this is important? The “check_cluster_clock_sync” alias is used during auto-deployment of vApp nodes. If the time reported between servers is > 15 seconds then replication may fail. This time check issue was addressed with a hotfix. After the hot-fix was deployed, all clock differences were resolved.

The second challenge was a deployment challenge of the IMPS component for its embedded “registry files/folders”. The prior embedded copy process was observed to be using standard “scp”. With a WAN latency, the scp copy operation may take more than 30 seconds. Our testing with the Virtual Appliance showed that a simple copy would take over two (2) minutes for multiple small files. After reviewing with CA support/engineering, they provided an updated copy process using “rsync” that speeds up copy performance by >100x. Before this update, the impact was provisioning tier deployment would fail and partial rollback would occur.

The last challenge we identified was using the Symantec Directory’s embedded features to manage WAN latency via multi-write HUB groups. The Virtual Appliance cannot automatically manage this feature when enabled in the knowledge files of the provisioning data DSAs. Symantec Directory will fail to start after auto-deployment.

Fortunately, on the Virtual appliance, we have full access to the ‘dsa’ service ID and can modify these knowledge files before/after deployment. Suppose we wish to roll back or add a new Provisioning Server Virtual Appliance. In that case, we must disable the multi-write HUB group configuration temporarily, e.g. comment out the configuration parameter and re-init the DATA DSAs.

Six (6) Steps for Global Password Reset Solution Deployment

We were able to refine our list of steps for deployment using pre-built knowledge files and deployment of the vApp nodes in blank slates with the base components of Provisioning Server (PS) and Provisioning Directory) with a remote MS Windows server for the Connector Server (JCS/CCS).

Step 1: Update Symantec Directory DATA DSA’s knowledge configuration files to use the multiple group HUB model. Note that multi-write group configuration is enabled within the DATA DSA’s *.dxc files. One Directory servers in each data center will be defined as a “HUB”.

Ref: https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-groups-hubs/topology-sample-and-disaster-recovery.html

To assist this configuration effort, we leveraged a serials of bash shell scripts that could be pasted into multiple putty/ssh sessions on each vApp to replace the “HUB” string with a “sed” command.

After the HUB model is enabled (stop/start the DATA DSAs), confirm that delayed WAN latency has no challenge with Symantec Directory sync processes. By monitoring the Symantec Directory logs during replication, we can see that sync operation with the WAN latency is captured with the delay > 1 msecs between data centers AMER1 and APAC1.

Step 2: Update IMPS configurations to avoid delays with Global Password Reset solution.

Note for this architecture, we do not use external IME Password Policies. We ensure that each AD endpoint has the checkbox enabled for “Password synchronization agent is installed” & each Global User (GU) has “Enable Password Synchronization Agent” checkbox enabled to prevent data looping. To ensure this GU attribute is always enabled, we updated an attribute under “Create Users Default Attributes”.

Step 3a: Update the Connector Tier (CCS Component)

Ensure that the MS Windows Environmental variables for the CCS connector are defined for Failover (ADS_FAILOVER) and Retry (ADS_RETRY).

Step 3b: Update the CCS DNS knowledge file of ADS DCs hostnames.

Important Note: Avoid using the refresh feature “Refresh DC List” within the IMPS GUI for the ADS Endpoint. If this feature is used, then a “merge” will be processed from the local CCS DNS file contents and what is defined within the IMPS GUI refresh process. If we wish to manage the redirection to local MS ADS Domain Controllers, we need to control this behavior. If this step is done, we can clean out the Symantec Directory of extra entries. The only negative aspect is the local password change may attempt to communicate to one of the remote MS ADS Domain Controllers that are not within the local data center. During a WAN outage, a user would notice a delay during the password change event while the CCS connector timed out the connection until it connected to the local MS ADS DC.

Step 3c: CCS ADS Failover

If using SSL over TCP 636 confirm the ADS Domain Root Certificate is deployed to the MS Windows Server where the CCS service is deployed. If using SASL over TCP 389 (if available), then no additional effort is required.

If using SSL over TCP 636, use the MS tool certlm.msc to export the public root CA Certificate for this ADS Domain. Export to base64 format for import to the MS Windows host (if not already part of the ADS Domain) with the same MS tool certlm.msc.

Step 4a: Update the Connector Tier for the JCS component.

Add the stabilization parameter “maxWait” to the JCS/CCS configuration file. Recommend 10-30 seconds.

Step 4b: Update JCS registration to the IMPS Tier

You may use the Virtual Appliance Console, but this has a delay when pulling the list of any JCS connector that may be down at this time of the check/submission. If we use the Connector Xpress UI, we can accomplish the same process much faster with additional flexibility for routing rules to the exact MS ADS Endpoints in the local data center.

Step 4c: Observe the IMPS routing to JCS via etatrans log during any transaction.

If any JCS service is unavailable (TCP 20411), then the routing rules process will report a value of 999.00, instead of a low value of 0.00-1.00.

Step 5: Update the Remote Password Change Agent (DLL) on MS ADS Domain Controllers (writable)

Step 6a: Validation of Self-Service Password Change to selected MS ADS Domain Controller.

Using various MS Active Directory processes, we can emulate a delegated or self-service password change early during the configuration cycle, to confirm deployment is correct. The below example uses MS Powershell to select a writable MS ADS Domain Controller to update a user’s password. We can then monitor the logs at all tiers for completion of this password change event.

A view of the password change event from the Reverse Password Sync Agent log file on the exact MS Domain Controller.

Step 6b: Validation of password change event via CCS ADS Log.

Step 6c: Validation of password change event via IMPS etatrans log

Note: Below screenshot showcases alias/function to assist with monitoring the etatrans logs on the Virtual Appliance.

Below screen shot showcases using ldapsearch to check timestamps for before/after of password change event within MS Active Directory Domain.

We hope these notes are of some value to your business and projects.

Appendix

Using the MS Windows Server for CCS Server 

Get current status of AD account on select DC server before Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with current password.)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password05" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Change AD account's password via Powershell:
PowerShell Example:

Set-ADAccountPassword -Identity "idmpwtest" -Reset -NewPassword (ConvertTo-SecureString -AsPlainText "Password06" -Force) -Server dc2016.exchange.lab

Get current status of AD account on select DC server after Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with NEW password)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password06" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Using the Provisioning Server for password change event

Get current status of AD account on select DC server before Password Change:
LDAPSearch Example:   (From IMPS server - as user with current password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab'  -w  Password05   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged


Change AD account's password via ldapmodify & base64 conversion process:
LDAPModify Example:

BASE64PWD=`echo -n '"Password06"' | iconv -f utf8 -t utf16le | base64 -w 0`
ADSHOST='192.168.242.154'
ADSUSERDN='CN=Administrator,CN=Users,DC=exchange2012,DC=lab'
ADSPWD='Password01!’

ldapmodify -v -a -H ldaps://$ADSHOST:636 -D "$ADSUSERDN" -w "$ADSPWD" << EOF
dn: CN=idmpwtest,OU=People,dc=exchange2012,dc=lab 
changetype: modify
replace: unicodePwd
unicodePwd::$BASE64PWD
EOF

Get current status of AD account on select DC server after Password Change:
LDAPSearch Example:   (From IMPS server - with user's account and new password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab' -w  Password06   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged

The hidden cost of Entropy to your business

On Linux OS, there are two (2) device drivers that provide entropy “noise” for components that require encryption, e.g. the /dev/random and the /dev/urandom device drivers. The /dev/random is a “blocking” device driver. When the “noise” is low, any component that relies on this driver will be “stalled” until enough entropy is returned. We can measure the entropy from a range of 0-4096. Where a value over 1000 is excellent. Any value in the double or single digits will impact the performance of the OS and solutions with delays. The root cause of these delays is not evident during troubleshooting, and typically there are no warning nor error messages related to entropy.

watch -n 1 cat /proc/sys/kernel/random/entropy_avail

The Symantec Identity Suite solution, when deployed on Linux OS is typically deployed with the JVM switch -Djava.security.egd=file:/dev/./urandom for any component that uses Java (Oracle or AdoptOpenJDK), e.g. Wildfly (IM/IG/IP) and IAMCS (JCS). This JVM variable is sufficient for most use-cases to manage the encryption/hash needs of the solution.

However, for any component that does not provide a mechanism to use the alternative of /dev/urandom driver, the Linux OS vendors offer tools such as the “rng-tools” package. We can review what OS RNGD service is available using package tools, e.g.

dnf list installed | grep -i rng

If the Symantec Identity Suite or other solutions are deployed as standalone components, then we may adjust the Linux OS as we need with no restrictions to add the RNGD daemon as we wish. One favorite is the HAVEGED daemon over the default OS RNGD.

See prior notes on value and testing for Entropy on Linux OS (standalone deployments):

https://community.broadcom.com/enterprisesoftware/communities/community-home/digestviewer/viewthread?GroupId=2197&MID=720771&CommunityKey=f9d65308-ca9b-48b7-915c-7e9cb8fc3295&tab=digestviewer

https://community.broadcom.com/HigherLogic/System/DownloadDocumentFile.ashx?DocumentFileKey=7747b411-2e1e-4bc2-8284-9b8856790ef9

Challenge for vApp

The challenge for Virtual Appliances is that we are limited to what functionality the Symantec Product Team provides for us to leverage. The RNGD service was available on the vApp r14.3, but was disabled for OS challenges with 100% utilization with CentOS 6.4. The service is still installed, but the actual binary is non-executable.

https://knowledge.broadcom.com/external/article/97774/ca-identity-suite-low-entropy-on-virtual.html
https://knowledge.broadcom.com/external/article/139759/ca-identity-suite-142-vapp-rngd-proces.html
https://broadcom-stage.adobecqms.net/us/en/symantec-security-software/identity-security/identity-suite/14-3/virtual-appliance/administering-virtual-appliance/using-the-login-shell.html

A new Virtual Appliance patch would be required to re-enable this RNGD on vApp r14.3cp2. We have access via sudo, to /sbin/chkconfig, /sbin/service to re-enable this service, but as the binary is not executable, we cannot progress any further. We can see the alias in the documentation still exist, but the OS alias was removed in the cp2 update.

However, since vApp r14.4 was release, we can focus on this Virtual Appliance which is running Centos 8 stream. The RNGD service here is disabled (masked) but can be re-enabled for our use with the sudo command. There is no current documented method for RNGD on vApp r14.4 at this time, but the steps below will show an approved way using the ‘config’ userID and sudo commands.

Confirm that the “rng-tools” package is installed and that the RNGD binary is executable. We can also see that the RNGD service is “masked”. Masked services are prevented from starting manually or automatically as an extra safety measure when we wish for tighter control over our systems.

If we test OS entropy for this vApp r14.4 server without RNGD, we can monitor how a simple BASH shell script that emulates a password being generated will impact the “entropy” of /dev/random. The below script will reduce the entropy to low numbers. This process will now impact the OS itself and any components that reference /dev/random. We can observe with “lsof /dev/random” that the java programs will still reference /dev/random; even though most activity is going to /dev/urandom.

Using the time command in the BASH shell script, we can see that the response is rapid for the first 20+ iterations, but as soon as the entropy is depleted, each execution is delayed by 10-30x times.

counter=1;MAX=100;while [ $counter -le $MAX ]; do echo "##########  $counter ##########" ; time dd if=/dev/random bs=8 count=1 2> /dev/null | base64; counter=$(( $counter + 1 )); done;

Enable RNGD on vApp r14.4 & Testing

Now let’s see what RNGD service will do for us when it is enabled. Let’s follow the steps below to unmask, enable, and start the RNGD service as the ‘config’ userID. We have access to sudo to the Centos 8 Stream command of /sbin/systemctl.

sudo /usr/bin/systemctl status rngd.service
ls -lart /etc/systemd/system/rngd.service
sudo /usr/bin/systemctl unmask rngd.service
sudo /usr/bin/systemctl enable rngd.service
cat /usr/lib/systemd/system/rngd.service
sudo /usr/bin/systemctl start rngd.service
sudo /usr/bin/systemctl status rngd.service
ps -ef | grep rngd | grep -v grep

After the RNGD service is enabled, test again with the same prior BASH shell script but bump the loops to 1000 or higher. Note using the time command we can see that each loop finishes within a fraction of a second.

counter=1;MAX=1000;while [ $counter -le $MAX ]; do echo "##########  $counter ##########" ; time dd if=/dev/random bs=8 count=1 2> /dev/null | base64; counter=$(( $counter + 1 )); done;

Summary

Aim to keep the solution footprint small and the right-sized to solve the business’ needs. Do not accept the default performance; avoid over-purchasing to scale to your expected growth.

Use the JVM switch wherever there is a java process, e.g. BLC or home-grown ETL (extract-transform-load) processes.

-Djava.security.egd=file:/dev/./urandom

If you suspect a dependence may impact the OS or other processes on /dev/random, then enable the OS RNGD and perform your testing. Monitor with the top command to ensure RNGD service is providing value and not impacting the solution.

LDAP MITM Methodology to isolate data challenge

The Symantec (CA/Broadcom) Directory solution provides a mechanism for routing LDAPv3 traffic to other solutions. This routing mechanism allows Symantec Directory to act as a virtual directory service for other directories, e.g., MS Active Directory, SunOne, Novell eDirectory, etc.


The Symantec Identity Suite solution uses the LDAP protocol for its mid-tier and connector-tier components. The Provisioning Server is exposed on TCP 20389/20390, the JCS (Java Connector Server) is exposed on TCP 20410/20411, and the CCS (C++ Connector Server) is exposed on TCP 20402/20403.


We wished to isolate provisioning data challenges within the Symantec Identity Management solution that was not fully viewable using the existing debugging logs & features of the provisioning tier & connector tiers. Using Symantec Directory, we can leverage the routing mechanism to build a MITM (man-in-the-middle) methodology to track all LDAP traffic through the Symantec Identity Manager connector tier.


We focused on the final leg of provisioning and created a process to track the JCS -> CCS LDAP traffic. We wanted to understand what and how the data was being sent from the JCS to the CCS to isolate issues to the CCS service and MS Active Directory. Using the trace level of Symantec Directory, we can capture all LDAP traffic, including binds/queries/add/modify actions.

The below steps showcase how to use Symantec Directory as an approved MITM process for troubleshooting exercises. We found this process more valuable than deploying Wireshark on the JCS/CCS Server and decoding the encrypted traffic for LDAP.

Background:

Symantec Directory documentation on routing. Please note the concept / feature of “set transparent-routing = true;” to avoid schema challenges when routing to other directory/ldap solutions.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-distribution-and-routing.html

MITM Methodology for JCS->CCS Service:

The Symantec Identity Management connector tier may be deployed on MS Windows or Linux OS. If the CCS service is being used, then MS Windows OS is required for this MS Visual C++ component/service. As we are focused on the CCS service, we will introduce the Symantec Directory solution on the same MS Windows OS.

NOTE: We will keep the MITM process contained on a single host, and will not redirect the network traffic beyond the host.

Step 1: Deploy the latest Symantec Directory solution on MS Windows OS. This deployment is a blank slate for the next steps to follow.

Step 2: Copy the folders of schema, limits, and ssld from an existing Symantec Directory deployment of the Symantec Identity Manager solution. Using the existing schema files, references, and certificates will allow us to avoid any challenges during startup of the Router DSA due to the pre-defined provisioning/connector tier configurations. Please note when copying from a Linux OS version of Symantec Directory, we will need to update the path from Linux format to MS Windows format in the SSLD impd.dxc file for “cert-dir” and “ca-file” parameters.

# DXserver/config/ssld/impd.dxc

set ssl = {
cert-dir = "C:\Program Files\CA\Directory\dxserver\config\ssld\personalities"
ca-file = "C:\Program Files\CA\Directory\dxserver\config\ssld\impd_trusted.pem"
cipher = "HIGH:!SSLv2:!EXP:!aNULL:!eNULL"
#protocol = tlsv12
fips = false
};

Step 3: Create a new Router DSA DXI configuration file. This is the primary configuration file for Symantec Directory DSA. It will referenced the schema, knowledge, limits, and certificates for the DSA. Note the parameters for “transparent-routing” to avoid schema challenges with other solutions. Note the trace level used to trace the LDAP traffic in the Symantec Directory Router DSA trace log.

# DXserver/config/servers/admin_router_ccs_30402.dxi

# logging and tracing 
close summary-log; 
close trace-log; 
source "../logging/default.dxc"; 
 
# schema 
clear schema; 
source "../schema/impd.dxg";
 
# access controls 
clear access; 
# source "../access/"; 
 
# ssld
source "../ssld/impd.dxc";

# knowledge 
clear dsas; 
source "../knowledge/admin_router_ccs_group.dxg"; 
 
# operational settings 
source "../settings/default.dxc"; 
 
# service limits 
source "../limits/impd.dxc"; 

# database  - none - transparent router
set transparent-routing=TRUE;

# tunnel through eAdmin server error code and  messages
set route-non-compliant-ldap-error-codes = true;

set trace=ldap,time,stats;
#set trace=dsa,time;

Step 4: Create the three (3) knowledge files. The “group” knowledge file will be used to redirect to the other two (2) knowledge files of the router DSA and the re-direct DSA to the CCS service.

# DXserver/config/knowledge/admin_router_ccs_group.dxg 
# The admin_router_ccs_30402.dxc PORT 30402 
# will be used for the IAMCS (JCS) CCS port override configuration file
# server_ccs.properties via proxyConnectionConfig.proxyServerPort=30402

source "admin_router_ccs_30402.dxc";
source "admin_ccs_server_01.dxc";
# DXserver/config/knowledge/admin_router_ccs_30402.dxc 
# This file is sourced by admin_router_ccs_group.dxg.
 
set dsa admin_router_ccs_30402 =  
{ 
    prefix        = <> 
    dsa-name      = <dc etasa><cn admin_router_ccs_30402> 
    dsa-password  = "secret"
    address       = ipv4 localhost port 30402
    snmp-port     = 22500
    console-port  = 22501
    auth-levels   = clear-password
    dsp-idle-time = 100000 
    trust-flags = allow-check-password, trust-conveyed-originator
    link-flags    = ssl-encryption-remote
};
# DXserver/config/knowledge/admin_ccs_server_01.dxc
# This file is sourced by admin_router_ccs_group.dxg.

set dsa admin_ccs_server_01 =  
{ 
     prefix        = <dc etasa> 
     dsa-name      = <dc etasa><cn admin_ccs_server_01> 
     dsa-password  = "secret"
     address       = ipv4 localhost port 20402
     auth-levels   = clear-password
     dsp-idle-time = 100000
     dsa-flags     = load-share
     trust-flags   = allow-check-password, no-server-credentials, trust-conveyed-originator
     link-flags    = dsp-ldap
     #link-flags    = dsp-ldap, ssl-encryption
     # Note:  ssl will require update to /etc/hosts with:  <IP_Address>  eta_server

};

Step 5: Update the JCS configuration file that contains the TCP port that we will be redirecting to. In this example, we will declare TCP 30402 to be the new port.

#C:\Program Files (x86)\CA\Identity Manager\Connector Server\jcs\conf\override\server_ccs.properties

ccsWindowsController.ccsScriptPath=C:\\Program Files (x86)\\CA\\Identity Manager\\Connector Server\\ccs\\bin
proxyCCSManager.enabled=true
proxyCCSManager.startupWait=30
proxyConnectionConfig.proxyServerHostname=localhost
#proxyConnectionConfig.proxyServerPort=20402
proxyConnectionConfig.proxyServerPort=30402
proxyConnectionConfig.proxyServerUser=cn=root,dc=etasa
proxyConnectionConfig.proxyServerPassword={AES}pbj27RvWGakDKCr+DhRH4Q==
proxyConnectionConfig.proxyServerUseSsl=false
proxyCCSManager.controller.ref=ccsWindowsController

Overview of all files updated and their relationship to each other.

Validation

Start up the solution in the following order. Ensure that the new Symantec Directory Router DSA is starting with no issue. If there are any syntax issues, isolate them with the command: dxserver -d start DSA_NAME.

Start the Router DSA first, then restart the im_jcs (JCS) service. The im_ccs (CCS) service will be auto-started by the JCS service. Wait one (1) minute, then check that both TCP Ports 20402 (CCS) and 30402 (Router DSA) are both in the LISTEN state. If we do not see these both ports, please stop and restart these services.

May use MS Sysinternals ProcessExplorer to monitor both services and using the TCP/IP tab, to view which ports are being used.

A view of the im_ccs.exe and dxserver.exe services and which TCP ports they are listening on.

Use a 3rd party LDAP client tool, such as Jxplorer to authenticate to both the CCS and the Router DSA ports, with the embedded service ID of “cn=root,dc=etasa”. We should see exactly the SAME data.

Use the IME or IMPS to perform a query to MS Active Directory (or any other endpoint that uses the CCS connector tier). We should now see the “cache” on the CCS service be populated with the endpoint information, and the base DN structure. We can now track all LDAP traffic through the Router DSA MITM process.

View of trace logs

We can monitor when the JCS first binds to the CCS service.

We can monitor when the IMPS via the JCS queries if the CCS is aware of the ADS endpoint.

Finally, we can view when the IMPS service decrypt its stored information on the Active Directory endpoint, and push this information to the CCS cache, to allow communication to MS Active Directory. Using Notepad++ we can tail the trace log.

Please note, this is a secure LDAP/S tunnel from the IMPS -> JCS -> CCS -> MS ADS.

We can now view how this data is pushed via this secure tunnel with the MITM process.

> [88] 
> [88] <-- #1 LDAP MESSAGE messageID 5
> [88] AddRequest
> [88]  entry: eTADSDirectoryName=ads2016,eTNamespaceName=ActiveDirectory,dc=im,dc=etasa
> [88]  attributes:
> [88]   type: eTADSobjectCategory
> [88]   value: CN=Domain-DNS,CN=Schema,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSdomainFunctionality
> [88]   value: 7
> [88]   type: eTADSUseSSL
> [88]   value: 3
> [88]   type: eTADSexchangeGroups
> [88]   value: CN=Mailbox Database 0840997559,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=im,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTLogWindowsEventSeverity
> [88]   value: FE
> [88]   type: eTAccountResumable
> [88]   value: 1
> [88]   type: eTADSnetBIOS
> [88]   value: EXCHANGE
> [88]   type: eTLogStdoutSeverity
> [88]   value: FE
> [88]   type: eTLog
> [88]   value: 0
> [88]   type: eTLogUnicenterSeverity
> [88]   value: FE
> [88]   type: eTADSlockoutDuration
> [88]   value: -18000000000
> [88]   type: objectClass
> [88]   value: eTADSDirectory
> [88]   type: eTLogETSeverity
> [88]   value: FE
> [88]   type: eTADSmsExchSystemObjectsObjectVersion
> [88]   value: 13240
> [88]   type: eTADSsettings
> [88]   value: 3
> [88]   type: eTADSconfig
> [88]   value: ExpirePwd=0
> [88]   value: HomeDirInheritPermission=0
> [88]   type: eTLogDestination
> [88]   value: F
> [88]   type: eTADSUserContainer
> [88]   value: CN=BuiltIn;CN=Users
> [88]   type: eTADSbackupDirs
> [88]   value: 000;DEFAULT;192.168.242.156;0
> [88]   value: 001;DEFAULT;dc2016.exchange.lab;0
> [88]   value: 002;site1;server1.domain.com;0
> [88]   value: 003;site1;server2.domain.com;0
> [88]   value: 004;site2;server3.domain.com;0
> [88]   value: 005;site2;server4.domain.com;0
> [88]   type: eTADSuseFailover
> [88]   value: 1
> [88]   type: eTLogAuditSeverity
> [88]   value: FE
> [88]   type: eTADS-DefaultContext
> [88]   value: exchange.lab
> [88]   type: eTADSforestFunctionality
> [88]   value: 7
> [88]   type: eTADSAuthDN
> [88]   value: Administrator
> [88]   type: eTADSlyncMaxConnection
> [88]   value: 5
> [88]   type: eTADShomeMTA
> [88]   value: CN=Microsoft MTA,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSAuthPWD
> [88]   value: CAdemo123
> [88]   type: eTADSexchangelegacyDN
> [88]   value: /o=ExchangeLab/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=EXCHANGE2016/cn=Microsoft Private MDB
> [88]   type: eTLogFileSeverity
> [88]   value: F
> [88]   type: eTADSprimaryServer
> [88]   value: dc2016.exchange.lab
> [88]   type: eTADScontainers
> [88]   value: CN=Builtin,DC=exchange,DC=lab
> [88]   value: CN=Computers,DC=exchange,DC=lab
> [88]   value: OU=Domain Controllers,DC=exchange,DC=lab
> [88]   value: OU=Explore,DC=exchange,DC=lab
> [88]   value: CN=ForeignSecurityPrincipals,DC=exchange,DC=lab
> [88]   value: CN=Keys,DC=exchange,DC=lab
> [88]   value: CN=Managed Service Accounts,DC=exchange,DC=lab
> [88]   value: OU=Microsoft Exchange Security Groups,DC=exchange,DC=lab
> [88]   value: OU=o365,DC=exchange,DC=lab
> [88]   value: OU=People,DC=exchange,DC=lab
> [88]   value: CN=Program Data,DC=exchange,DC=lab
> [88]   value: CN=Users,DC=exchange,DC=lab
> [88]   value: DC=ForestDnsZones,DC=exchange,DC=lab
> [88]   value: DC=DomainDnsZones,DC=exchange,DC=lab
> [88]   type: eTADSTimeBoundMembershipsEnabled
> [88]   value: 0
> [88]   type: eTADSexchange
> [88]   value: 1
> [88]   type: eTADSdomainControllerFunctionality
> [88]   value: 7
> [88]   type: eTADSexchangeStores
> [88]   value: CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=Mailbox,CN=Transport Configuration,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   value: CN=Frontend,CN=Transport Configuration,CN=EXCHANGE2016,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=ExchangeLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=exchange,DC=lab
> [88]   type: eTADSKeepCamCaftFiles
> [88]   value: 0
> [88]   type: eTADSmsExchSchemaVersion
> [88]   value: 15333
> [88]   type: eTADSCamCaftTimeout
> [88]   value: 0000001800
> [88]   type: eTADSMaxConnectionsInPool
> [88]   value: 0000000101
> [88]   type: eTADSPortNum
> [88]   value: 389
> [88]   type: eTADSDCDomain
> [88]   value: DC=exchange,DC=lab
> [88]   type: eTADSServerName
> [88]   value: 192.168.242.156
> [88]   type: eTADSDirectoryName
> [88]   value: ads2016
> [88]   type: eTAccountDeletable
> [88]   value: 1
> [88] controls:
> [88]   controlType: 2.16.840.1.113730.3.4.2
> [88]   non-critical

We can now monitor all traffic and assist with troubleshooting any CCS/MS-ADS challenges.

This same MITM methodology/process may also be used for the IMPS (TCP 20389/2039) and the JCS (TCP 20410/20411) services. We have used this process to capture the IME (JIAM) LDAP traffic to the IMPS Service, to isolate multiple queries for Child Provisioning Roles. Which has been used by the product team to enhance the solution to lower startup durations of the IME in the latest releases.

Binds/queries/add/modification all work with this approach, but we do see an issue with OID for IMPS ADS endpoint “explore process” on ADS OU object. We are reviewing how to address this last challenge that states “critical extension is unavailable” for a LDAP control property of the OU object. The OIDs captured appear to be related to SunOne/Iplanet.

Disaster Recovery Scenarios for Directories

Restore processes may be done with snapshots-in-time for both databases and directories. We wished to provide clarity of the restoration steps after a snapshot-in-time is utilized for a directory. The methodology outlined below has the following goals: a) allow sites to prepare before they need the restoration steps, b) provide a training module to exercise samples included in a vendor solution.

In this scenario, we focused on the CA/Broadcom/Symantec Directory solution. The CA Directory provides several tools to automate online backup snapshots, but these processes stop at copies of the binary data files.

Additionally, we desired to walk-through the provided DAR (Disaster and Recovery) scenarios and determine what needed to be updated to reflect newer features; and how we may validate that we did accomplish a full restoration.

Finally, to assist with the decision tree model, where we need to triage and determine if a full restore is required, or may we select partial restoration via extracts and imports of selected data.

Cluster Out-of-Sync Scenario

Awareness

The first indicator that a userstore (CA Directory DATA DSA) is out-of-sync will be the CA Directory logs themselves, e.g. alarm or trace logs.

Another indication will be inconsistent query results for a user object that returns different results when using a front-end router to the DATA DSAs.

After awareness of the issue, the team will exercise a triage process to determine the extent of the out-of-sync data. For a quick check, one may execute LDAP queries direct to the TCP port of each DATA DSA on each host, and examine the results directory or even the total number of entries, e.g. dxTotalEntryCount.

The returned count value will help determine if the number of entries for each DATA DSA on the peer MW hosts are out-of-sync for ADD or DEL operations. The challenge/GAP with this method is it will not show any delta due to modify operations on the user objects themselves, e.g. address field changed.

Example of LDAP queries (dxsearch/ldapsearch) to CA Directory DATA DSA for the CA Identity Management solution (4 DATA DSA and 1 ROUTER DSA)

su - dsa    OR [ sudo -iu dsa ]
echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd

# NOTIFY BRANCH (TCP 20404) 
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20404 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=notify,dc=etadb' '(objectClass=*)' dxTotalEntryCount
dn: dc=notify,dc=etadb

# INC BRANCH (TCP 20398)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20398 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTInclusionContainerName=Inclusions,eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# CO BRANCH (TCP 20396)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20396 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# MAIN BRANCH (TCP 20394)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20394 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# ALL BRANCHES - Router Port (TCP 20391)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20391 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=etadb' '(objectClass=*)' dxTotalEntryCount

# Scroll to see entire line 

A better process to identify the delta(s) will be automating the daily backup process, to build out LDIF files for each peer MW DATA DSA and then performing a delta process between the LDIF files. We will walk through this more involve step later in this blog entry.

Recovery Processes

The below link has examples from CA/Broadcom/Symantec with recovery notes of CA Directory DATA DSA that are out-of-sync due to extended downtime or outage window.

The below image pulled from the document (page 9.) shows CA Directory r12.x using the latest recovery processes of “multiwrite-DISP” (MW-DISP) mode.

This recovery process of MW-DISP is default for the CA Identity Management DATA DSAs via the install wizard tools, when they create the IMPD DATA DSAs.

https://knowledge.broadcom.com/external/article?articleId=54088

The above document is dated, and still mentions additional file structures that have been retired, e.g. oc/zoc, at,zat.

An enhancement request has been submitted for both of these requests:

https://community.broadcom.com/participate/ideation-home/viewidea?IdeationKey=c71a304b-a689-4894-ac1c-786c9a2b2d0d

The modified version we have started for CA Directory r14.x adds some clarity to the <dsaname>.dx files; and which steps may be adjusted to support the split data structure for the four (4) IMPD DATA DSAs.

The same time flow diagram was used. Extra notes were added for clarity, and if possible, examples of commands that will be used to assist with direct automation of each step (or maybe pasted in an SSH session window, as the dsa service ID).

Step 1, implicit in the identification/triage process, is to determine what userstore data is out-of-sync and how large a delta do we have. If the DSA service has been shut down (either deliberately or via a startup issue), if the shutdown delay is more than a few days, then the CA Directory process will check the date stamp in the <dsaname>.dp file and the transaction in the <dsaname>.tx file; if the dates are too large CA Directory will refuse to start the DATA DSA and issue a warning message.

Step 2, we will leverage the dxdisp <dsaname> command to generate a new time-stamp file <dsaname>.dx, that will be used to prevent unnecessary sync operations with any data older than the date stamp in this file. 

This command should be issued for every DATA DSA on the same host—Especially true for split DATA DSAs, e.g. IMPD (CA Identity Manager’s Provisioning Directories). In our example below, to assist with this step, we use a combination of commands with a while-loop to issue the dxdisp command.

This command can be executed regardless if the DSA is running or shutdown. If an existing <dsaname>.dx file exists, any additional execution of dxdisp will add updated time-stamps to this file.  

Note: The <dsaname>.dx file will be removed upon restart of the DATA DSA.

STEP 2: ISSUE DXDISP COMMAND [ Create time-stamp file for re-sync use ] ON ALL IMPD SERVERS.

su - dsa OR [ sudo -iu dsa ]
bash
dxserver status | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxdisp "$LINE" ;done ; echo ; find $DXHOME -name "*.dx" -exec ls -larth {} \;

# Scroll to see entire line 

Step 3 will then ask for an updated online backup to be executed. 

In earlier release of CA Directory, this required a telnet/ssh connection to the dxconsole of each DATA DSA. Or using the DSA configuration files to contain a dump dxgrid-db; command that would be executed with dxserver init all command. 

In newer releases of CA Directory, we can leverage the dxserver onlinebackup <dsaname> process. 

This step can be a challenge to dump all DATA DSAs at the same time, using manual procedures. 

Fortunately, we can automate this with a single bash shell process; and as an enhancement, we can also generate the LDIF extracts of each DATA DSA for later delta compare operations.

Note: The DATA DSA must be running (started) for the onlinebackup process to function correctly. If unsure, issue a dxserver status or dxserver start all prior. 

Retain the LDIF files from the “BAD” DATA DSA Servers for analysis.

STEP 3a-3c: ON ALL IMPD DATA DSA SERVERS - ISSUE ONLINE BACKUP PROCESS
su - dsa OR [ sudo -iu dsa ]
bash

dxserver status | grep started | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxserver onlinebackup "$LINE" ; sleep 10; dxdumpdb -w -z -f /tmp/`date '+%Y%m%d_%H%M%S_%s'`_$LINE.ldif $LINE ;done ; echo ; find $DXHOME -name "*.zdb" -exec ls -larth {} \; ; echo ; ls -larth --time-style=full-iso /tmp/*.ldif | grep  `date '+%Y-%m-%d'`

# Scroll to see entire line 

Step 4a Walks through the possible copy operations from “GOOD” to the “BAD” DATA DSA host, for the <dsaname>.zdb files. The IMPD DATA DSA will require that three (3) of four (4) zdb files are copied, to ensure no impact to referential integrity between the DATA DSA.

The preferred model to copy data from one remote host to another is via the compressed rsync process over SSH, as this is a rapid process for the CA Directory db / zdb files.

https://anapartner.com/2020/05/03/wan-latency-rsync-versus-scp/

Below are the code blocks that demonstrate examples how to copy data from one DSA server to another DSA server.

# RSYNC METHOD
sudo -iu dsa

time rsync --progress -e 'ssh -ax' -avz --exclude "User*" --exclude "*.dp" --exclude "*.tx" dsa@192.168.242.135:./data/ $DXHOME/data

# Scroll to see entire line 
# SCP METHOD   
sudo -iu dsa

scp   REMOTE_ID@$HOST:./data/<folder_impd_data_dsa_name>/*.zdb   /tmp/dsa_data
/usr/bin/mv  /tmp/dsa_data/<incorrect_dsaname>.zdb   $DXHOME/data/<folder_impd_data_dsa_name>/<correct_dsaname>.db

# Scroll to see entire line 

Step 4b Walk through the final steps before restarting the “BAD” DATA DSA.

The ONLY files that should be in the data folders are <dsaname>.db (binary data file) and <dsaname>.dx (ASCII time-stamp file). Ensure that the copied <prior-hostname-dsaname>.zdb file has been renamed to the correct hostname & extension for <dsaname>.db

Remove the prior <dsaname>.dp (ASCII time-stamp file) { the DATA DSA will auto replace this file with the *.dx file contents } and the <dsaname>.tx (binary data transaction file).

Step 5a Startup the DATA DSA with the command

dxserver start all

If there is any issue with a DATA or ROUTER DSA not starting, then issue the same command with the debug switch (-d)

dxserver -d start <dsaname>

Use the output from the above debug process to address any a) syntax challenges, or b) older PID/LCK files ($DXHOME/pid)

Step 5b Finally, use dxsearch/ldapsearch to query a unit-test of authentication with the primary service ID. Use other unit/use-case tests as needed to confirm data is now synced.

bash
echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd

LDAPTLS_REQCERT=never dxsearch -LLL -H ldaps://`hostname`:20394 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s base -b 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' '(objectClass=*)' | perl -p00e 's/\r?\n //g'

# Scroll to see entire line 

LDIF Recovery Processes

The steps above are for recovery via a 100% replacement method, where the assumption is that the “bad” DSA server does NOT have any data worth keeping or wish to be reviewed.

We wish to clarify a process/methodology, where the “peer” Multi-write DSA may be out-of-sync. Still, we are not sure “which” is truly the “good DSA” to select, or perhaps we wished to merge data from multiple DSA before we declare one to be the “good DSA” (with regards to the completeness of data).

Using CA Directory commands, we can join them together to automate snapshots and exports to LDIF files. These LDIF files can then be compared against their peers MW DATA DSA exports or even to themselves at different snapshot export times. As long as we have the LDIF exports, we can recover from any DAR scenario.

Example of using CA Directory dxserver and dxdumpdb commands (STEP 3) with the ldifdelta and dxmodify commands.

The output from ldifdelta may be imported to any remote peer MW DATA DSA server to sync via dxmodify to that hostname, to force a sync for the few objects that may be out-of-sync, e.g. Password Hashes or other.

dxserver status | grep started | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxserver onlinebackup "$LINE" ; sleep 10; dxdumpdb -z -f /tmp/`date '+%Y%m%d_%H%M%S_%s'`_$LINE.ldif $LINE ;done ; echo ; find $DXHOME -name "*.zdb" -exec ls -larth {} \; ; echo ; ls -larth --time-style=full-iso /tmp/*.ldif | grep  `date '+%Y-%m-%d'`

ldifdelta -x -S ca-prov-srv-01-impd-co  /tmp/20200819_122820_1597858100_ca-prov-srv-01-impd-co.ldif   /tmp/20200819_123108_1597858268_ca-prov-srv-01-impd-co.ldif  |  perl -p00e 's/\r?\n //g'  >   /tmp/delta_file_ca-prov-srv-01-impd-co.ldif   ; cat /tmp/delta_file_ca-prov-srv-01-impd-co.ldif

echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd
dxmodify -v -c -h`hostname` -p 20391  -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -f /tmp/delta_file_ca-prov-srv-01-impd-co.ldif

# Scroll to see entire line 

The below images demonstrate a delta that exists between two (2) time snapshots. The CA Directory tool, ldifdelta, can identify and extract the modified entry to the user object.

The following examples will show how to re-import this delta using dxmodify command to the DATA DSA with no other modifications required to the input LDIF file.

In the testing example below, before any update to an object, let’s capture a snapshot-in-time and the LDIF files for each DATA DSA.

Lets make an update to a user object using any tool we wish, or command line process like ldapmodify.

Next, lets capture a new snapshot-in-time after the update, so we will be able to utilize the ldifdelta tool.

We can use the ldifdelta tool to create the delta LDIF input file. After we review this file, and accept the changes, we can then submit this LDIF file to the remote peer MW DATA DSA that are out-of-sync.

Hope this has value to you and any challenges you may have with your environment.

Avoid locking a userID in a Virtual Appliance

The below post describes enabling the .ssh private key/public key process for the provided service IDs to avoid dependency on a password that may be forgotten, and also how to leverage the service IDs to address potential CA Directory data sync challenges that may occur when there are WAN network latency challenges between remote cluster nodes.

Background:

The CA/Broadcom/Symantec Identity Suite (IGA) solution provides for a software virtual appliance. This software appliance is available on Amazon AWS as a pre-built AMI image that allows for rapid deployment.

The software appliance is also offered as an OVA file for Vmware ESXi/Workstation deployment.

Challenge:

If the primary service ID is locked or password is allowed to expire, then the administrator will likely have only two (2) options:

1) Request assistance from the Vendor (for a supported process to reset the service ID – likely with a 2nd service ID “recoverip”)

2) Boot from an ISO image (if allowed) to mount the vApp as a data drive and update the primary service ID.

Proposal:

Add a standardized SSH RSA private/pubic key to the primary service ID, if it does not exist. If it exists, validate able to authentication and copy files between cluster nodes with the existing .SSH files. Rotate these files per internal security policies, e.g. 1/year.

The focus for this entry is on the CA ‘config’ and ‘ec2-user’ service IDs.

An enhancement request has been added, to have the ‘dsa’ userID added to the file’/etc/ssh/ssh_allowed_users’ to allow for the same .ssh RSA process to address challenges during deployments where the CA Directory Data DSA did not fully copy from one node to another node.

https://community.broadcom.com/participate/ideation-home/viewidea?IdeationKey=7c795c51-d028-4db8-adb1-c9df2dc48bff

AWS vApp: ‘ec2-user’

The primary service ID for remote SSH access is ‘ec2-user’ for the Amazon AWS is already deployed with a .ssh RSA private/public key. This is a requirement for AWS deployments and has been enabled to use this process.

This feature allows for access to be via the private key from a remote SSH session using Putty/MobaXterm or similar tools. Another feature may be leveraged by updating the ‘ec2-user’ .ssh folder to allow for other nodes to be exposed with this service ID, to assist with the deployment of patch files.

As an example, enabling .ssh service between multiple cluster nodes will reduce scp process from remote workstations. Prior, if there were five (5) vApp nodes, to patch them would require uploading the patch direct to each of the five (5) nodes. With enabling .ssh service between all nodes for the ‘ec2-user’ service ID, we only need to upload patches to one (1) node, then use a scp process to push these patch file(s) from one node to another cluster node.

On-Prem vApp: ‘config’

We wish to emulate this process for on-prem vApp servers to reduce I/O for any files to be uploaded and/or shared.

This process has strong value when CA Directory *.db files are out-of-sync or during initial deployment, there may be network issues and/or WAN latency.

Below is an example to create and/or rotate the private/public SSH RSA files for the ‘config’ service ID.

An example to create and/or rotate the private/public SSH RSA files for the ‘config’ service ID.

Below is an example to push the newly created SSH RSA files to the remote host(s) of the vApp cluster. After this step, we can now use scp processes to assist with remediation efforts within scripts without a password stored as clear text.

Copy the RSA folder to your workstation, to add to your Putty/MobaXterm or similar SSH tool, to allow remote authentication using the public key.

If you have any issues, use the embedded verbose logging within the ssh client tool (-vv) to identify the root issue.

ssh -vv userid@remote_hostname

Example:

config@vapp0001 VAPP-14.1.0 (192.168.242.146):~ > eval `ssh-agent` && ssh-add
Agent pid 5717
Enter passphrase for /home/config/.ssh/id_rsa:
Identity added: /home/config/.ssh/id_rsa (/home/config/.ssh/id_rsa)
config@vapp0001 VAPP-14.1.0 (192.168.242.146):~ >
config@vapp0001 VAPP-14.1.0 (192.168.242.146):~ > ssh -vv config@192.168.242.128
OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug2: ssh_connect: needpriv 0
debug1: Connecting to 192.168.242.128 [192.168.242.128] port 22.
debug1: Connection established.
debug1: identity file /home/config/.ssh/identity type -1
debug1: identity file /home/config/.ssh/identity-cert type -1
debug2: key_type_from_name: unknown key type '-----BEGIN'
debug2: key_type_from_name: unknown key type 'Proc-Type:'
debug2: key_type_from_name: unknown key type 'DEK-Info:'
debug2: key_type_from_name: unknown key type '-----END'
debug1: identity file /home/config/.ssh/id_rsa type 1
debug1: identity file /home/config/.ssh/id_rsa-cert type -1
debug1: identity file /home/config/.ssh/id_dsa type -1
debug1: identity file /home/config/.ssh/id_dsa-cert type -1
debug1: identity file /home/config/.ssh/id_ecdsa type -1
debug1: identity file /home/config/.ssh/id_ecdsa-cert type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3
debug1: match: OpenSSH_5.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.3
debug2: fd 3 setting O_NONBLOCK
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa-cert-v01@openssh.com,ssh-dss-cert-v01@openssh.com,ssh-rsa-cert-v00@openssh.com,ssh-dss-cert-v00@openssh.com,ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96
debug2: kex_parse_kexinit: hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-sha1,hmac-sha2-256,hmac-sha2-512
debug2: kex_parse_kexinit: hmac-sha1,hmac-sha2-256,hmac-sha2-512
debug2: kex_parse_kexinit: none
debug2: kex_parse_kexinit: none
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_setup: found hmac-sha1
debug1: kex: server->client aes128-ctr hmac-sha1 none
debug2: mac_setup: found hmac-sha1
debug1: kex: client->server aes128-ctr hmac-sha1 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<2048<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug2: dh_gen_key: priv key bits set: 141/320
debug2: bits set: 1027/2048
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host '192.168.242.128' is known and matches the RSA host key.
debug1: Found key in /home/config/.ssh/known_hosts:2
debug2: bits set: 991/2048
debug1: ssh_rsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /home/config/.ssh/id_rsa (0x5648110d2a00)
debug2: key: /home/config/.ssh/identity ((nil))
debug2: key: /home/config/.ssh/id_dsa ((nil))
debug2: key: /home/config/.ssh/id_ecdsa ((nil))
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug2: we did not send a packet, disable method
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure.  Minor code may provide more information
Improper format of Kerberos configuration file

debug1: Unspecified GSS failure.  Minor code may provide more information
Improper format of Kerberos configuration file

debug2: we did not send a packet, disable method
debug1: Next authentication method: publickey
debug1: Offering public key: /home/config/.ssh/id_rsa
debug2: we sent a publickey packet, wait for reply
debug1: Server accepts key: pkalg ssh-rsa blen 533
debug2: input_userauth_pk_ok: SHA1 fp 39:06:95:0d:13:4b:9a:29:0b:28:b6:bd:3d:b0:03:e8:3c:ad:50:6f
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug1: Sending environment.
debug1: Sending env LANG = en_US.UTF-8
debug2: channel 0: request env confirm 0
debug2: channel 0: request shell confirm 1
debug2: fd 3 setting TCP_NODELAY
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel_input_status_confirm: type 99 id 0
debug2: PTY allocation request accepted on channel 0
debug2: channel 0: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 0
debug2: shell request accepted on channel 0
Last login: Thu Apr 30 20:21:48 2020 from 192.168.242.146

CA Identity Suite Virtual Appliance version 14.3.0 - SANDBOX mode
FIPS enabled:                   true
Server IP addresses:            192.168.242.128
Enabled services:
Identity Portal               192.168.242.128 [OK] WildFly (Portal) is running (pid 10570), port 8081
                                              [OK] Identity Portal Admin UI is available
                                              [OK] Identity Portal User Console is available
                                              [OK] Java heap size used by Identity Portal: 810MB/1512MB (53%)
Oracle Database Express 11g   192.168.242.128 [OK] Oracle Express Edition started
Identity Governance           192.168.242.128 [OK] WildFly (IG) is running (pid 8050), port 8082
                                              [OK] IG is running
                                              [OK] Java heap size used by Identity Governance: 807MB/1512MB (53%)
Identity Manager              192.168.242.128 [OK] WildFly (IDM) is running (pid 5550), port 8080
                                              [OK] IDM environment is started
                                              [OK] idm-userstore-router-caim-srv-01 started
                                              [OK] Java heap size used by Identity Manager: 1649MB/4096MB (40%)
Provisioning Server           192.168.242.128 [OK] im_ps is running
                                              [OK] co file usage: 1MB/250MB (0%)
                                              [OK] inc file usage: 1MB/250MB (0%)
                                              [OK] main file usage: 9MB/250MB (3%)
                                              [OK] notify file usage: 1MB/250MB (0%)
                                              [OK] All DSAs are started
Connector Server              192.168.242.128 [OK] jcs is running
User Store                    192.168.242.128 [OK] STATS: number of objects in cache: 5
                                              [OK] file usage: 1MB/200MB (0%)
                                              [OK] UserStore_userstore-01 started
Central Log Server            192.168.242.128 [OK] rsyslogd (pid  1670) is running...
=== LAST UPDATED: Fri May  1 12:15:05 CDT 2020 ====
*** [WARN] Volume / has 13% Free space (6.2G out of 47G)
config@cluster01 VAPP-14.3.0 (192.168.242.128):~ >

A view into rotating the SSH RSA keys for the CONFIG UserID

# CONFIG - On local vApp host
ls -lart .ssh     [view any prior files]
echo y | ssh-keygen -b 4096 -N Password01 -C $USER -f $HOME/.ssh/id_rsa
IP=192.168.242.135;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
IP=192.168.242.136;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
IP=192.168.242.137;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
cp -r -p .ssh/id_rsa.pub .ssh/authorized_keys
rm -rf /tmp/*.$USER.ssh-keys.tar
tar -cvf /tmp/`/bin/date -u +%s`.$USER.ssh-keys.tar .ssh
ls -lart /tmp/*.$USER.ssh-keys.tar
eval `ssh-agent` && ssh-add           [Enter Password for SSH RSA Private Key]
IP=192.168.242.136;scp `ls /tmp/*.$USER.ssh-keys.tar`  config@$IP:
IP=192.168.242.137;scp `ls /tmp/*.$USER.ssh-keys.tar`  config@$IP:
USER=config;ssh -tt $USER@192.168.242.136 "tar -xvf *.$USER.ssh-keys.tar"
USER=config;ssh -tt $USER@192.168.242.137 "tar -xvf *.$USER.ssh-keys.tar"
IP=192.168.242.136;ssh $IP `/bin/date -u +%s`
IP=192.168.242.137;ssh $IP `/bin/date -u +%s`
IP=192.168.242.136;ssh -vv $IP              [Use -vv to troubleshoot ssh process]
IP=192.168.242.137;ssh -vv $IP 				[Use -vv to troubleshoot ssh process]

Be safe and automate your backups for CA Directory Data DSAs to LDIF

The CA Directory solution provides a mechanism to automate daily on-line backups, via one simple parameter:

dump dxgrid-db period 0 86400;

Where the first number is the offset from GMT/UTC (in seconds) and the second number is how often to run the backup (in seconds), e.g. Once a day = 86400 sec = 24 hr x 60 min/hr x 60 sec/min

Two Gaps/Challenge(s):

History: The automated backup process will overwrite the existing offline file(s) (*.zdb) for the Data DSA. Any requirement or need to perform a RCA is lost due to this fact. What was the data like 10 days ago? With the current state process, only the CA Directory or IM logs would be of assistance.

Size: The automated backup will create an offline file (*.zdb) footprint of the same size as the data (*.db) file. If your Data DSA (*.db) is 10 GB, then your offline (*.zdb) will be 10 GB. The Identity Provisioning User store has four (4) Data DSAs, that would multiple this number , e.g. four (4) db files + four (4) offline zdb files at 10 GB each, will require minimal of 80 GB disk space free. If we attempt to retain a history of these files for fourteen (14) days, this would be four (4) db + fourteen (14) zdb = eighteen (18) x 10 GB = 180 GB disk space required.

Resolutions:

Leverage the CA Directory tool (dxdumpdb) to convert from the binary data (*.db/*.zdb) to LDIF and the OS crontab for the ‘dsa’ account to automate a post ‘online backup’ export and conversion process.

Step 1: Validate the ‘dsa’ user ID has access to crontab (to avoid using root for this effort). cat /etc/cron.allow

If access is missing, append the ‘dsa’ user ID to this file.

Step 2: Validate that online backup process have been scheduled for your Data DSA. Use a find command to identify the offline files (*.zdb ). Note the size of the offline Data DSA files (*.zdb).

Step 3: Identify the online backup process start time, as defined in the Data DSA settings DXC file or perhaps DXI file. Convert this GMT offset time to the local time on the CA Directory server. (See references to assist)

Step 4: Use crontab -e as ‘dsa’ user ID, to create a new entry: (may use crontab -l to view any entries). Use the dxdumpdb -z switch with the DSA_NAME to create the exported LDIF file. Redirect this output to gzip to automatically bypass any need for temporary files. Note: Crontab has limited variable expansion, and any % characters must be escaped.

Example of the crontab for ‘dsa’ to run 30 minutes after (at 2 am CST) the online backup process is scheduled (at 1:30 am CST).

# Goal:  Export and compress the daily DSA offline backup to ldif.gz at 2 AM every day
# - Ensure this crontab runs AFTER the daily automated backup (zdb) of the CA Directory Data DSAs
# - Review these two (2) tokens for DATA DSAs:  ($DXHOME/config/settings/impd.dxc  or ./impd_backup.dxc)
#   a)   Location:  set dxgrid-backup-location = "/opt/CA/Directory/dxserver/backup/";
#   b)   Online Backup Period:   dump dxgrid-db period 0 86400;
#
# Note1: The 'N' start time of the 'dump dxgrid-db period N M' is the offset in seconds from midnight of UTC
#   For 24 hr clock, 0130 (AM) CST calculate the following in UTC/GMT =>  0130 CST + 6 hours = 0730 UTC
#   Due to the six (6) hour difference between CST and UTC TZ:  7.5 * 3600 = 27000 seconds
# Example(s):
#   dump dxgrid-db period 19800 86400;   [Once a day at 2330 CST]
#   dump dxgrid-db period 27000 86400;   [Once a day at 0130 CST]
#
# Note2:  Alternatively, may force an online backup using this line:
#               dump dxgrid-db;
#        & issuing this command:  dxserver init all
#
#####################################################################
#        1      2         3       4       5        6
#       min     hr      d-o-m   month   d-o-w   command(s)
#####################################################################
#####
#####  Testing Backup Every Five (5) Minutes ####
#*/5 * * * *  . $HOME/.profile && dxdumpdb -z `dxserver status | grep "impd-main" | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-main" | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
#####
#####  Backup daily at 2 AM CST  -  30 minutes after the online backup at 1:30 AM CST #####
#####
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-main"   | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-main"   | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-co"     | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-co"     | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-inc"    | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-inc"    | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-notify" | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-notify" | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz

Example of the above lines that can be placed in a bash shell, instead of called directly via crontab. Note: Able to use variables and no need to escape the `date % characters `

# set DSA=main &&   dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=co &&     dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=inc &&    dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=notify && dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
#

Example of the output:

Monitor with tail -f /var/log/cron (or syslog depending on your OS version), when the crontab is executed for your ‘dsa’ account

View the output folder for the newly created gzip LDIF files. The files may be extracted back to LDIF format, via gzip -d file.ldif.gz. Compare these file sizes with the original (*.zdb) files of 2GB.

Recommendation(s):

Implement a similar process and retain this data for fourteen (14) days, to assist with any RCA or similar analysis that may be needed for historical data. Avoid copied the (*.db or *.zdb) files for backup, unless using this process to force a clean sync between peer MW Data DSAs.

The Data DSAs may be reloaded (dxloadb) from these LDIF snapshots; the LDIF files do not have the same file size impact as the binary db files; and as LDIF files, they may be quickly search for prior data using standard tools such as grep “text string” filename.ldif.

This process will assist in site preparation for a DAR (disaster and recovery) scenario. Protect your data.

References:

dxdumpdb

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-identity-and-access-management/directory/14-1/administrating/tools-to-manage-ca-directory/dxtools/dxdumpdb-tool-export-data-from-a-datastore-to-an-ldif-file.html

dump dxgrid-db

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-identity-and-access-management/directory/14-1/reference/commands-reference/dump-dxgrid-db-command-take-a-consistent-snapshot-copy-of-a-datastore.html

If you wish to learn more or need assistance, contact us.