Global Password Reset

The recent DNS challenges for a large organization that impacted their worldwide customers bring to mind a project we completed this year, a global password reset redundancy solution.

We worked with a client who desired to manage unplanned WAN outages to their five (5) data centers for three (3) independent MS Active Directory Domains with integration to various on-prem applications/ endpoints. The business requirement was for self-service password sync, where the users’ password change process is initialed/managed by the two (2) different MS Active Directory Password Policies.

Without the WAN outage requirement, any IAM/IAG solution may manage this request within a single data center. A reverse password sync agent process is enabled on all writable MS Active Directory domain controllers (DC). All the world-wide MS ADS domain controllers would communicate to the single data center to validate and resend this password change to all of the users’ managed endpoint/application accounts, e.g. SAP, Mainframe (ACF2/RACF/TSS), AS/400, Unix, SaaS, Database, LDAP, Certs, etc.

With the WAN outage requirement, however, a queue or components must be deployed/enabled at each global data center, so that password changes are allowed to sync locally to avoid work-stoppage and async-queued to avoid out-of-sync password to the other endpoint/applications that may be in other data centers.

We were able to work with the client to determine that their current IAM/IAG solution would have the means to meet this requirement, but we wished to confirm no issues with WAN latency and the async process. The WAN latency was measured at less than 300 msec between remote data centers that were opposite globally. The WAN latency measured is the global distance and any intermediate devices that the network traffic may pass through.

To review the solution’s ability to meet the latency issues, we introduced a test environment to emulate the global latency for deployment use-cases, change password use-cases, and standard CrUD use-cases. There is a feature within VMWare Workstation, that allows emulation of degraded network traffic. This process was a very useful planning/validation tool to lower rollback risk during production deployment.

VMWare Workstation Network Adapter Advance Settings for WAN latency emulation

The solution used for the Global Password Rest solution was Symantec Identity Suite Virtual Appliance r14.3cp2. This solution has many tiers, where select components may be globally deployed and others may not.

We avoided any changes to the J2EE tier (Wildfly) or Database for our architecture as these components are not supported for WAN latency by the Vendor. Note: We have worked with other clients that have deployment at two (2) remote data centers within 1000 km, that have reported minimal challenges for these tiers.

We focused our efforts on the Provisioning Tier and Connector Tier. The Provisioning Tier consists of the Provisioning Server and Provisioning Directory.

The Provisioning Server has no shared knowledge with other Provisioning Servers. The Provisioning Directory (Symantec Directory) is where the provisioning data may be set up in a multi-write peer model. Symantec Directory is a proper X.500 directory with high redundancy and is designed to manage WAN latency between remote data centers and recovery after an outage. See example provided below.

https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-replication.html

The Connector Tier consists of the Java Connector Server and C++ Connector Server, which may be deployed on MS Windows as an independent component. There is no shared knowledge between Connector Servers, which works in our favor.

Requirement:

Three (3) independent MS Active Directory domain in five (5) remote data centers need to allow self-service password change & allow local password sync during a WAN outage. Passwords changes are driven by MS ADS Password Policies (every N days). The IME Password Policy for IAG/IAM solution is not enabled, IME authentication is redirected to an ADS domain, and the IMPS IM Callback Feature is disabled.

Below is an image that outlines the topology for five (5) global data centers in AMER, EMEA, and APAC.

The flow diagram below captures the password change use-case (self-service or delegated), the expected data flow to the user’s managed endpoints/applications, and the eventual peer sync of the MS Active Directory domain local to the user.

Observation(s):

The standalone solution of Symantec IAG/IAM has no expected challenges with configurations, but the Virtual Appliance offers pre-canned configurations that may impact a WAN deployment.

During this project, we identified three (3) challenges using the virtual appliance.

Two (2) items needed the assistance of the Broadcom Support and Engineering teams. They were able to work with us to address deployment configuration challenges with the “check_cluster_clock_sync -v ” process that incorrectly increments time delays between servers instead of resetting a value of zero between testing between servers.

Why this is important? The “check_cluster_clock_sync” alias is used during auto-deployment of vApp nodes. If the time reported between servers is > 15 seconds then replication may fail. This time check issue was addressed with a hotfix. After the hot-fix was deployed, all clock differences were resolved.

The second challenge was a deployment challenge of the IMPS component for its embedded “registry files/folders”. The prior embedded copy process was observed to be using standard “scp”. With a WAN latency, the scp copy operation may take more than 30 seconds. Our testing with the Virtual Appliance showed that a simple copy would take over two (2) minutes for multiple small files. After reviewing with CA support/engineering, they provided an updated copy process using “rsync” that speeds up copy performance by >100x. Before this update, the impact was provisioning tier deployment would fail and partial rollback would occur.

The last challenge we identified was using the Symantec Directory’s embedded features to manage WAN latency via multi-write HUB groups. The Virtual Appliance cannot automatically manage this feature when enabled in the knowledge files of the provisioning data DSAs. Symantec Directory will fail to start after auto-deployment.

Fortunately, on the Virtual appliance, we have full access to the ‘dsa’ service ID and can modify these knowledge files before/after deployment. Suppose we wish to roll back or add a new Provisioning Server Virtual Appliance. In that case, we must disable the multi-write HUB group configuration temporarily, e.g. comment out the configuration parameter and re-init the DATA DSAs.

Six (6) Steps for Global Password Reset Solution Deployment

We were able to refine our list of steps for deployment using pre-built knowledge files and deployment of the vApp nodes in blank slates with the base components of Provisioning Server (PS) and Provisioning Directory) with a remote MS Windows server for the Connector Server (JCS/CCS).

Step 1: Update Symantec Directory DATA DSA’s knowledge configuration files to use the multiple group HUB model. Note that multi-write group configuration is enabled within the DATA DSA’s *.dxc files. One Directory servers in each data center will be defined as a “HUB”.

Ref: https://techdocs.broadcom.com/us/en/symantec-security-software/identity-security/directory/14-1/ca-directory-concepts/directory-replication/multiwrite-mw-groups-hubs/topology-sample-and-disaster-recovery.html

To assist this configuration effort, we leveraged a serials of bash shell scripts that could be pasted into multiple putty/ssh sessions on each vApp to replace the “HUB” string with a “sed” command.

After the HUB model is enabled (stop/start the DATA DSAs), confirm that delayed WAN latency has no challenge with Symantec Directory sync processes. By monitoring the Symantec Directory logs during replication, we can see that sync operation with the WAN latency is captured with the delay > 1 msecs between data centers AMER1 and APAC1.

Step 2: Update IMPS configurations to avoid delays with Global Password Reset solution.

Note for this architecture, we do not use external IME Password Policies. We ensure that each AD endpoint has the checkbox enabled for “Password synchronization agent is installed” & each Global User (GU) has “Enable Password Synchronization Agent” checkbox enabled to prevent data looping. To ensure this GU attribute is always enabled, we updated an attribute under “Create Users Default Attributes”.

Step 3a: Update the Connector Tier (CCS Component)

Ensure that the MS Windows Environmental variables for the CCS connector are defined for Failover (ADS_FAILOVER) and Retry (ADS_RETRY).

Step 3b: Update the CCS DNS knowledge file of ADS DCs hostnames.

Important Note: Avoid using the refresh feature “Refresh DC List” within the IMPS GUI for the ADS Endpoint. If this feature is used, then a “merge” will be processed from the local CCS DNS file contents and what is defined within the IMPS GUI refresh process. If we wish to manage the redirection to local MS ADS Domain Controllers, we need to control this behavior. If this step is done, we can clean out the Symantec Directory of extra entries. The only negative aspect is the local password change may attempt to communicate to one of the remote MS ADS Domain Controllers that are not within the local data center. During a WAN outage, a user would notice a delay during the password change event while the CCS connector timed out the connection until it connected to the local MS ADS DC.

Step 3c: CCS ADS Failover

If using SSL over TCP 636 confirm the ADS Domain Root Certificate is deployed to the MS Windows Server where the CCS service is deployed. If using SASL over TCP 389 (if available), then no additional effort is required.

If using SSL over TCP 636, use the MS tool certlm.msc to export the public root CA Certificate for this ADS Domain. Export to base64 format for import to the MS Windows host (if not already part of the ADS Domain) with the same MS tool certlm.msc.

Step 4a: Update the Connector Tier for the JCS component.

Add the stabilization parameter “maxWait” to the JCS/CCS configuration file. Recommend 10-30 seconds.

Step 4b: Update JCS registration to the IMPS Tier

You may use the Virtual Appliance Console, but this has a delay when pulling the list of any JCS connector that may be down at this time of the check/submission. If we use the Connector Xpress UI, we can accomplish the same process much faster with additional flexibility for routing rules to the exact MS ADS Endpoints in the local data center.

Step 4c: Observe the IMPS routing to JCS via etatrans log during any transaction.

If any JCS service is unavailable (TCP 20411), then the routing rules process will report a value of 999.00, instead of a low value of 0.00-1.00.

Step 5: Update the Remote Password Change Agent (DLL) on MS ADS Domain Controllers (writable)

Step 6a: Validation of Self-Service Password Change to selected MS ADS Domain Controller.

Using various MS Active Directory processes, we can emulate a delegated or self-service password change early during the configuration cycle, to confirm deployment is correct. The below example uses MS Powershell to select a writable MS ADS Domain Controller to update a user’s password. We can then monitor the logs at all tiers for completion of this password change event.

A view of the password change event from the Reverse Password Sync Agent log file on the exact MS Domain Controller.

Step 6b: Validation of password change event via CCS ADS Log.

Step 6c: Validation of password change event via IMPS etatrans log

Note: Below screenshot showcases alias/function to assist with monitoring the etatrans logs on the Virtual Appliance.

Below screen shot showcases using ldapsearch to check timestamps for before/after of password change event within MS Active Directory Domain.

We hope these notes are of some value to your business and projects.

Appendix

Using the MS Windows Server for CCS Server 

Get current status of AD account on select DC server before Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with current password.)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password05" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Change AD account's password via Powershell:
PowerShell Example:

Set-ADAccountPassword -Identity "idmpwtest" -Reset -NewPassword (ConvertTo-SecureString -AsPlainText "Password06" -Force) -Server dc2016.exchange.lab

Get current status of AD account on select DC server after Password Change:

PowerShell Example:

get-aduser -Server dc2012.exchange2020.lab   "idmpwtest"  -properties passwordlastset, passwordneverexpires | ft name, passwordlastset

LdapSearch Example:  (using ldapsearch.exe from CCS bin folder - as the user with NEW password)

C:\> & "C:\Program Files (x86)\CA\Identity Manager\Connector Server\ccs\bin\ldapsearch.exe" -LLL -h dc2012.exchange2012.lab -p 389 -D "cn=idmpwtest,cn=Users,DC=exchange2012,DC=lab" -w "Password06" -b "CN=idmpwtest,CN=Users,DC=exchange2012,DC=lab" -s base pwdLastSet

Using the Provisioning Server for password change event

Get current status of AD account on select DC server before Password Change:
LDAPSearch Example:   (From IMPS server - as user with current password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab'  -w  Password05   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged


Change AD account's password via ldapmodify & base64 conversion process:
LDAPModify Example:

BASE64PWD=`echo -n '"Password06"' | iconv -f utf8 -t utf16le | base64 -w 0`
ADSHOST='192.168.242.154'
ADSUSERDN='CN=Administrator,CN=Users,DC=exchange2012,DC=lab'
ADSPWD='Password01!’

ldapmodify -v -a -H ldaps://$ADSHOST:636 -D "$ADSUSERDN" -w "$ADSPWD" << EOF
dn: CN=idmpwtest,OU=People,dc=exchange2012,dc=lab 
changetype: modify
replace: unicodePwd
unicodePwd::$BASE64PWD
EOF

Get current status of AD account on select DC server after Password Change:
LDAPSearch Example:   (From IMPS server - with user's account and new password)

LDAPTLS_REQCERT=never  ldapsearch -LLL -H ldaps://192.168.242.154:636 -D 'CN=idmpwtest,OU=People,dc=exchange2012,dc=lab' -w  Password06   -b "CN=idmpwtest,OU=People,dc=exchange2012,dc=lab" -s sub dn pwdLastSet whenChanged

Restart remote IMPD DATA DSAs after long outage

“DSA is attempting to start after a long outage, perform a recovery procedure before starting”

Challenge:   The IMPD (Identity Manager Provisioning Directory) Data DSAs have been offline for a while, e.g. 7 days+ (> 1 week), and the Symantec/CA Directory solution will, to protect the data, refuse to allow the DATA DSAs to start unless there is manual intervention to prevent the possibility of production data (Live DATA DSAs) being synced with older data (Offline DATA DSAs).

If we were concern, we would follow best practices and remove the offline DATA DSAs’ *.db & *.dp files, and replace the *.db with current copies of the Live DATA DSAs’ *.db files; generate temporary time files of *.dx and allow the time files of *.dp to rebuild themselves upon startup of the offline DATA DSAs.

Example to recover from an outage: https://anapartner.com/2020/08/21/directory-backup-and-restore-dar-scenarios/

However, if we are NOT concern, or the environment is non-production we can avoid the multiple shells, multiple commands to resync by using a combinations of bash shell commands. The proposal below outlines using the Symantec/CA Identity Suite virtual appliance, where both the IMPD and IMPS (Identity Manager Provisioning Server) components reside on the same servers.

Proposal:   Use a single Linux host to send remote commands as a single user ID; sudo to the ‘dsa’ and ‘imps’ service IDs, and issue commands to address the restart process.

Pre-Work:   For the Identity Suite vApp, recommend that .ssh keys be used to avoid using a password for the ‘config’ user IDs on all vApp nodes.

Example to setup .SSH keys for ‘config’ user ID: https://anapartner.com/2020/05/01/avoid-locking-a-userid-in-a-virtual-appliance/

If using .SSH keys, do not forget to use this shortcut to cache the local session: eval `ssh-agent` && ssh-add

Steps:   Issue the following bash commands with the correct IPs or hostnames.  

If possible, wrap the remote commands in a for-loop. The below example uses the local ‘config’ user ID, to ssh to remote servers, then issues a local su to the ‘dsa’ service ID. The ‘dsa’ commands may need to be wrapped as shown below to allow multiple commands to be executed together. We have a quick hostname check, stop all IMPD DATA DSAs, find the time-stamp file that is preventing the startup of the IMPD DATA DSAs and remove it, restart all IMPD DATA DSA, and then move on to the next server with the for-loop. The ‘imps’ commands are similar with a quick hostname check, status check, stop and start process, another status check, then move on to the next server in the for-loop.

for i in {136..141}; do ssh  -t config@192.168.242.$i "su - dsa -c \"hostname;dxserver stop all;pwd;find ./data/ -type f \( -name '*.dp' \) -delete  ;dxserver start all \" "; done

for i in {136..141}; do ssh  -t config@192.168.242.$i "su - imps -c \"hostname;imps status;imps stop;imps start;imps status \" "; done

View of for-loop commands output:

Additional: Process to assist with decision to sync or not sync.

Check if the number of total entries in each individual IMPD DATA DSA match with their peers (Multi-Write groups). Goal: Avoid any deltas > 1% between peers. The IMPD “main”, “co”, “inc” DATA DSA should be 100% in sync. We may see some minor flux in the “notify” DATA DSA, as this is temporary data used by the IMPS server to store data to be sent to the IME via the IME Call Back Process.

If there are any deltas, then we may export the IMPD DATA DSAs to LDIF files and then use the Symantec/CA Directory ldifdelta process to isolate and triage the deltas.

su - dsa    OR [ sudo -iu dsa ]
export HISTIGNORE=' *'             {USE THIS LINE TO FORCE HISTORY TO IGNORE ANY COMMANDS WITH A LEADING SPACE CHARACTER}
 echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd     {USE SPACE CHARACTER IN FRONT TO AVOID HISTORY USAGE}


# NOTIFY BRANCH (TCP 20404) 

for i in {135..140}; do echo "##########  192.168.242.$i IMPD NOTIFY DATA DSA ##########";LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://192.168.242.$i:20404 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=notify,dc=etadb' '(objectClass=*)' dxTotalEntryCount  |  perl -p00e 's/\r?\n //g' ; done

# INC BRANCH (TCP 20398)

for i in {135..140}; do echo "##########  192.168.242.$i IMPD INC DATA DSA ##########";LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://192.168.242.$i:20398 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTInclusionContainerName=Inclusions,eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount  |  perl -p00e 's/\r?\n //g' ; done

# CO BRANCH (TCP 20396)

for i in {135..140}; do echo "##########  192.168.242.$i IMPD CO DATA DSA ##########";LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://192.168.242.$i:20396 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount  |  perl -p00e 's/\r?\n //g' ; done

# MAIN BRANCH (TCP 20394)

for i in {135..140}; do echo "##########  192.168.242.$i IMPD MAIN DATA DSA ##########";LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://192.168.242.$i:20394 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount  |  perl -p00e 's/\r?\n //g' ; done


NOTIFY DSA is temporary data and will have deltas. This DSA is used for the IME CALL BACK process.

Authenticate to vApp ‘dsa’ user ID via ssh private key

The Symantec (CA) Identity Suite includes the Symantec (CA) Directory. This component is installed under the ‘dsa’ service ID. On the virtual appliance, this ‘dsa’ service ID does not have a password defined, and therefore no login is allowed.

As an enhancement, we would like to add in a SSH private key to allow authentication to the ‘dsa’ service ID from other virtual appliances and desktop usage with various tools, e.g. Putty, MobaXterm, WinSCP, etc. This enhancement will allow for a streamlined process to address out-of-sync Directory DATA DSAs with scp/Rsync copies without intermediate file shares or use of other service IDs.

Challenge:

The virtual appliance of Symantec (CA) Identity Suite r14.3 is built on CentOS 6.4. The OpenSSH services on this OS apparently do not use a private key format that can be used by desktop tools or the PuttyGen (keygen conversion tool). However, the private key may be used between vApp servers if using the FQDN (full qualified domain name). We noted that during testing, that localhost is not allowed due to localhost not defined in the SSHD “AllowedUsers” property file.

On newer virtual appliances vApp r14.4 with CentOS 8 Stream, this challenge does not exist, and we can use the OpenSSH private key, id_rsa, with the desktop tools as-is.

To assist with challenge and streamlining this process we have the following three (2) options:

Option 1: On newer OS, use OpenSSH process

After creating the private key, ./ssh/id_rsa, cat this file out to notepad, and save for use with the desktop tools

Generate this OpenSSH private/public key. The final command will help to validate this private key may be used for server to server communication.

echo y | ssh-keygen -t rsa -b 4096 -N Password02 -C "$USER@$HOSTNAME" -f .ssh/id_rsa ; ls -lart .ssh ; cat .ssh/id_rsa ; cat .ssh/id_rsa.pub >> .ssh/authorized_keys ; chmod 600 .ssh/authorized_keys ; ssh -v -i .ssh/id_rsa $USER@`hostname`

Option 2: Skip the OpenSSH process, use PuttyGen

On any OS (new/old) just use Putty-Gen tool to generate the private key. Update key comment/passphrase. After the private key is created, copy the TEXT “Public Key for pasting into OpenSSH authorized_keys file”. Just like it says, and then you may use the associated private key, id_rsa.ppk, with the desktop tools for the ‘dsa’ service ID.

Option 3: Combination of processes/tools

Important: .ssh/authorized_keys is updated and not overwritten.

Disaster Recovery Scenarios for Directories

Restore processes may be done with snapshots-in-time for both databases and directories. We wished to provide clarity of the restoration steps after a snapshot-in-time is utilized for a directory. The methodology outlined below has the following goals: a) allow sites to prepare before they need the restoration steps, b) provide a training module to exercise samples included in a vendor solution.

In this scenario, we focused on the CA/Broadcom/Symantec Directory solution. The CA Directory provides several tools to automate online backup snapshots, but these processes stop at copies of the binary data files.

Additionally, we desired to walk-through the provided DAR (Disaster and Recovery) scenarios and determine what needed to be updated to reflect newer features; and how we may validate that we did accomplish a full restoration.

Finally, to assist with the decision tree model, where we need to triage and determine if a full restore is required, or may we select partial restoration via extracts and imports of selected data.

Cluster Out-of-Sync Scenario

Awareness

The first indicator that a userstore (CA Directory DATA DSA) is out-of-sync will be the CA Directory logs themselves, e.g. alarm or trace logs.

Another indication will be inconsistent query results for a user object that returns different results when using a front-end router to the DATA DSAs.

After awareness of the issue, the team will exercise a triage process to determine the extent of the out-of-sync data. For a quick check, one may execute LDAP queries direct to the TCP port of each DATA DSA on each host, and examine the results directory or even the total number of entries, e.g. dxTotalEntryCount.

The returned count value will help determine if the number of entries for each DATA DSA on the peer MW hosts are out-of-sync for ADD or DEL operations. The challenge/GAP with this method is it will not show any delta due to modify operations on the user objects themselves, e.g. address field changed.

Example of LDAP queries (dxsearch/ldapsearch) to CA Directory DATA DSA for the CA Identity Management solution (4 DATA DSA and 1 ROUTER DSA)

su - dsa    OR [ sudo -iu dsa ]
echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd

# NOTIFY BRANCH (TCP 20404) 
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20404 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=notify,dc=etadb' '(objectClass=*)' dxTotalEntryCount
dn: dc=notify,dc=etadb

# INC BRANCH (TCP 20398)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20398 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTInclusionContainerName=Inclusions,eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# CO BRANCH (TCP 20396)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20396 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'eTNamespaceName=CommonObjects,dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# MAIN BRANCH (TCP 20394)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20394 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=im,dc=etadb' '(objectClass=*)' dxTotalEntryCount

# ALL BRANCHES - Router Port (TCP 20391)
LDAPTLS_REQCERT=never  dxsearch -LLL -H ldaps://`hostname`:20391 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s sub -b 'dc=etadb' '(objectClass=*)' dxTotalEntryCount

# Scroll to see entire line 

A better process to identify the delta(s) will be automating the daily backup process, to build out LDIF files for each peer MW DATA DSA and then performing a delta process between the LDIF files. We will walk through this more involve step later in this blog entry.

Recovery Processes

The below link has examples from CA/Broadcom/Symantec with recovery notes of CA Directory DATA DSA that are out-of-sync due to extended downtime or outage window.

The below image pulled from the document (page 9.) shows CA Directory r12.x using the latest recovery processes of “multiwrite-DISP” (MW-DISP) mode.

This recovery process of MW-DISP is default for the CA Identity Management DATA DSAs via the install wizard tools, when they create the IMPD DATA DSAs.

https://knowledge.broadcom.com/external/article?articleId=54088

The above document is dated, and still mentions additional file structures that have been retired, e.g. oc/zoc, at,zat.

An enhancement request has been submitted for both of these requests:

https://community.broadcom.com/participate/ideation-home/viewidea?IdeationKey=c71a304b-a689-4894-ac1c-786c9a2b2d0d

The modified version we have started for CA Directory r14.x adds some clarity to the <dsaname>.dx files; and which steps may be adjusted to support the split data structure for the four (4) IMPD DATA DSAs.

The same time flow diagram was used. Extra notes were added for clarity, and if possible, examples of commands that will be used to assist with direct automation of each step (or maybe pasted in an SSH session window, as the dsa service ID).

Step 1, implicit in the identification/triage process, is to determine what userstore data is out-of-sync and how large a delta do we have. If the DSA service has been shut down (either deliberately or via a startup issue), if the shutdown delay is more than a few days, then the CA Directory process will check the date stamp in the <dsaname>.dp file and the transaction in the <dsaname>.tx file; if the dates are too large CA Directory will refuse to start the DATA DSA and issue a warning message.

Step 2, we will leverage the dxdisp <dsaname> command to generate a new time-stamp file <dsaname>.dx, that will be used to prevent unnecessary sync operations with any data older than the date stamp in this file. 

This command should be issued for every DATA DSA on the same host—Especially true for split DATA DSAs, e.g. IMPD (CA Identity Manager’s Provisioning Directories). In our example below, to assist with this step, we use a combination of commands with a while-loop to issue the dxdisp command.

This command can be executed regardless if the DSA is running or shutdown. If an existing <dsaname>.dx file exists, any additional execution of dxdisp will add updated time-stamps to this file.  

Note: The <dsaname>.dx file will be removed upon restart of the DATA DSA.

STEP 2: ISSUE DXDISP COMMAND [ Create time-stamp file for re-sync use ] ON ALL IMPD SERVERS.

su - dsa OR [ sudo -iu dsa ]
bash
dxserver status | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxdisp "$LINE" ;done ; echo ; find $DXHOME -name "*.dx" -exec ls -larth {} \;

# Scroll to see entire line 

Step 3 will then ask for an updated online backup to be executed. 

In earlier release of CA Directory, this required a telnet/ssh connection to the dxconsole of each DATA DSA. Or using the DSA configuration files to contain a dump dxgrid-db; command that would be executed with dxserver init all command. 

In newer releases of CA Directory, we can leverage the dxserver onlinebackup <dsaname> process. 

This step can be a challenge to dump all DATA DSAs at the same time, using manual procedures. 

Fortunately, we can automate this with a single bash shell process; and as an enhancement, we can also generate the LDIF extracts of each DATA DSA for later delta compare operations.

Note: The DATA DSA must be running (started) for the onlinebackup process to function correctly. If unsure, issue a dxserver status or dxserver start all prior. 

Retain the LDIF files from the “BAD” DATA DSA Servers for analysis.

STEP 3a-3c: ON ALL IMPD DATA DSA SERVERS - ISSUE ONLINE BACKUP PROCESS
su - dsa OR [ sudo -iu dsa ]
bash

dxserver status | grep started | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxserver onlinebackup "$LINE" ; sleep 10; dxdumpdb -w -z -f /tmp/`date '+%Y%m%d_%H%M%S_%s'`_$LINE.ldif $LINE ;done ; echo ; find $DXHOME -name "*.zdb" -exec ls -larth {} \; ; echo ; ls -larth --time-style=full-iso /tmp/*.ldif | grep  `date '+%Y-%m-%d'`

# Scroll to see entire line 

Step 4a Walks through the possible copy operations from “GOOD” to the “BAD” DATA DSA host, for the <dsaname>.zdb files. The IMPD DATA DSA will require that three (3) of four (4) zdb files are copied, to ensure no impact to referential integrity between the DATA DSA.

The preferred model to copy data from one remote host to another is via the compressed rsync process over SSH, as this is a rapid process for the CA Directory db / zdb files.

https://anapartner.com/2020/05/03/wan-latency-rsync-versus-scp/

Below are the code blocks that demonstrate examples how to copy data from one DSA server to another DSA server.

# RSYNC METHOD
sudo -iu dsa

time rsync --progress -e 'ssh -ax' -avz --exclude "User*" --exclude "*.dp" --exclude "*.tx" dsa@192.168.242.135:./data/ $DXHOME/data

# Scroll to see entire line 
# SCP METHOD   
sudo -iu dsa

scp   REMOTE_ID@$HOST:./data/<folder_impd_data_dsa_name>/*.zdb   /tmp/dsa_data
/usr/bin/mv  /tmp/dsa_data/<incorrect_dsaname>.zdb   $DXHOME/data/<folder_impd_data_dsa_name>/<correct_dsaname>.db

# Scroll to see entire line 

Step 4b Walk through the final steps before restarting the “BAD” DATA DSA.

The ONLY files that should be in the data folders are <dsaname>.db (binary data file) and <dsaname>.dx (ASCII time-stamp file). Ensure that the copied <prior-hostname-dsaname>.zdb file has been renamed to the correct hostname & extension for <dsaname>.db

Remove the prior <dsaname>.dp (ASCII time-stamp file) { the DATA DSA will auto replace this file with the *.dx file contents } and the <dsaname>.tx (binary data transaction file).

Step 5a Startup the DATA DSA with the command

dxserver start all

If there is any issue with a DATA or ROUTER DSA not starting, then issue the same command with the debug switch (-d)

dxserver -d start <dsaname>

Use the output from the above debug process to address any a) syntax challenges, or b) older PID/LCK files ($DXHOME/pid)

Step 5b Finally, use dxsearch/ldapsearch to query a unit-test of authentication with the primary service ID. Use other unit/use-case tests as needed to confirm data is now synced.

bash
echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd

LDAPTLS_REQCERT=never dxsearch -LLL -H ldaps://`hostname`:20394 -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -s base -b 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' '(objectClass=*)' | perl -p00e 's/\r?\n //g'

# Scroll to see entire line 

LDIF Recovery Processes

The steps above are for recovery via a 100% replacement method, where the assumption is that the “bad” DSA server does NOT have any data worth keeping or wish to be reviewed.

We wish to clarify a process/methodology, where the “peer” Multi-write DSA may be out-of-sync. Still, we are not sure “which” is truly the “good DSA” to select, or perhaps we wished to merge data from multiple DSA before we declare one to be the “good DSA” (with regards to the completeness of data).

Using CA Directory commands, we can join them together to automate snapshots and exports to LDIF files. These LDIF files can then be compared against their peers MW DATA DSA exports or even to themselves at different snapshot export times. As long as we have the LDIF exports, we can recover from any DAR scenario.

Example of using CA Directory dxserver and dxdumpdb commands (STEP 3) with the ldifdelta and dxmodify commands.

The output from ldifdelta may be imported to any remote peer MW DATA DSA server to sync via dxmodify to that hostname, to force a sync for the few objects that may be out-of-sync, e.g. Password Hashes or other.

dxserver status | grep started | grep -v router | awk '{print $1}' | while IFS='' read -r LINE || [ -n "$LINE" ] ; do dxserver onlinebackup "$LINE" ; sleep 10; dxdumpdb -z -f /tmp/`date '+%Y%m%d_%H%M%S_%s'`_$LINE.ldif $LINE ;done ; echo ; find $DXHOME -name "*.zdb" -exec ls -larth {} \; ; echo ; ls -larth --time-style=full-iso /tmp/*.ldif | grep  `date '+%Y-%m-%d'`

ldifdelta -x -S ca-prov-srv-01-impd-co  /tmp/20200819_122820_1597858100_ca-prov-srv-01-impd-co.ldif   /tmp/20200819_123108_1597858268_ca-prov-srv-01-impd-co.ldif  |  perl -p00e 's/\r?\n //g'  >   /tmp/delta_file_ca-prov-srv-01-impd-co.ldif   ; cat /tmp/delta_file_ca-prov-srv-01-impd-co.ldif

echo -n Password01 > .impd.pwd ; chmod 600 .impd.pwd
dxmodify -v -c -h`hostname` -p 20391  -D 'eTDSAContainerName=DSAs,eTNamespaceName=CommonObjects,dc=etadb' -y .impd.pwd -f /tmp/delta_file_ca-prov-srv-01-impd-co.ldif

# Scroll to see entire line 

The below images demonstrate a delta that exists between two (2) time snapshots. The CA Directory tool, ldifdelta, can identify and extract the modified entry to the user object.

The following examples will show how to re-import this delta using dxmodify command to the DATA DSA with no other modifications required to the input LDIF file.

In the testing example below, before any update to an object, let’s capture a snapshot-in-time and the LDIF files for each DATA DSA.

Lets make an update to a user object using any tool we wish, or command line process like ldapmodify.

Next, lets capture a new snapshot-in-time after the update, so we will be able to utilize the ldifdelta tool.

We can use the ldifdelta tool to create the delta LDIF input file. After we review this file, and accept the changes, we can then submit this LDIF file to the remote peer MW DATA DSA that are out-of-sync.

Hope this has value to you and any challenges you may have with your environment.

WAN Latency: Rsync versus SCP

We were curious about what methods we can use to manage large files that must be copied between sites with WAN-type latency and also restrict ourselves to processes available on the CA Identity Suite virtual appliance / Symantec IGA solution.

Leveraging VMware Workstation’s ability to introduce network latency between images, allows for a validation of a global password reset solution.

If we experience deployment challenges with native copy operations, we need to ensure we have alternatives to address any out-of-sync data.

The embedded CA Directory maintains the data tier in separate binary files, using a software router to join the data tier into a virtual directory. This allows for scalability and growth to accommodate the largest of sites.

We focused on the provisioning directory (IMPD) as our likely candidate for re-syncing.

Test Conditions:

  1. To ensure the data was being securely copied, we kept the requirement for SSH sessions between two (2) different nodes of a cluster.
  2. We introduce latency with VMware Workstation NIC for one of the nodes.

3. The four (4) IMPD Data DSAs were resized to 2500 MB each (a similar size we have seen in production at many sites).

4. We removed data and the folder structure from the receiving node to avoid any checksum restart processes from gaining an unfair advantage.

5. If the process allowed for exclusions, we did take advantage of this feature.

6. The feature/process/commands must be available on the vApp to the ‘config’ or ‘dsa’ userIDs.

7. The reference host/node that is being pulled, has the CA Directory Data DSAs offline (dxserver stop all) to prevent ongoing changes to the files during the copy operation.

Observations:

SCP without Compression: Unable to exclude other files (*.tx,*.dp, UserStore) – This process took over 12 minutes to copy 10,250 MB of data

SCP with Compression: Unable to exclude other files (*.tx,*.dp, UserStore) – This process still took over 12 minutes to copy 10,250 MB of data

Rsync without compression: This process can exclude files/folders and has built-in checksum features (to allow a restart of a file if the connection is broken) and works over SSH as well. If the folder was not deleted prior, then this process would give artificial high-speed results. This process was able to exclude the UserStore DSA files and the transaction files (*.dp & *.tx) that are not required to be copied for use on a remote server. Only 10,000 MB (4 x 2500 MB) was copied instead of an extra 250 MB.

Rsync with compression: This process can exclude files/folders and has built-in checksum features (to allow a restart of a file if the connection is broken) and works over SSH as well. This process was the winner, and; extremely amazing performance over the other processes.

Total Time: 1 min 10 seconds for 10,000 MB of data over a WAN latency of 70 ms (140 ms R/T)

Now that we have found our winner, we need to do a few post steps to use the copied files. CA Directory, to maintain uniqueness between peer members of the multi-write (MW) group, have a unique name for the data folder and the data file. On the CA Identity Suite / Symantec IGA Virtual Appliance, pseudo nomenclature is used with two (2) digits.

The next step is to rename the folder and the files. Since the vApp is locked down for installing other tools that may be available for rename operations, we utilized the find and mv command with a regular xpression process to assist with these two (2) steps.

Complete Process Summarized with Validation

The below process was written within the default shell of ‘dsa’ userID ‘csh’. If the shell is changed to ‘bash’; update accordingly.

The below process also utilized a SSH RSA private/public key process that was previously generated for the ‘dsa’ user ID. If you are using the vApp, change the userID to config; and su – dsa to complete the necessary steps. You may need to add a copy operation between dsa & config userIDs.

Summary of using rsync with find/mv to rename copied IMPD *.db files/folders
[dsa@pwdha03 ~/data]$ dxserver status
ca-prov-srv-03-impd-main started
ca-prov-srv-03-impd-notify started
ca-prov-srv-03-impd-co started
ca-prov-srv-03-impd-inc started
ca-prov-srv-03-imps-router started
[dsa@pwdha03 ~/data]$ dxserver stop all > & /dev/null
[dsa@pwdha03 ~/data]$ du -hs
9.4G    .
[dsa@pwdha03 ~/data]$ eval `ssh-agent` && ssh-add
Agent pid 5395
Enter passphrase for /opt/CA/Directory/dxserver/.ssh/id_rsa:
Identity added: /opt/CA/Directory/dxserver/.ssh/id_rsa (/opt/CA/Directory/dxserver/.ssh/id_rsa)
[dsa@pwdha03 ~/data]$ rm -rf *
[dsa@pwdha03 ~/data]$ du -hs
4.0K    .
[dsa@pwdha03 ~/data]$ time rsync --progress -e 'ssh -ax' -avz --exclude "User*" --exclude "*.dp" --exclude "*.tx" dsa@192.168.242.135:./data/ $DXHOME/data
FIPS mode initialized
receiving incremental file list
./
ca-prov-srv-01-impd-co/
ca-prov-srv-01-impd-co/ca-prov-srv-01-impd-co.db
  2500000000 100%  143.33MB/s    0:00:16 (xfer#1, to-check=3/9)
ca-prov-srv-01-impd-inc/
ca-prov-srv-01-impd-inc/ca-prov-srv-01-impd-inc.db
  2500000000 100%  153.50MB/s    0:00:15 (xfer#2, to-check=2/9)
ca-prov-srv-01-impd-main/
ca-prov-srv-01-impd-main/ca-prov-srv-01-impd-main.db
  2500000000 100%  132.17MB/s    0:00:18 (xfer#3, to-check=1/9)
ca-prov-srv-01-impd-notify/
ca-prov-srv-01-impd-notify/ca-prov-srv-01-impd-notify.db
  2500000000 100%  130.91MB/s    0:00:18 (xfer#4, to-check=0/9)
sent 137 bytes  received 9810722 bytes  139161.12 bytes/sec
total size is 10000000000  speedup is 1019.28
27.237u 5.696s 1:09.43 47.4%    0+0k 128+19531264io 2pf+0w
[dsa@pwdha03 ~/data]$ ls
ca-prov-srv-01-impd-co  ca-prov-srv-01-impd-inc  ca-prov-srv-01-impd-main  ca-prov-srv-01-impd-notify
[dsa@pwdha03 ~/data]$ find $DXHOME/data/ -mindepth 1 -type d -exec bash -c 'mv  $0 ${0/01/03}' {} \; > & /dev/null
[dsa@pwdha03 ~/data]$ ls
ca-prov-srv-03-impd-co  ca-prov-srv-03-impd-inc  ca-prov-srv-03-impd-main  ca-prov-srv-03-impd-notify
[dsa@pwdha03 ~/data]$ find $DXHOME/data -depth -name '*.db' -exec bash -c 'mv  $0 ${0/01/03}' {} \; > & /dev/null
[dsa@pwdha03 ~/data]$ dxserver start all
Starting all dxservers
ca-prov-srv-03-impd-main starting
..
ca-prov-srv-03-impd-main started
ca-prov-srv-03-impd-notify starting
..
ca-prov-srv-03-impd-notify started
ca-prov-srv-03-impd-co starting
..
ca-prov-srv-03-impd-co started
ca-prov-srv-03-impd-inc starting
..
ca-prov-srv-03-impd-inc started
ca-prov-srv-03-imps-router starting
..
ca-prov-srv-03-imps-router started
[dsa@pwdha03 ~/data]$ du -hs
9.4G    .
[dsa@pwdha03 ~/data]$

Note: An enhancement has been open to request that the ‘dsa’ userID is able to use remote SSH processes to address any challenges if the Data IMPD DSAs need to be copied or retained for backup processes.

https://community.broadcom.com/participate/ideation-home/viewidea?IdeationKey=7c795c51-d028-4db8-adb1-c9df2dc48bff

Example for vApp Patches:

Note: There is no major different in speed if the files being copied are already compressed. The below image shows that initial copy is at the rate of the network w/ latency. The value gain from using rsync is still the checksum feature that allow auto-restart where it left off.

vApp Patch process refined to a few lines (to three nodes of a cluster deployment)

# PATCHES
# On Local vApp [as config userID]
mkdir -p patches  && cd patches
curl -L -O ftp://ftp.ca.com/pub/CAIdentitySuiteVA/cumulative-patches/14.3.0/CP-VA-140300-0002.tar.gpg
curl -L -O ftp://ftp.ca.com/pub/CAIdentitySuiteVA/cumulative-patches/14.3.0/CP-IMV-140300-0001.tgz.gpg
screen    [will open a new bash shell ]
patch_vapp CP-VA-140300-0002.tar.gpg           [Patch VA prior to any solution patch]
patch_vapp CP-IMV-140300-0001.tgz.gpg
exit          [exit screen]
cd ..
# Push from one host to another via scp
IP=192.168.242.136;scp -r patches  config@$IP:
IP=192.168.242.137;scp -r patches  config@$IP:
# Push from one host to another via rsync over ssh          [Minor gain for compressed files]
IP=192.168.242.136;rsync --progress -e 'ssh -ax' -avz $HOME/patches config@$IP:
IP=192.168.242.137;rsync --progress -e 'ssh -ax' -avz $HOME/patches config@$IP:
# Pull from one host to another via rsync over ssh          [Minor gain for compressed files]
IP=192.168.242.135;rsync --progress -e 'ssh -ax' -avz config@$IP:./patches $HOME
# View the files were patched
IP=192.168.242.136;ssh -tt config@$IP "ls -lart patches"
IP=192.168.242.137;ssh -tt config@$IP "ls -lart patches"
# On Remote vApp Node #2
IP=192.168.242.136;ssh $IP
cd patches
screen    [will open a new bash shell ]
patch_vapp CP-VA-140300-0002.tar.gpg
patch_vapp CP-IMV-140300-0001.tgz.gpg
exit          [exit screen]
exit          [exit to original host]
# On Remote vApp Node #3
IP=192.168.242.137;ssh $IP
cd patches
screen    [will open a new bash shell ]
patch_vapp CP-VA-140300-0002.tar.gpg
patch_vapp CP-IMV-140300-0001.tgz.gpg
exit          [exit screen]
exit          [exit to original host]

View of rotating the SSH RSA key for CONFIG User ID

# CONFIG - On local vApp host
ls -lart .ssh     [view any prior files]
echo y | ssh-keygen -b 4096 -N Password01 -C $USER -f $HOME/.ssh/id_rsa
IP=192.168.242.135;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
IP=192.168.242.136;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
IP=192.168.242.137;ssh-keyscan -p 22 $IP >> .ssh/known_hosts
cp -r -p .ssh/id_rsa.pub .ssh/authorized_keys
rm -rf /tmp/*.$USER.ssh-keys.tar
tar -cvf /tmp/`/bin/date -u +%s`.$USER.ssh-keys.tar .ssh
ls -lart /tmp/*.$USER.ssh-keys.tar
eval `ssh-agent` && ssh-add           [Enter Password for SSH RSA Private Key]
IP=192.168.242.136;scp `ls /tmp/*.$USER.ssh-keys.tar`  config@$IP:
IP=192.168.242.137;scp `ls /tmp/*.$USER.ssh-keys.tar`  config@$IP:
USER=config;ssh -tt $USER@192.168.242.136 "tar -xvf *.$USER.ssh-keys.tar"
USER=config;ssh -tt $USER@192.168.242.137 "tar -xvf *.$USER.ssh-keys.tar"
IP=192.168.242.136;ssh $IP `/bin/date -u +%s`
IP=192.168.242.137;ssh $IP `/bin/date -u +%s`
IP=192.168.242.136;ssh -vv $IP              [Use -vv to troubleshoot ssh process]
IP=192.168.242.137;ssh -vv $IP 				[Use -vv to troubleshoot ssh process]

Be safe and automate your backups for CA Directory Data DSAs to LDIF

The CA Directory solution provides a mechanism to automate daily on-line backups, via one simple parameter:

dump dxgrid-db period 0 86400;

Where the first number is the offset from GMT/UTC (in seconds) and the second number is how often to run the backup (in seconds), e.g. Once a day = 86400 sec = 24 hr x 60 min/hr x 60 sec/min

Two Gaps/Challenge(s):

History: The automated backup process will overwrite the existing offline file(s) (*.zdb) for the Data DSA. Any requirement or need to perform a RCA is lost due to this fact. What was the data like 10 days ago? With the current state process, only the CA Directory or IM logs would be of assistance.

Size: The automated backup will create an offline file (*.zdb) footprint of the same size as the data (*.db) file. If your Data DSA (*.db) is 10 GB, then your offline (*.zdb) will be 10 GB. The Identity Provisioning User store has four (4) Data DSAs, that would multiple this number , e.g. four (4) db files + four (4) offline zdb files at 10 GB each, will require minimal of 80 GB disk space free. If we attempt to retain a history of these files for fourteen (14) days, this would be four (4) db + fourteen (14) zdb = eighteen (18) x 10 GB = 180 GB disk space required.

Resolutions:

Leverage the CA Directory tool (dxdumpdb) to convert from the binary data (*.db/*.zdb) to LDIF and the OS crontab for the ‘dsa’ account to automate a post ‘online backup’ export and conversion process.

Step 1: Validate the ‘dsa’ user ID has access to crontab (to avoid using root for this effort). cat /etc/cron.allow

If access is missing, append the ‘dsa’ user ID to this file.

Step 2: Validate that online backup process have been scheduled for your Data DSA. Use a find command to identify the offline files (*.zdb ). Note the size of the offline Data DSA files (*.zdb).

Step 3: Identify the online backup process start time, as defined in the Data DSA settings DXC file or perhaps DXI file. Convert this GMT offset time to the local time on the CA Directory server. (See references to assist)

Step 4: Use crontab -e as ‘dsa’ user ID, to create a new entry: (may use crontab -l to view any entries). Use the dxdumpdb -z switch with the DSA_NAME to create the exported LDIF file. Redirect this output to gzip to automatically bypass any need for temporary files. Note: Crontab has limited variable expansion, and any % characters must be escaped.

Example of the crontab for ‘dsa’ to run 30 minutes after (at 2 am CST) the online backup process is scheduled (at 1:30 am CST).

# Goal:  Export and compress the daily DSA offline backup to ldif.gz at 2 AM every day
# - Ensure this crontab runs AFTER the daily automated backup (zdb) of the CA Directory Data DSAs
# - Review these two (2) tokens for DATA DSAs:  ($DXHOME/config/settings/impd.dxc  or ./impd_backup.dxc)
#   a)   Location:  set dxgrid-backup-location = "/opt/CA/Directory/dxserver/backup/";
#   b)   Online Backup Period:   dump dxgrid-db period 0 86400;
#
# Note1: The 'N' start time of the 'dump dxgrid-db period N M' is the offset in seconds from midnight of UTC
#   For 24 hr clock, 0130 (AM) CST calculate the following in UTC/GMT =>  0130 CST + 6 hours = 0730 UTC
#   Due to the six (6) hour difference between CST and UTC TZ:  7.5 * 3600 = 27000 seconds
# Example(s):
#   dump dxgrid-db period 19800 86400;   [Once a day at 2330 CST]
#   dump dxgrid-db period 27000 86400;   [Once a day at 0130 CST]
#
# Note2:  Alternatively, may force an online backup using this line:
#               dump dxgrid-db;
#        & issuing this command:  dxserver init all
#
#####################################################################
#        1      2         3       4       5        6
#       min     hr      d-o-m   month   d-o-w   command(s)
#####################################################################
#####
#####  Testing Backup Every Five (5) Minutes ####
#*/5 * * * *  . $HOME/.profile && dxdumpdb -z `dxserver status | grep "impd-main" | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-main" | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
#####
#####  Backup daily at 2 AM CST  -  30 minutes after the online backup at 1:30 AM CST #####
#####
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-main"   | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-main"   | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-co"     | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-co"     | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-inc"    | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-inc"    | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz
0 2 * * *    . $HOME/.profile &&  dxdumpdb -z `dxserver status | grep "impd-notify" | awk "{print $1}"` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-notify" | awk '{print $1}'`_`/bin/date --utc +\%Y\%m\%d\%H\%M\%S.0Z`.ldif.gz

Example of the above lines that can be placed in a bash shell, instead of called directly via crontab. Note: Able to use variables and no need to escape the `date % characters `

# set DSA=main &&   dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=co &&     dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=inc &&    dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
# set DSA=notify && dxdumpdb -z `dxserver status | grep "impd-$DSA" | awk '{print $1}'` | gzip -9 > /tmp/`hostname`_`dxserver status | grep "impd-$DSA" | awk '{print $1}'`_`/bin/date --utc +%Y%m%d%H%M%S.0Z`.ldif.gz
#

Example of the output:

Monitor with tail -f /var/log/cron (or syslog depending on your OS version), when the crontab is executed for your ‘dsa’ account

View the output folder for the newly created gzip LDIF files. The files may be extracted back to LDIF format, via gzip -d file.ldif.gz. Compare these file sizes with the original (*.zdb) files of 2GB.

Recommendation(s):

Implement a similar process and retain this data for fourteen (14) days, to assist with any RCA or similar analysis that may be needed for historical data. Avoid copied the (*.db or *.zdb) files for backup, unless using this process to force a clean sync between peer MW Data DSAs.

The Data DSAs may be reloaded (dxloadb) from these LDIF snapshots; the LDIF files do not have the same file size impact as the binary db files; and as LDIF files, they may be quickly search for prior data using standard tools such as grep “text string” filename.ldif.

This process will assist in site preparation for a DAR (disaster and recovery) scenario. Protect your data.

References:

dxdumpdb

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-identity-and-access-management/directory/14-1/administrating/tools-to-manage-ca-directory/dxtools/dxdumpdb-tool-export-data-from-a-datastore-to-an-ldif-file.html

dump dxgrid-db

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-identity-and-access-management/directory/14-1/reference/commands-reference/dump-dxgrid-db-command-take-a-consistent-snapshot-copy-of-a-datastore.html

If you wish to learn more or need assistance, contact us.