DevOps – Lower project risk

We all wish to be out of the “install” business. How many times have your projects required installing or re-installing an OS, solutions, and databases, and then performing post-install configuration, only for the same challenges to be repeated?

Fortunately, with SaaS services, virtualization of OS platforms, and solution appliances, the effort/time to install a component is on its way to being a small line item within a project plan.

However, in the enterprise software world, there are still components that have not been fully deployed in one of the above models, or the business/technical requirements do not fit the above models.

The value of DevOps to an enterprise project is in two (2) areas:

  1. Automating deployment & re-deployment of solution components
    • The first effort is always a wash for initial deployments, but value is gained on the next environment(s).
    • Utilizing automated deployments ensures quality between peer members of components, e.g. every member is installed exactly the same way.
    • Avoids the “too-many-cooks-in-the-kitchen” challenge when more than two (2) resources deploy in different ways.
    • Allows rapid scaling of the solution through horizontal cluster integration.
  2. Automating deployment of business content from a dev to test to qa to pre-prod to eventually the production environment.
    • Hands on keyboard once
    • Workflow Approval/Reject of changes
    • Avoid downtime for business release cycles.
    • Allow for rollback of business content.
    • Lower the project cost incurred from large maintenance windows.

DevOps does not have to be overwhelming. While customers may invest in enterprise DevOps solutions or open-source solutions like Chef, the proof-of-concept process that many follow first is to use the command-line offerings embedded in solutions to install solution components with the “silent” or “non-interactive” switches. For DevOps with business content migration, APIs via SOAP/REST, or again CLIs (command-line interfaces), are used with scripting languages that customer staff are familiar with.

Example of a command line to install Oracle Java JDK, that was then migrated to an enterprise DevOps solution:

A view of the architecture methodology we use for DevOps of any vendor solution, e.g. CA Technologies, Oracle, Red Hat, Microsoft, etc.

DevOps Architecture Methodology:  Bottom-Up Approach 

  • To meet the expected use of the CLI processes for migration into a commercial DevOps solution, the approach will attempt to emulate the same behavior.
  • Server(s) Acquisition – The servers are assumed to be compatible with the solution’s support matrices and to have the supported OS.
    • Most solution(s) are able to run adequately on 2-4 vCPU with 8-32 GB RAM.
      • Disk space may be from 10-100 GB.
    • A sandbox environment should attempt to run all solutions within a 2 vCPU with 8 GB RAM on 80 GB disk (20 GB for OS and 60 GB for data)
      • OS Mount Point
        • The majority of the solutions will deploy under /opt/<subfolder> on a Linux/UNIX OS.
        • This mount point will be chosen for all vendors and 3rd party solutions.
      • OS Libraries
        • OS libraries that are identified from literature, the installers, or debug steps will be pre-loaded with a separate script.
      • OS Entropy
        • OS entropy will be requested to be deployed prior to any installation, via the OS rngd service or 3rd party RNG daemon solutions. {Do NOT miss this step}
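
A quick pre-install check for the entropy step can be sketched as below. This is a minimal sketch for Linux only: it reads the kernel's /proc/sys/kernel/random/entropy_avail counter, and the function name and the 200-bit default threshold are our own illustrative assumptions, not vendor values. Recent kernels may report a small fixed number here, so treat the threshold as something to tune per platform.

```shell
#!/bin/sh
# entropy_ok [MIN_BITS] [SOURCE_FILE]
# Succeeds when the reported entropy is at least MIN_BITS.
# MIN_BITS (default 200) is an illustrative threshold, not a vendor value.
# SOURCE_FILE defaults to the Linux kernel counter and can be overridden.
entropy_ok() {
    min="${1:-200}"
    src="${2:-/proc/sys/kernel/random/entropy_avail}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 2
    fi
    avail=$(cat "$src")
    echo "entropy available: $avail bits (minimum wanted: $min)"
    [ "$avail" -ge "$min" ]
}
```

A pre-install script could call `entropy_ok || echo "start the entropy service before installing"` and stop there if the check fails.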

  • 3rd Party & vendor solution Installations
    • Any software declared as a predecessor solution will be installed in the correct order to avoid re-work effort.
    • Any software solution that allows peer and/or cluster setup will be deployed as a cluster of one (1) member, to allow future “stacks” to be integrated rapidly and to allow the solution to scale horizontally.
    • Any co-location of software components on the same server will be isolated by folder, network ports, and JBoss/WildFly/J2EE instances.
    • Any software installation that requires pre-installation steps will be identified, and a process will be built via CLI to manage the general use-case of deployment and integration.
    • Any software installation that requires input for service accounts, passwords, or other variables will be addressed with an input properties file and/or script variables defined at the header of the script.
    • Any software installation that requires a non-root account to install will be identified and/or updated to use sudo access to execute as the non-root account.
    • Any software wizard installation that is not clear on “changes” or deltas to the install base will be executed several times to capture the deltas.
      • Process to capture install deltas (file based)
        • Install 1st time with interactive console mode
          • Tar/zip up the install folder
        • Install 2nd time with interactive console mode
          • Tar/zip up the install folder
        • Copy both files to a workstation/laptop and use a file/folder compare tool (WinMerge/Beyond Compare) to identify the file/folder deltas
    • Update software installation silent install scripts to use variables to manage the deltas.
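
The file-based delta capture above can also be done directly on the server. The sketch below is one possible approach, assuming tar, gzip, and diff are available; the function names and archive names are hypothetical, and `diff -rq` stands in for the graphical compare tool.

```shell
#!/bin/sh
# snapshot_install DIR ARCHIVE
# Tar/gzip the contents of an install folder after an install run.
snapshot_install() {
    tar -czf "$2" -C "$1" .
}

# compare_snapshots FIRST_ARCHIVE SECOND_ARCHIVE
# Unpack both archives side by side and list the file/folder deltas.
# Returns 0 when the two installs are identical, non-zero when deltas exist.
compare_snapshots() (
    work=$(mktemp -d) || exit 2
    mkdir "$work/first" "$work/second"
    tar -xzf "$1" -C "$work/first"
    tar -xzf "$2" -C "$work/second"
    status=0
    # -r: recurse into sub-folders, -q: only report which files differ
    (cd "$work" && diff -rq first second) || status=$?
    rm -rf "$work"
    exit "$status"
)
```

For example, run `snapshot_install /opt/<subfolder> install1.tar.gz` after the first install, repeat with `install2.tar.gz` after the second, then `compare_snapshots install1.tar.gz install2.tar.gz`. Each reported file is a candidate for a variable in the silent install script.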

DevOps Architecture Methodology:  Installation Processes

  • All solutions will be installed with CLI processes that may be executed as root or a non-root service account, as needed.
  • Interaction with Web Servers/Web Application Servers will be managed with CLI processes such as cURL, MS PowerShell, PDI, etc.
  • A bottom-up approach will be used, where assumptions about pre-installed components will be declared.
  • Installation scripts will follow the methodology:
    • Declare properties file and/or variables to be used.
    • Uninstall any prior installation
      • Execute shutdown script/process (if exists)
      • Execute OS kill command (search for running processes)
      • Execute uninstall script (provided by solution)
      • Remove installation folder
    • Install solution
      • Update silent install input file based on properties and/or script variables
      • Install solution via silent install input file
    • Perform base validation checks (query on files/folders/running processes)
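
The methodology above maps onto a script skeleton such as the following. This is a sketch only: every path, file name, token (`@INSTALL_DIR@`, `@SERVICE_USER@`), and installer switch is a hypothetical placeholder, since the real names differ per vendor solution.

```shell
#!/bin/sh
# Skeleton of an install script following the steps above.
# All paths, file names, and installer switches are hypothetical placeholders.

# 1. Declare variables (these could also be read from a properties file).
INSTALL_DIR="${INSTALL_DIR:-/opt/example-solution}"
SERVICE_USER="${SERVICE_USER:-svc_example}"
INSTALLER="${INSTALLER:-./setup.bin}"
RESPONSE_TEMPLATE="${RESPONSE_TEMPLATE:-./silent.properties.template}"
RESPONSE_FILE="${RESPONSE_FILE:-./silent.properties}"

# 2. Uninstall any prior installation.
uninstall_prior() {
    if [ -x "$INSTALL_DIR/bin/shutdown.sh" ]; then
        "$INSTALL_DIR/bin/shutdown.sh" || true   # shutdown script, if it exists
    fi
    pkill -u "$SERVICE_USER" || true             # leftover processes of the service account
    if [ -x "$INSTALL_DIR/uninstall.sh" ]; then
        "$INSTALL_DIR/uninstall.sh" || true      # uninstall script, if the solution provides one
    fi
    rm -rf -- "${INSTALL_DIR:?}"                 # remove the installation folder
}

# 3a. Fill the silent-install input file from the variables above.
render_response_file() {
    sed -e "s|@INSTALL_DIR@|$INSTALL_DIR|g" \
        -e "s|@SERVICE_USER@|$SERVICE_USER|g" \
        "$RESPONSE_TEMPLATE" > "$RESPONSE_FILE"
}

# 3b. Run the installer with the silent-install input file.
install_solution() {
    render_response_file
    "$INSTALLER" -i silent -f "$RESPONSE_FILE"   # switch names vary by vendor
}

# 4. Base validation: the expected folder exists after the install.
validate_install() {
    [ -d "$INSTALL_DIR/bin" ] || {
        echo "validation failed: $INSTALL_DIR/bin is missing" >&2
        return 1
    }
}

main() {
    uninstall_prior && install_solution && validate_install
}
```

Nothing runs until `main` is called, so the functions can be exercised one at a time while the script is being built, then chained once each step is trusted.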

DevOps Architecture Methodology:  Post Install / Integration

  • Realize and set/manage the expectation that there is a point of diminishing returns for the value of automating installations.
    • 1st Question:  Do we understand the intent of task A (requirements/business logic)?
    • 2nd Question:  Can task A be automated?
    • 3rd Question:  Should task A be automated?
  • Identify the post-install and/or migration integration steps between components, especially where the components reside on separate servers.
    • In that case, the DevOps process needs to be aware of the remote IPs/hostnames of the cluster members.
      • May be pre-defined in 3rd Party & Vendor solution Installations
      • Note:  If “dummy hostnames” are to be used, ensure they are mapped to the current hostname to have them resolvable by DNS lookup.
  • Business logic unique to each customer may be pre-defined by use-case
    • The assumption is that well-defined business logic unique to each solution has been pre-built for consumption.
  • Service account passwords that were hard-coded as part of the silent install scripts will need to be changed immediately to new secure passwords prior to any production roll-out.

The project effort for DevOps above can be considered a “wash” against the expected manual installation effort for the first environment (dev) for the OS & a vendor’s solution. The value will be realized in project timelines with deployment of the next 2-5 environments, when resources change, when re-deployment is required, when there is a need to horizontally scale the solution, or when a disaster recovery site is required. Project timeline risk will be minimized, and the knowledge gained by the resources that build the DevOps processes will lower business risk during future triage of technical challenges.

The second value of DevOps is around the business release process, which we will cover in another blog entry: promoting business logic/content from one environment to the next, until final deployment to the production environment.

Connection test without telnet client

We have all worked on locked down hosts where the telnet client application is not installed and in the middle of a troubleshooting session comes the need to test network connectivity. But without the telnet client installed, it becomes challenging trying to validate network connectivity. We can utilize native tools and basic concepts to test this connectivity.

Below is how:

$ bash -c 'cat < /dev/null > /dev/tcp/'
$ echo $?

$? is a special shell variable that holds the exit status for the most recent foreground pipeline.

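‘>’ redirects output to the special path /dev/<protocol>/<host>/<port>. This is not a real device file; bash itself interprets the path and attempts to open a socket to that host and port.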

An exit status of ‘0’ indicates success, and any other value indicates a failure to establish a connection. When the TCP port is unreachable, the attempt may hang until the OS-defined connection timeout expires, and most likely you will end up forcing an exit with Ctrl+C, which also yields a non-zero exit status.

Another one-liner can be used that will result in a ‘Port Open’ response only if the connection is successful.

$ bash -c 'cat < /dev/null > /dev/tcp/' && echo "Port Open"
Port Open
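
To avoid waiting on the OS timeout, the same test can be wrapped in a small function. This is a sketch, assuming the coreutils `timeout` command is present on the host; the function name `check_port` and the 5-second default are our own choices.

```shell
#!/bin/sh
# check_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 when a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Uses bash's built-in /dev/tcp handling, so bash must be installed.
# "timeout" ends the attempt after TIMEOUT_SECONDS (default 5) with status 124.
check_port() {
    timeout "${3:-5}" bash -c 'cat < /dev/null > "/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

Usage follows the one-liner pattern: `check_port myhost.example.com 443 3 && echo "Port Open"`, where the hostname and port are examples only.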

The next time you are stuck trying to figure out if a TCP port is open, and are without a telnet client, use these basics to validate connectivity.

Avoid Data Quality Issues during Testing (TDM)

Why do we see data quality challenges in lower environments (Test, Dev, QA) that we do not see in Production environments?

If the project team was asked to set up lower environments for any new solution, it might be that the TDM (test-data-management) methodology is not a formal corporate process.

TDM may be simply described as capturing non-PII (non-sensitive) production data and copying a full or limited set of the data to the non-production environments. This non-PII data may be copied 1:1 or masked during this process.

A TDM (test-data-management) process for a new environment may be a challenge if there is no current production environment, or if the current production environment is from a prior solution or M&A (mergers and acquisitions).

While there are formal paid tools/solutions for TDM, a project team may wish to leverage CLI (command-line) and/or scripts to create this sub-set of non-PII production data for the lower environments.

This process may be as simple as deciding to export the full DIT (directory structure/directory information tree) of an LDAP store with all its current group names, but replace the userID/Full Name/sensitive data with “dummy/masked” data. This exported data would be loaded with the near-Production data, to allow for full use-case and negative use-case testing in the lower environments.
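
As an illustration of the LDAP case, the sketch below masks user identities in an LDIF export while keeping group names and memberships intact. It rests on several assumptions that will not hold for every directory: user entries have DNs that start with `uid=`, the sensitive attributes are `uid`, `cn`, `sn`, and `mail`, and the export is plain LDIF with no base64-encoded values or folded lines. The function name `mask_ldif` and the dummy naming scheme are hypothetical.

```shell
#!/bin/sh
# mask_ldif FILE
# Print FILE with user identities replaced by dummy values (user1, user2, ...).
# Group entries keep their names; member references follow the masked DNs.
mask_ldif() {
    awk '
        # Pass 1 (first read of the file): give every user DN a dummy uid.
        NR == FNR {
            if (match($0, /^dn: uid=[^,]+/))
                map[substr($0, 9, RLENGTH - 8)] = "user" (++n)
            next
        }
        # Pass 2: rewrite user DNs and track whether this is a user entry.
        /^dn: / {
            cur = ""
            if (match($0, /^dn: uid=[^,]+/)) {
                cur = map[substr($0, 9, RLENGTH - 8)]
                $0 = "dn: uid=" cur substr($0, RLENGTH + 1)
            }
            print; next
        }
        # Inside a user entry, replace the sensitive attributes.
        cur != "" && /^uid: /  { print "uid: " cur; next }
        cur != "" && /^cn: /   { print "cn: Masked " cur; next }
        cur != "" && /^sn: /   { print "sn: " cur; next }
        cur != "" && /^mail: / { print "mail: " cur "@example.com"; next }
        # Keep group membership consistent with the masked user DNs.
        /^(member|uniqueMember): uid=/ && match($0, /: uid=[^,]+/) {
            old = substr($0, RSTART + 6, RLENGTH - 6)
            if (old in map)
                $0 = substr($0, 1, RSTART + 5) map[old] substr($0, RSTART + RLENGTH)
        }
        { print }
    ' "$1" "$1"
}
```

Running `mask_ldif export.ldif > masked.ldif` keeps the DIT and group objects as they are in production, so group-based use-cases can still be tested without real user names. Any other attribute that carries sensitive data would need its own rule.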

The Goal? Avoid show-stopper or high-level issues due to missed data quality concerns during a Go-Live or Business Release Cycle. This is very important when we have a small maintenance window to add new functionality.

Let us help with the knowledge transfer and building of representative environments. We see this challenge often for IAM solutions that manage thousands of endpoints, where even the basic Active Directory representation is missing the same DIT structure and group objects as the project AD domains, especially for M&A business projects.

Writing Successful Test Plans

One of the challenges we see is that project team members dislike writing.

Documentation that is very visible to business owners/team leads, e.g. business/technical requirements, design, or project management, will not be greatly impacted, due to the maturity of the senior resources.

However, one area seems to suffer and does have an impact on project timelines & future go-live estimates. Documentation for test plans may be very simplistic or very detailed.

Projects suffer timeline challenges when test plans & test scripts are too simplistic.

The business QA resources assigned to execute the test plan/test scripts can NOT be assumed to have in-depth background/knowledge of the solution. If the initial conditions and final output are not clearly called out (or how to reset them), then we have seen project timelines drawn out as the team is pushed into a seemingly never-ending cycle of QA testing.

To ensure your project is successful, demand that the test scripts for the test plans are written out as if to be executed by your great-grandparents. This includes which hyperlinks to use, which browser to use, which initial conditions to reset, which tool to reset to initial conditions, which steps to follow, how to record the final answer, where to capture the results, screenshot to be captured where and how.

The above methodology ensures that we do not have a “black box” of a solution, e.g. something-goes-in and we-hope-that-something-good-comes-out.

With the above process, the QA team lead can then scale out their team as needed.

When expected input/output information is captured, automated testing can be introduced with enhanced reporting and validation. This becomes exponentially valuable for IAM solutions that manage hundreds of endpoints, from legacy [AS/400, HP-NONSTOP NSK, Mainframe (ACF2/TSS/RACF/TSO)] to SaaS Cloud solutions.

So don’t just contemplate it: spend the time and reap the value. Make your grandparents proud!

Transparency through Automated Testing

One of the challenges that businesses have for projects is an awareness of the true status of tasks.

Project methodology continues to advance with concepts from Agile project management, which work well for larger projects. One of the value statements from Agile is the question put to project resources of when they can complete a task. This question provides a view into the resource’s skill set and confidence to meet the task goal. If the resource is junior or has limited skill in the task, then the effort estimate provided to the team will be high. With Agile methodology using this process, it becomes very easy for resources, while they frantically research, to inadvertently drain the project bucket of effort, e.g., a 4-hour task that turns into a week’s duration.

Another area that has great success with enforcing transparency is automated testing. Automated testing may be used for unit, integration, use-case, performance, and scale testing. However, for project transparency, to lower business risk and project cost overrun, we would state the value of automated testing is from use-case & regression testing.

After technical and business requirements are complete, ensure that the project schedule or WBS (work-breakdown-structure) has a defined milestone to migrate ALL manual use-case testing to automation. The effort to convert from manual use-case testing to automated testing will be considered by a few to have little value. However, when the final parts of a project are to meet a go-live over a weekend, or to add a new business release with adjusted business logic, what would you trust to reach your goals 100%?

Below are two (2) common scenarios:

  1. Solution Upgrade Go-Live over a weekend. You have been allocated 48 hours to back up solution data & all platforms, perform a data snapshot, migrate data, integrate with newer solution components (possibly new agents), combine with production data, and validate all use-cases for all business logic. You must also allow time for roll-back if, during triage of issues, the business team determines that show-stopper issues will not be addressed in the period. If you fail, you may be allowed one more attempt on another of your weekends, with all 2-20 people.
  2. Solution Business Release Cycle – Over a weekend or business day. You have the option to deploy new business logic to your solution. You can lower business risk to deploy during a business day but will require additional use-case and regression testing. If you have no automation, you will leverage a QA team of 2-10 people to exercise the use-cases; and sometimes negative use-cases.

Math: Assume your solution has twenty (20) use-cases & sub-use-cases, where each use-case may have twenty (20) test scripts. Assume that you have excellent QA/business/technical resources that have adequately captured the initial conditions (that must be reset every time) for each test script & they are checking for data quality challenges as well. Assume each test script takes about ten (10) minutes to execute, where your QA team resource (not the same skill set) will follow it exactly and record success/failure. Perhaps you have trained them to use QA tools to screen-capture failure messages and assign them to a technical project team resource to address.

20 use-cases x 20 scripts/use-case x 10 min/script = 4000 minutes for one QA resource. Well, we have 1440 minutes in a day, so 4000/1440 = 2.78 days or 66.7 hours. Assume we add ten (10) QA business resources: while we have lowered the QA cycle from 66.7 hours to 6.7 hours, we will be required to “freeze” any additional updates during this QA cycle, and this will likely impact our maintenance window for remediation of “found” issues in either scenario above.
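
The arithmetic above is easy to re-run with your own numbers. The helper below is a minimal sketch (the function name is ours) that assumes the work divides evenly across testers and ignores reset and triage time.

```shell
#!/bin/sh
# qa_cycle_hours USE_CASES SCRIPTS_PER_USE_CASE MINUTES_PER_SCRIPT TESTERS
# Prints the elapsed hours for one full manual QA cycle, to one decimal place.
qa_cycle_hours() {
    awk -v u="$1" -v s="$2" -v m="$3" -v t="$4" \
        'BEGIN { printf "%.1f\n", (u * s * m) / 60 / t }'
}

# Example with the figures from the text:
#   qa_cycle_hours 20 20 10 1     prints 66.7
#   qa_cycle_hours 20 20 10 10    prints 6.7
```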

Be aware of the “smoke” testing follies. This type of testing still leaves issues “burning.”

Enforce transparency for project owners, project managers, and team members.

Ensure that the effort to build the automated testing is kept for future regression when the new business logic phase is implemented. Prove to yourselves that prior business logic will NOT be impacted.

Many tools can be leveraged for automation, e.g., open-source JMeter (used by many customers), Selenium, SoapUI, or paid tools (Broadcom/CA Technologies BlazeMeter).

Let us help.

We firmly believe in, encourage, and perform knowledge transfer to our customers to help them succeed, and to ensure that the introduction of automated testing lowers the TCO of any solution. We can train your staff very quickly to leverage JMeter from their desktops/servers to automate any written test plans for solutions. These JMeter processes can then be shared with all project team members.

Defining IAM Project Success

What makes an IAM project successful? A question that must be understood before taking on any complex multi-component integration that spans across people, process, and technology.

IAM projects are hard. They are hard because the objective is not just technical; it involves evaluation of business process implementations and adoption of change. They are hard because they require integrating with existing data on various systems, be they legacy or modern. They are hard because IAM systems are powerful in that they can change data as it resides in a native system. They are hard because the risk of not diligently planning, designing, and implementing can be disastrous. They are hard because we need to resist the urge to start building something without requirements discussions or a good understanding of the capability and deliverables. But a successful IAM program adds immense value to the business. Organizations that are looking to optimize business value are looking to tackle all of the above and more to reap the fruits a successful IAM program yields.

Below are some key aspects to ensuring IAM Project Success

Communication and Expectations

It is all about clear communication and expectation setting. During the initial phases of requirements gathering and design discussions, open and transparent communication is a must. As an expert in IAM implementations, take the lead to communicate when there are gaps in the capability requested. Talk through whether specific requirements can be met a different way to achieve the same business objective, or consider pushing the required capability out to another cycle if there is a time or resource limitation. Or set the expectation around the additional ‘X’ needed if the request is to be pursued. This ‘X’ may be additional funding for resources to develop custom capability, added project deliverable risk, etc.

Communicate honestly and execute diligently on the expectations that are set. It always helps to continuously provide quick and honest feedback. If a project is going to fail trying to accomplish too much given the time and resources, it was going to fail anyway, so it is best to keep key stakeholders apprised of the risks up front. The more frequent and clear the communication around progress and risks, the better the outcome.

Managing clear communications and expectations for requirements, design, decisions, and risks will help the entire team stay focused on the goal and be successful.

Plan of Execution

To be successful, it is imperative that the entire team is in agreement on the deliverables, the expectations of each team member, and the expectations of and support from the stakeholders. A project plan that tracks deliverables and keeps all members responsible and accountable for executing their tasks is a must. In a large project with many moving parts, it is very easy to lose track of how to reach the goal line. Many sidebar issues and conversations will be in play, creating distractions. A proper plan for execution, diligent upkeep of status, and everyone being held accountable for their work streams instill trust in the team executing complex integrations.

Investment in an upfront plan on achieving the goals and open communication with the right stakeholders will pave the way to success.

Resource Planning

Resource planning is inherently a part of the overall planning. We give particular emphasis to resource planning to ensure there is an understanding of priorities while working with customer teams that may be involved in other day-to-day activities. When a timeline expectation is set, it can only be met when the resources involved in tasks have the cycles to get the work done.

Data Driven Testing

Investing in a test process that is data driven is vital. To get a data-driven test process, engage technical and business stakeholders early in the process; it will pay off. IAM systems change data in customers’ endpoint systems. To avoid surprises, tests should be executed on non-production systems and the expected changes to the data must be diligently validated. It is not enough to treat the absence of an error message during a test cycle as a success. Nor is it okay to merely confirm the expected changes. It is essential to validate all changes to ensure side effects do not introduce additional unexpected changes.
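
One way to validate all changes, and not only the expected ones, is to export the endpoint data before and after a test and flag every difference that was not planned. The sketch below assumes the exports are plain text files with one record per line; the function name and the format of the expected-changes file (the `<` and `>` lines that `diff` would print) are our own conventions, not a feature of any IAM product.

```shell
#!/bin/sh
# unexpected_changes BEFORE AFTER EXPECTED
# BEFORE/AFTER: endpoint exports taken around a test, one record per line.
# EXPECTED: the planned changes, written as diff lines:
#   "< old record" for a removed or replaced record, "> new record" for an added one.
# Prints every actual change that is not in EXPECTED.
# Returns 0 when there are no unexpected changes, 1 otherwise.
unexpected_changes() (
    work=$(mktemp -d) || exit 2
    sort "$1" > "$work/before"
    sort "$2" > "$work/after"
    # Keep only the changed records from the diff output.
    diff "$work/before" "$work/after" | grep '^[<>] ' > "$work/actual" || true
    # -F fixed strings, -x whole-line match, -v keep lines NOT listed as expected
    if grep -Fvx -f "$3" "$work/actual"; then found=1; else found=0; fi
    rm -rf "$work"
    [ "$found" -eq 0 ]
)
```

A test script can then fail the use-case whenever `unexpected_changes before.txt after.txt expected.txt` prints anything, even if the solution itself reported no error.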

Sign-off Process

A well-defined sign-off process for every stage of the project is also essential to success. It keeps the stakeholders engaged and informed in all phases of the project. A sign-off process should also include an understanding of how to keep moving forward in case of a stalemate. In an IAM project, we will face instances where some issue can cause delay. The evaluation of whether the problem is a show stopper for go-live must be done objectively. It is better to have an open discussion during the early phases of the project about the challenges the team is most likely going to face, and to agree on a process that can help move forward and enable focused sign-off towards a successful go-live.

Operational Expertise

IAM implementations are complex. To reap the IAM program benefits, a successful implementation is not enough. A skilled team that understands the execution from a business and technical perspective is required to ensure the continuity of excellence. If the client team is to be responsible for the upkeep and maintenance of the implementation, it is crucial they be engaged during all phases of the project. Understanding the implementation details will go a long way in tackling operational challenges.

Our team is here to help with every step to make your journey a successful one. Even if you are not working with us directly, we hope the article provides a blueprint towards a successful IAM program execution.