Are Copy-n-Paste operations impacting your Identity & Governance solutions?

Microsoft Office Suite’s Autocorrect: How Character Replacements Impact Identity and Governance Solutions => Garbage-In-Garbage-Out (GIGO)

When thinking about identity and governance solutions, many of us consider factors such as password security, multi-factor authentication, or access control. Rarely do we contemplate the subtle implications of character replacements in our word processing software. However, the Autocorrect feature in the Microsoft Office Suite, while intended to enhance the user experience, introduces risk into the copy-paste process, especially for characters like the dash and quotes. Let’s look at the two (2) most common replacements and their potential impact.

A Common Scenario:

Automated emails from ticket systems are forwarded to administrators or users. To be efficient and avoid mistyped characters, these admins/users may then copy-n-paste strings from the email (or an MS Word document) into an identity / governance solution. The pasted values could be used to provision access by business role name or to kick off a governance campaign search.

Dash vs. Emdash: What’s the Big Deal?

Microsoft Word (and other programs within the Office Suite) has a habit of automatically converting the standard dash (-), also known as the hyphen, to an emdash (—) when it assumes the user is attempting to create a longer break in the sentence. On the surface, this appears to be a simple formatting choice. Yet when you copy content containing these characters and paste it into identity or governance platforms, unexpected issues may arise. This “emdash” decision appears to follow British style formatting per this reference.

Identity systems often depend on exact character matching for elements like usernames, role names, domain names, or system strings. For instance, if a user is instructed to input “” but inadvertently pastes “domain—” (with an emdash), the system will not recognize the latter as a valid entry. This can lead to failed authentication attempts, locked accounts, and potential security concerns as users and admins scramble to correct the discrepancies. In the worst case, the identity/governance solution accepts the special characters because it uses UTF-8 or another Unicode character set, but the underlying IG/IM database still uses an older ASCII-based character set that does not recognize them. If this occurs, a data clean-up operation is typically needed by the IM/IG/DBA teams.
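The exact-match failure described above can be sketched in a few lines of Python. This is only an illustration: the role name "Finance-Admins" and the helper find_non_ascii are hypothetical and not taken from any specific identity product.

```python
import unicodedata


def find_non_ascii(value: str) -> list[tuple[int, str, str]]:
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(value)
        if ord(ch) > 127
    ]


expected = "Finance-Admins"       # hypothetical role name typed with a plain hyphen
pasted = "Finance\u2014Admins"    # same name after an em dash (U+2014) slipped in

print(expected == pasted)         # False: the two strings differ by one character
for position, _, name in find_non_ascii(pasted):
    print(position, name)         # 7 EM DASH
```

The two strings look nearly identical on screen, which is why the mismatch is easy to miss until a lookup fails.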

The Smart Quotes Dilemma

Similarly, Microsoft’s Autocorrect feature replaces standard double quotes (") with smart quotes (“ ”) for a more visually appealing look in documents. While they may enhance the aesthetic feel of a document, smart quotes can wreak havoc in systems expecting the simpler ASCII version.

Code or a script that depends on specific string matching will fail if smart quotes are used instead of standard quotes. This can lead to malfunctioning applications, scripts, or integrations when developers or administrators copy and paste content from Office documents directly into configuration files or codebases.
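As a small illustration of this failure mode, the Python sketch below parses the same JSON snippet twice, once with straight quotes and once with smart quotes. The configuration content and the helper try_parse are hypothetical; the point is only that a standard parser rejects the smart-quoted version.

```python
import json

straight = '{"role": "Finance-Admins"}'
# Same text after autocorrect replaced the quotes with U+201C / U+201D.
smart = "{\u201crole\u201d: \u201cFinance-Admins\u201d}"


def try_parse(text: str) -> str:
    """Report whether the text is valid JSON."""
    try:
        json.loads(text)
        return "parsed"
    except json.JSONDecodeError as exc:
        return f"failed: {exc.msg}"


print(try_parse(straight))  # parsed
print(try_parse(smart))     # failed, with the parser's error message
```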

Governance Solutions and Data Integrity

In governance solutions, consistency and data integrity are of the utmost importance. Consider a scenario where policy documents or terms of use agreements are drafted in Word. Any auto-replaced characters might be unintentionally added to official records or database entries. When such documents are parsed or processed by automated systems, unexpected behaviors might occur due to these seemingly innocuous character changes.

Recommendations and Best Practices:

  1. Awareness: Ensure that your team is aware of these auto-corrections. Training sessions or instructional guides can be used to inform users about these pitfalls.
  2. Disable Autocorrect: If you frequently copy and paste between the Office Suite and other platforms, consider disabling the autocorrect features for these two (2) common replacements (dash/quotes). See the screenshots below for how to disable these two (2) features in MS Outlook, MS Word, and MS PowerPoint. Fortunately, we do not have to modify MS Excel. For a global update, companies may wish to revisit their patch process and update the Windows registry to change this autocorrect behavior for all users.
  3. Post-Copy Verification: After pasting content, always double-check critical characters to ensure they have not been auto-replaced. It may be necessary to incorporate policy verification rules to prevent entry of these two (2) common replacement characters, e.g. PX Policy UI data verification rules.
  4. Use Plain Text Editors: When dealing with sensitive or system-related information, use plain text editors like Notepad, Notepad++ or VSCode to avoid any auto-formatting.
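The post-copy verification idea can be sketched as a small normalize-or-reject step in Python. This is a minimal sketch under stated assumptions: the function names normalize_pasted and validate_name are hypothetical, the mapping covers only the dash and quote replacements discussed here, and it is not the rule syntax of any particular policy engine.

```python
# Common Office autocorrect replacements mapped back to ASCII.
REPLACEMENTS = str.maketrans({
    "\u2014": "-",   # em dash
    "\u2013": "-",   # en dash
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
})


def normalize_pasted(value: str) -> str:
    """Replace autocorrected dashes and quotes with their ASCII equivalents."""
    return value.translate(REPLACEMENTS)


def validate_name(value: str) -> None:
    """Raise ValueError if the value still contains any non-ASCII character."""
    bad = sorted({ch for ch in value if ord(ch) > 127})
    if bad:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in bad)
        raise ValueError(f"non-ASCII characters found: {codes}")


cleaned = normalize_pasted("Finance\u2014Admins")
validate_name(cleaned)   # passes silently
print(cleaned)           # Finance-Admins
```

Rejecting the input (validate_name) is the safer choice when the original characters cannot be known for certain; for example, an em dash may have replaced two hyphens rather than one. Note that a strict ASCII check would also reject legitimate names containing accented characters, so the rule should be adjusted to what your environment actually allows.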

Location of auto-correction of dash (-) to emdash (—) & quotes in MS Outlook

Location of auto-correction of dash (-) to emdash (—) & quotes in MS Word

Location of auto-correction of dash (-) to emdash (—) & quotes in MS PowerPoint

Fortunately, we do NOT have this issue in MS Excel for the two (2) characters we are reviewing in this blog.

An impact of copy-n-paste:

For example, if you are using an Oracle database and see upside-down question mark characters (¿) in your data sets, this is a strong indicator that the database is substituting special characters that it does not recognize. The example below shows what happens when users/administrators use copy-n-paste operations to create new IM/IG objects: the objects are not returned by later searches, because the stored names no longer match what was entered the first time.

If the database was created with a default character set that cannot store these characters, correcting this will not be simple, as the DBAs must make a major change that will require an outage window. The DBAs may also need to be involved in the data clean-up or replacement exercise to correct the malformed entries.
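The lossy substitution can be imitated in Python as a rough analogy. It is not how Oracle performs the conversion: Python's ASCII encoder substitutes a plain question mark rather than the upside-down one. The role name and the helper survives_charset are hypothetical.

```python
def survives_charset(value: str, charset: str) -> bool:
    """Return True if every character in value can be encoded in charset."""
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


pasted = "Finance\u2014Admins"   # hypothetical name containing an em dash

# Simulate storage in a character set that lacks the em dash.
stored = pasted.encode("ascii", errors="replace").decode("ascii")

print(stored)                              # Finance?Admins
print(stored == pasted)                    # False: a later search no longer matches
print(survives_charset(pasted, "ascii"))   # False
print(survives_charset(pasted, "utf-8"))   # True
```

A check like survives_charset, run against the database's actual character set before a value is saved, is one way to catch the problem at entry time instead of during a clean-up exercise.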


The Microsoft Office Suite’s Autocorrect feature demonstrates how even well-intentioned, user-friendly functionalities can introduce unforeseen challenges. For those operating in the realm of identity and governance, an awareness of these issues is essential. It’s a testament to the intricate nature of modern software environments, where even the simplest character can have significant implications. Confirm that your identity access / governance solution and its underlying database use matching character sets across the whole solution stack.