Homoglyph Attacks: Confusing Characters to Obfuscate Logic

March 10, 2025

What Are Homoglyphs?

Homoglyphs are characters from different alphabets or Unicode blocks that look visually similar or identical to common characters. For example, the Cyrillic 'о' looks just like the Latin 'o', but they're different characters with different Unicode values.
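To make the difference concrete, here is a minimal Python sketch that prints the code point and Unicode name of the two look-alike letters mentioned above. The variable names are only illustrative.

```python
import unicodedata

latin_o = "o"          # U+006F
cyrillic_o = "\u043e"  # renders like "o" in most fonts

for ch in (latin_o, cyrillic_o):
    # Print the code point and the official Unicode name of each character
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The two strings look the same on screen but do not compare equal
print(latin_o == cyrillic_o)
```

The comparison on the last line is what a compiler or interpreter sees: two unrelated characters, regardless of how they are drawn.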

In code, homoglyphs can be strategically substituted for operators, variable names, or other syntax elements, altering program behavior while maintaining a visually correct appearance.

Real-World Examples

The Not-Equal Operator Switch

Let's look at a classic example:

// What you think you see:
if (environment != ENV_PROD) {
  // Enable development features
  enableDevMode();
}

// What's actually in the code:
if (environmentǃ = ENV_PROD) {
  // This code ALWAYS runs!
  enableDevMode();
}

In the second example, the exclamation mark is actually Unicode character U+01C3 (LATIN LETTER RETROFLEX CLICK, "ǃ", a letter used to write click consonants), not the standard ASCII exclamation mark (U+0021). Because U+01C3 is a letter, it is valid inside an identifier. This turns what appears to be a comparison environment != ENV_PROD into an assignment to a different variable, environmentǃ = ENV_PROD, followed by a truthy check of that assignment's result.

Since the assignment returns the value of ENV_PROD (which is presumably truthy), the condition is always satisfied, enabling development features in all environments - potentially including production!
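The trick depends on U+01C3 being classified as a letter rather than punctuation. The sketch below checks that classification with Python's unicodedata module. Python is used here only for illustration; its identifier rules and JavaScript's are both built on Unicode identifier properties, but they are not identical, so treat this as a hint rather than proof of JavaScript's behavior.

```python
import unicodedata

click = "\u01c3"  # the look-alike used in the example above
bang = "!"        # ASCII exclamation mark, U+0021

# General category: "Lo" is a letter, "Po" is punctuation
print(unicodedata.category(click))
print(unicodedata.category(bang))

# A name ending in the click letter is still a valid identifier,
# while a name ending in a real exclamation mark is not
disguised = "environment" + click
print(disguised.isidentifier())
print(("environment" + bang).isidentifier())
```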

Variable Name Confusion

Consider this Python code:

username = get_authenticated_user()
usernаme = "admin"  # Notice anything?

if check_admin_privileges(username):
    grant_admin_access()

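The second variable uses a Cyrillic 'а' (U+0430) instead of Latin 'a' (U+0061). They look identical in most fonts, but they're different characters, so the second line creates a new variable rather than overwriting the first. In this snippet the privileges check still uses the legitimate username variable; the danger is that any later code that reads usernаme, whether planted by a malicious actor or pasted in by an unsuspecting developer, sees "admin" instead of the authenticated user.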
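You can confirm that Python treats the two spellings as separate names by executing a small piece of source text and listing what it defines. This is a sketch: the value "alice" stands in for whatever get_authenticated_user() would return, and the escape \u0430 is used so the Cyrillic letter is visible in the code.

```python
# Source text containing both spellings; \u0430 is CYRILLIC SMALL LETTER A.
source = 'username = "alice"\nusern\u0430me = "admin"\n'

namespace = {}
exec(source, namespace)

# Ignore entries such as __builtins__ that exec adds on its own
user_names = sorted(name for name in namespace if not name.startswith("__"))

for name in user_names:
    # Escape non-ASCII characters so the two names can be told apart
    print(name.encode("unicode_escape").decode("ascii"), "=", namespace[name])
```

Two entries come back, one per spelling, each holding its own value.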

Function Hijacking

function validatePassword(password) {
    // Legitimate security checks
    return password.length >= 8 && /[A-Z]/.test(password) && /[0-9]/.test(password);
}

function vаlidatePassword(password) {
    // Malicious backdoor function with Cyrillic 'а'
    return true;
}

// Later in the code:
if (vаlidatePassword(userInput)) {  // Using the backdoor function!
    grantAccess();
}

The second function uses a Cyrillic 'а' instead of a Latin 'a'. If a developer uses this function (perhaps through copy-paste or by mistake), it bypasses all security checks.
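One rough way to catch this kind of look-alike name is to flag identifiers whose letters come from more than one script. The sketch below is a heuristic, not a complete detector: it takes the first word of each character's Unicode name (such as LATIN or CYRILLIC) as the script, and the helper names scripts_in and mixed_script_identifiers are made up for this example.

```python
import re
import unicodedata

def scripts_in(identifier):
    """Return the first word of the Unicode name of each letter, as a set."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def mixed_script_identifiers(source):
    """Yield (word, scripts) for words whose letters span several scripts."""
    for word in sorted(set(re.findall(r"\w+", source))):
        found = scripts_in(word)
        if len(found) > 1:
            yield word, sorted(found)

sample = (
    "function validatePassword(p) {}\n"
    "function v\u0430lidatePassword(p) {}\n"
)
results = list(mixed_script_identifiers(sample))
for word, scripts in results:
    print(word.encode("unicode_escape").decode("ascii"), scripts)
```

A name written entirely in another script would pass this check, so it complements a plain non-ASCII scan rather than replacing it.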

Mathematical Operator Substitution

// What appears to be subtraction:
let total = price - discount;

// What's actually happening:
let total = price − discount;  // Using U+2212 (MINUS SIGN) instead of hyphen-minus

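JavaScript does not recognize U+2212 as an operator, so this line is a syntax error rather than a silent change in behavior. In other languages or contexts that accept the character (for example inside strings or data files), the difference could cause unexpected behavior.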
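How a look-alike operator behaves depends on the parser, and the quickest way to find out is to ask it. As an illustration, this sketch asks Python's own compiler whether it accepts each spelling of the subtraction; the helper name parses is hypothetical, and other languages may answer differently.

```python
def parses(source):
    """Return True if Python can compile the source text, False otherwise."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

ascii_version = "total = price - discount"          # U+002D HYPHEN-MINUS
lookalike_version = "total = price \u2212 discount"  # U+2212 MINUS SIGN

print(parses(ascii_version))
print(parses(lookalike_version))
```

Python rejects the second line as containing an invalid character, which at least makes the substitution loud instead of silent.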

Equals Operator Confusion

// What you think you're looking at:
if (userRole == "admin") {
    // Grant admin privileges
}

// What's actually happening:
if (userRole ⩵ "admin") {  // Using U+2A75 (TWO CONSECUTIVE EQUALS SIGNS)
    // U+2A75 is not a JavaScript operator, so this line does not parse
}

Common Homoglyphs in Programming

Here are some frequently used homoglyphs in malicious code:

Normal Character    Homoglyph    Unicode    Name
!                   ǃ            U+01C3     LATIN LETTER RETROFLEX CLICK
a                   а            U+0430     CYRILLIC SMALL LETTER A
o                   о            U+043E     CYRILLIC SMALL LETTER O
e                   е            U+0435     CYRILLIC SMALL LETTER IE
-                   −            U+2212     MINUS SIGN
+                   ＋           U+FF0B     FULLWIDTH PLUS SIGN
=                   ⩵            U+2A75     TWO CONSECUTIVE EQUALS SIGNS
/                   ／           U+FF0F     FULLWIDTH SOLIDUS
*                   ∗            U+2217     ASTERISK OPERATOR
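A table like this can double as a lookup for spotting collisions. The sketch below maps each look-alike from the table to its ASCII counterpart and checks whether two different strings become identical after replacement. The names CONFUSABLES, skeleton, and collide are invented for this example, and the mapping covers only the nine entries listed here, nowhere near the full set of confusable characters.

```python
# Look-alike -> ASCII counterpart, taken from the table above.
CONFUSABLES = {
    "\u01c3": "!",
    "\u0430": "a",
    "\u043e": "o",
    "\u0435": "e",
    "\u2212": "-",
    "\uff0b": "+",
    "\u2a75": "==",  # assumption: mapped to "==" because the glyph draws two equals signs
    "\uff0f": "/",
    "\u2217": "*",
}

def skeleton(text):
    """Replace each known look-alike with its ASCII counterpart."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in text)

def collide(a, b):
    """True if two different strings become identical after replacement."""
    return a != b and skeleton(a) == skeleton(b)

print(collide("validatePassword", "v\u0430lidatePassword"))
print(collide("validatePassword", "validateUser"))
```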

Defending Against Homoglyph Attacks

  1. Use linters and static analysis tools: Many modern tools can detect non-ASCII characters in source code.
  2. Enable syntax highlighting: Good syntax highlighting can make it easier to spot operators that aren't behaving as expected.
  3. Check your code checksums: If your source code suddenly has unexpected byte differences, it might indicate tampering.
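As a starting point for the first item, here is a minimal sketch of a scanner that reports every non-ASCII character in a piece of source text along with its position and Unicode name. The function name find_non_ascii is hypothetical, and a real project would probably want an allow-list, since legitimate non-ASCII text in strings and comments is flagged too.

```python
import unicodedata

def find_non_ascii(source):
    """Return (line, column, code point, name) for each non-ASCII character."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNNAMED")
                findings.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return findings

# The Python example from earlier, with the Cyrillic letter written as an escape
sample = 'username = get_authenticated_user()\nusern\u0430me = "admin"\n'

for finding in find_non_ascii(sample):
    print(finding)
```

Running it over the sample points at line 2, column 6, which is exactly where the Cyrillic letter sits.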

Homoglyph attacks are particularly interesting because they exploit human perception rather than technical vulnerabilities. They remind us that security isn't always about protecting against sophisticated attacks; sometimes the threat is hiding in plain sight.
