This is an old revision of the document!
PHP RFC: Make unserialize() emit a warning for trailing data
- Version: 1.0
- Date: 2022-10-31
- Author: Tim Düsterhus, timwolla@php.net
- Status: Draft
- First Published at: https://wiki.php.net/rfc/unserialize_warn_on_trailing_data
Introduction
PHP silently accepts trailing data after a valid serialization payload when unserializing. This is undesirable.
Once PHP’s unserialization parser finds the trailing delimiter of the serialized value (`;
` for scalars and `}
` for arrays and objects), parsing terminates, the serialized value is returned, and any additional input bytes are silently ignored.
As serialize
returns a valid serialization payload without trailing data, having trailing data within the input to unserialize
implies that the serialization data was erroneously modified after-the-fact. Silently accepting such serialization data therefore is likely to mask (serious) issues within the application’s handling of (serialized) data and is therefore undesirable.
One such issue would be overwriting existing serialized data with shorter serialized data, but without properly truncating the existing data to the shorter length. The newly written data will properly unserialize, because it is well-formed and because trailing data is silently ignored. However the old data will still partly be visible within the storage, possibly exposing sensitive information that was meant to be deleted.
The recent CVE-2023-21036 (“aCropalypse”) is an example of this issue. When cropping screenshots on Google Pixel smartphones, the original screenshot file was overwritten in-place, without truncating the file to the smaller size of the cropped screenshot. For affected screenshots it is possible to restore (parts) of the original pixel data from the trailing bytes, exposing the information that is intended to be cropped.
According to a comment on the GitHub PR implementing this RFC, the “igbinary” serializer already rejects serialization data with trailing bytes. Such far there was only one issue report, where the warning exposed an issue within the application’s file handling (missing 'b' flag on fopen). While “igbinary” is likely used much more rarely compared to PHP’s native serializer, this indicates that warning for trailing data exposes real issues, which is a good thing.
Silently accepting trailing data can also be confusing to the human reader, when needing to debug issues with serialized data. The human reader might wonder why some information that is part of the serialized data does not appear within the unserialized return value.
Proposal
unserialize()
shall emit a new E_WARNING whenever the input string contains additional bytes once the unserialization parser terminates after successfully parsing a value. In other words: A warning shall be emitted if bytes can be removed from the end of the input string without changing the return value of unserialize()
.
Examples
<?php var_dump(unserialize('i:5;i:6;')); // Warning: unserialize(): Extra data starting at offset 4 of 8 bytes in %s on line %d // int(5) var_dump(unserialize('N;i:6;')); // Warning: unserialize(): Extra data starting at offset 2 of 6 bytes in %s on line %d // NULL var_dump(unserialize('b:1;i:6;')); // Warning: unserialize(): Extra data starting at offset 4 of 8 bytes in %s on line %d // bool(true) var_dump(unserialize('a:1:{s:3:"foo";b:1;}i:6;')); // Warning: unserialize(): Extra data starting at offset 20 of 24 bytes in %s on line %d // array(1) { // ["foo"]=> // bool(true) // }
Backward Incompatible Changes
A new warning will be emitted when the input to unserialize()
contains unconsumed bytes after successfully unserializing something.
For users that use a throwing error handler, this will result in unserialize()
failing for this type of modified input string. The documentation already states that passing strings that were not created by serialize()
must not be passed to unserialize()
, because doing so might introduce security issues.
Users that do not use a throwing error handler might just see the new warning.
Users might not notice anything at all if they suppress unserialize warnings (e.g. by calling @unserialize()
).
The new warning is the entire purpose of this RFC, thus the justification for the possible breaks is the reasoning given before.
Proposed PHP Version(s)
Next PHP 8.x (8.3).
RFC Impact
To SAPIs
None.
To Existing Extensions
The same impact as with userland code.
To Opcache
None.
New Constants
None.
php.ini Defaults
None.
Open Issues
None.
Unaffected PHP Functionality
Anything that isn’t unserialize()
or more generally related to serialization.
Future Scope
- Make this an error.
Proposed Voting Choices
Patches and Tests
Implementation
n/a
References
Rejected Features
n/a