rfc:unserialize_warn_on_trailing_data

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:unserialize_warn_on_trailing_data [2023/03/19 22:26] timwollarfc:unserialize_warn_on_trailing_data [2023/05/01 17:08] (current) – Formatting timwolla
Line 1: Line 1:
-====== PHP RFC: Make unserialize() emit a warning for trailing data ======+====== PHP RFC: Make unserialize() emit a warning for trailing bytes ======
   * Version: 1.0   * Version: 1.0
-  * Date: 2022-10-31+  * Date: 2023-03-27
   * Author: Tim Düsterhus, timwolla@php.net   * Author: Tim Düsterhus, timwolla@php.net
-  * Status: Draft+  * Status: Implemented 
 +  * Target Version: PHP 8.3 
 +  * Implementation: https://github.com/php/php-src/commit/bf727cf5e206f567bc07fa1b5f331f2f269a6894
   * First Published at: https://wiki.php.net/rfc/unserialize_warn_on_trailing_data   * First Published at: https://wiki.php.net/rfc/unserialize_warn_on_trailing_data
  
 ===== Introduction ===== ===== Introduction =====
  
-PHP silently accepts trailing data after a valid serialization payload when unserializing. This is undesirable.+PHP silently accepts trailing bytes after a valid serialization payload when unserializing. This is undesirable.
  
 Once PHP’s unserialization parser finds the trailing delimiter of the serialized value (`'';''` for scalars and `''}''` for arrays and objects), parsing terminates, the serialized value is returned, and any additional input bytes are silently ignored. Once PHP’s unserialization parser finds the trailing delimiter of the serialized value (`'';''` for scalars and `''}''` for arrays and objects), parsing terminates, the serialized value is returned, and any additional input bytes are silently ignored.
  
-As <php>serialize</php> returns a valid serialization payload without trailing data, having trailing data within the input to <php>unserialize</php> implies that the serialization data was erroneously modified after-the-fact. Silently accepting such serialization data therefore is likely to mask (serious) issues within the application’s handling of (serialized) data and is therefore undesirable.+As <php>serialize</php> returns a valid serialization payload without trailing bytes, having trailing bytes within the input to <php>unserialize</php> implies that the serialization payload was //erroneously modified after-the-fact//. Silently accepting such serialization payload therefore is likely to mask (serious) issues within the application’s handling of (serialized) data and is therefore undesirable.
  
-One such issue would be overwriting existing serialized data with shorter serialized data, but without properly truncating the existing data to the shorter length. The newly written data will properly unserialize, because it is well-formed and because trailing data is silently ignored. However the old data will still partly be visible within the storage location, possibly exposing sensitive information that was meant to be deleted.+One such issue would be overwriting existing serialized data with shorter serialization payload, but without properly truncating the existing data to the shorter length. The newly written payload will properly unserialize, because it is well-formed and because trailing bytes are silently ignored. However the old data will still partly be visible within the storage location, possibly exposing sensitive information that was meant to be deleted.
  
 The recent CVE-2023-21036 (“aCropalypse”) is an example of this issue. When cropping screenshots on Google Pixel smartphones, the original screenshot file was overwritten in-place, without truncating the file to the smaller size of the cropped screenshot. For affected screenshots it is possible to restore (parts) of the original pixel data from the trailing bytes, exposing the information that is intended to be cropped. The recent CVE-2023-21036 (“aCropalypse”) is an example of this issue. When cropping screenshots on Google Pixel smartphones, the original screenshot file was overwritten in-place, without truncating the file to the smaller size of the cropped screenshot. For affected screenshots it is possible to restore (parts) of the original pixel data from the trailing bytes, exposing the information that is intended to be cropped.
  
-According to [[https://github.com/php/php-src/pull/9630#issuecomment-1272157809|a comment on the GitHub PR implementing this RFC]], the “igbinary” serializer already rejects serialization data with trailing bytes. Such far there was only one issue report, where the warning exposed an issue within the application’s file handling (missing 'b' flag on fopen). While “igbinary” is likely used much more rarely compared to PHP’s native serializer, this indicates that warning for trailing data exposes real issues, which is a good thing.+According to [[https://github.com/php/php-src/pull/9630#issuecomment-1272157809|a comment on the GitHub PR implementing this RFC]], the “igbinary” serializer already rejects serialization data with trailing bytes. Such far there was only one issue report, where the warning exposed an issue within the application’s file handling (missing 'b' flag on fopen). While “igbinary” is likely used much more rarely compared to PHP’s native serializer, this indicates that warning for trailing bytes exposes real issues, which is a good thing.
  
-Silently accepting trailing data can also be confusing to the human reader, when needing to debug issues with serialized data. The human reader might wonder why some information that is part of the serialized data does not appear within the unserialized return value.+Silently accepting trailing bytes can also be confusing to the human reader, when needing to debug issues with serialized data. The human reader might wonder why some information that is part of the serialized payload does not appear within the unserialized return value.
  
-Furthermore emitting a warning for trailing bytes makes it easier to extend the serialization format in the future when giving the trailing bytes a meaning. An example might be appending a 4-byte checksum to the serialized data to detect accidental corruption. If such a checksum would be implemented, any serialized data with existing trailing bytes would change their meaning. In the case of a checksum the result would be a failed unserialization after a PHP upgrade. For other format changes, the consequences might be more severe. By emitting a warning for trailing bytes, developer would be able to detect the issue before it actually becomes an issue.+Furthermore emitting a warning for trailing bytes makes it easier to extend the serialization format in the future when giving the trailing bytes a meaning. Existing serialized data with trailing bytes might either change its meaning or fail to unserialize. By emitting a warning for trailing bytes, the developer would be able to detect the issue before it actually becomes an issue.
  
 ===== Proposal ===== ===== Proposal =====
Line 100: Line 102:
 ===== Future Scope ===== ===== Future Scope =====
  
-  * Make this an error.+  * Make this an exception.
  
 ===== Proposed Voting Choices ===== ===== Proposed Voting Choices =====
  
-<doodle title="Make unserialize() emit a E_WARNING if the input contains trailing data?" auth="timwolla" voteType="single" closed="false" closeon="2022-01-01T00:00:00Z">+<doodle title="Make unserialize() emit a E_WARNING if the input contains trailing bytes?" auth="timwolla" voteType="single" closed="true" closeon="2023-04-26T08:30:00Z">
    * Yes    * Yes
    * No    * No
Line 115: Line 117:
 ===== Implementation ===== ===== Implementation =====
  
-n/a+  * https://github.com/php/php-src/commit/bf727cf5e206f567bc07fa1b5f331f2f269a6894
  
 ===== References ===== ===== References =====
Line 123: Line 125:
 ===== Rejected Features ===== ===== Rejected Features =====
  
-n/a+  * Throw an Exception instead of emitting warning.
rfc/unserialize_warn_on_trailing_data.1679264809.txt.gz · Last modified: 2023/03/19 22:26 by timwolla