PHP RFC: Normalize array's "auto-increment" value on copy on write

Introduction

If two arrays are equal/identical, they should remain equal/identical arrays after the same array_push(..., $val) call is executed on both of them:

assert($array1 === $array2); // identical/equal
$array1[] = $array2[] = 123;
assert($array1 === $array2); // still identical/equal

This is currently not guaranteed and, given how PHP arrays work, the property cannot always be enforced. It should be enforced, however, in at least some dangerous cases.

When an array is assigned to a new variable and is then copied just before a modification (copy on write), the copy is identical to the original in every respect. In particular, the copy also carries over the “auto-increment” value, that is, the index the next pushed element will receive:

$array1 = [0, 1, 2];
unset($array1[1], $array1[2]);
 
$array2 = $array1;
assert($array2 === [0]);
$array2[] = "push"; // triggers COW and then pushes the new entry
 
print_r($array2);
// Array
// (
//     [0] => 0
//     [3] => push
// )

Unfortunately, this behavior also crosses scope boundaries. Our code can receive “broken” array-lists from third parties: arrays that appear to be well-indexed lists but are not, because they were misused during their lifetime (for example, unset() was used on them instead of array_pop()).

As a result, despite copy on write, value-type semantics, and even a different scope, the following assertion can fail in some cases:

function test(array $array){
    if($array === [0, 1, 2]){
        $array[] = 3;
        assert($array === [0, 1, 2, 3]);
    }
}
 
// For example:
$poison = [0, 1, 2, 3];
unset($poison[3]);
test($poison);
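
The difference between the two removal styles mentioned above can be seen in isolation. The following is a minimal sketch of current behavior; the variable names are illustrative only.

```php
<?php
// Remove the last element in two different ways, then append a new one.

$viaUnset = [0, 1, 2];
unset($viaUnset[2]);   // removes the entry, but the next index stays at 3
$viaUnset[] = 'push';

$viaPop = [0, 1, 2];
array_pop($viaPop);    // removes the entry and lowers the next index to 2
$viaPop[] = 'push';

print_r(array_keys($viaUnset)); // keys 0, 1, 3
print_r(array_keys($viaPop));   // keys 0, 1, 2
```

Both arrays compare identical to [0, 1] right before the append, which is why a receiving scope cannot tell them apart.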

Proposal

This RFC proposes resetting the “auto-increment” value in copies triggered by “copy on write”, in order to guarantee deterministic behavior, especially for foreign scopes. The “auto-increment” value of the copy must equal the “auto-increment” value that the array would have if it were re-created entry by entry, as follows:

$array_copy = [];
foreach($array as $key => $value){
    $array_copy[$key] = $value;
}
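
To make the intended equivalence concrete, the loop above can be wrapped in a function and compared with an ordinary copy. This is only a sketch: rebuild_entry_by_entry() is a hypothetical userland helper, not part of PHP or of this RFC's implementation, and it merely emulates the value the proposal describes.

```php
<?php
// Hypothetical helper (an assumption for illustration): rebuilds an array
// entry by entry, so the next index depends only on the keys still present.
function rebuild_entry_by_entry(array $array): array
{
    $copy = [];
    foreach ($array as $key => $value) {
        $copy[$key] = $value;
    }
    return $copy;
}

$array = [0, 1, 2, 3];
unset($array[3], $array[2]);  // $array === [0, 1], but its next index is still 4

$plainCopy = $array;          // ordinary by-value copy (current behavior)
$plainCopy[] = 'push';        // appended under key 4

$rebuilt = rebuild_entry_by_entry($array);
$rebuilt[] = 'push';          // appended under key 2

print_r(array_keys($plainCopy)); // keys 0, 1, 4
print_r(array_keys($rebuilt));   // keys 0, 1, 2
```

Under the proposal, the plain copy would be expected to behave like the rebuilt one without any helper.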

The reset is not limited to new function scopes; it applies to any new by-value reference:

$array = [0, 1, 2, 3];
unset($array[3], $array[2]);
$arrayCopy = $array;
$arrayCopy[] = 2;
assert($arrayCopy === [0, 1, 2]); // this assertion must pass; it doesn't currently

Backward Incompatible Changes

This change is not backward compatible; code that relies on the next element index being preserved across copy-on-write copies will break. However, the proposed change should be considered a bug fix rather than a behavior change: it offers protection against array-lists that were misused with unset() instead of array_pop(), array_splice(), or array_shift().
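
As an illustration of the kind of code that would be affected, the sketch below depends on the next index surviving a copy. The array contents and variable names are made up for the example, and the remark about the result under this RFC restates the proposal rather than tested behavior.

```php
<?php
$slots = [10 => 'a', 11 => 'b'];
unset($slots[11]);          // the entry is gone, but the next index stays at 12

$copy = $slots;             // by-value copy
$copy[] = 'c';              // copy on write, then append

// Currently the new entry gets key 12.
// Under this RFC it would be expected to get key 11 instead.
print_r(array_keys($copy));
```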

Proposed PHP Version(s)

Next PHP minor version

Proposed Voting Choices

The vote will require a 2/3 majority.

rfc/normalize-array-auto-increment-on-copy-on-write.1560984053.txt.gz · Last modified: 2019/06/19 22:40 by wesnetmo