====== PHP RFC: Normalize arrays' "auto-increment" value on copy on write ======
* Version: 0.1
* Date: 2019-06-19
* Author: Wes (@WesNetmo)
* Status: Under discussion
* First Published at: http://wiki.php.net/rfc/normalize-array-auto-increment-on-copy-on-write
===== Introduction =====
If two ''array''s are equal/identical, they should remain equal/identical ''array''s after
the same ''array_push(..., $val)'' call is executed on both of them:
assert($array1 === $array2); // identical/equal
$array1[] = $array2[] = 123;
assert($array1 === $array2); // still identical/equal
This is currently not guaranteed, and because of ''array''s' all-doing nature, it is not
possible to always enforce this property -- but it should be in some dangerous cases,
namely when functions from (potential) different authors are interacting.
-----
When an ''array'' is assigned to a new reference, and it is copied, before a modification,
due to the copy-on-write behavior, it will result in an ''array'' that is identical in any
way to the original one, inclusive of its "auto-increment" value:
$array1 = [0, 1, 2];
unset($array1[1], $array1[2]);
$array2 = $array1;
assert($array2 === [0]);
$array2[] = "push"; // triggers COW and then pushes the new entry
print_r($array2);
// Array
// (
// [0] => 0
// [3] => push
// )
This happens also between different function scopes. Our functions can receive "broken"
''array''-lists from third-parties that only appear to be well-indexed, but that in
reality are not, because they were misused during their lifetime (classic example, it was
used ''unset($array[$lastIndex])'' on them, instead of ''array_pop($array)'').
As result of that, despite "copy on write", the value-type semantics, and even a different
scope, the following assertion can fail in some cases:
function test(array $array){
if($array === [0, 1, 2]){
$array[] = 3;
assert($array === [0, 1, 2, 3]);
}
}
// For example:
$poison = [0, 1, 2, 3];
unset($poison[3]);
test($poison);
===== Proposal =====
This RFC proposes to reset the "auto-increment" value in copies triggered by "copy on
write", in order to guarantee a deterministic behavior to foreign scopes especially. The
"auto-increment" value of the new variable reference must be equivalent to the
"auto-increment" value that the ''array'' would have if it was re-created entry by entry,
as follows:
$array_copy = [];
foreach($array as $key => $value){
$array_copy[$key] => $value;
}
The reset is not limited to new function scopes but any new by-value reference:
$array = [0, 1, 2, 3];
unset($array[3], $array[2]);
$arrayCopy = $array;
$arrayCopy[] = 2;
assert($arrayCopy === [0, 1, 2]); // this assertion must pass; it doesn't currently
===== Backward Incompatible Changes =====
This change is not backward compatible; code relying on the "auto-increment" value being
remembered between copies of copy-on-write will break. However, the proposed change should
be considered a bug-fix, rather than a behavior change; it offers protection against
''array''-lists that were misused with ''unset()'' instead of ''array_pop/_splice/_shift''
and thus will only affect code that is already a candidate for improvements. Furthermore,
the "auto-increment" value is copied inconsistently, when the ''array'' is empty:
$a = [0, 1];
unset($a[1]);
$b = $a;
$b[] = 2;
// $b is [0 => 0, 2 => 2]
$a = [0, 1];
unset($a[0], $a[1]);
$b = $a;
$b[] = 2;
// $b is [0 => 2], rather than [2 => 2]
The proposed change would make the behavior consistent and safer.
===== Proposed PHP Version(s) =====
7.4
===== Proposed Voting Choices =====
Vote will require 2/3 majority
===== References =====
* [[https://externals.io/message/105992|Pre-vote discussion on externals.io]]