rfc:array_unpacking_string_keys

PHP RFC: Array unpacking with string keys

Introduction

PHP 7.4 added support for unpacking inside arrays through the spread_operator_for_array RFC. At the time, unpacking of string keys was prohibited, both due to uncertainty regarding the semantics, and because the same limitation existed for argument unpacking at the time. In the meantime, the argument unpacking limitation has been removed by the introduction of named arguments. This RFC proposes to permit unpacking of string keys into arrays as well.

Proposal

There are multiple possible semantics for how string keys can be handled during unpacking. This RFC proposes to follow the semantics of the array_merge() function:

$array = [...$array1, ...$array2];
// Approximately the same as:
$array = array_merge($array1, $array2);

In particular this means that later string keys overwrite earlier ones:

$array1 = ["a" => 1];
$array2 = ["a" => 2];
$array = ["a" => 0, ...$array1, ...$array2];
var_dump($array); // ["a" => 2]

In this case, the key "a" occurs three times and the last occurrence with value 2 wins.

The current behavior of integer keys is not affected. That is, integer keys continue to be renumbered:

$array1 = [1, 2, 3];
$array2 = [4, 5, 6];
$array = [...$array1, ...$array2];
var_dump($array); // [1, 2, 3, 4, 5, 6]
// Which is [0 => 1, 1 => 2, 2 => 3, 3 => 4, 4 => 5, 5 => 6]
// where the original integer keys have not been retained.

Keys that are neither integers nor strings continue to throw a TypeError. Such keys can only be generated by Traversables.

Traversables may also generate integral string keys, which are canonicalized to integer keys by arrays. Such keys will be treated the same way as integer keys and renumbered:

function gen() {
    yield "3" => 1;
    yield "2" => 2;
    yield "1" => 3;
}
var_dump([...gen()]); // [1, 2, 3]

Alternatives

There are two potential alternative behaviors that are not proposed by this RFC. The first is to follow the semantics of the + operator on arrays, which will pick the first value for a given key, rather than last:

$array1 = ["a" => 1];
$array2 = ["a" => 2];
var_dump($array1 + $array2); // ["a" => 1]

It should be noted that the + operator does this for integer keys as well. As such, even if these semantics were adopted, the ... operator would not behave identically to + for the case of integer keys.

The second alternative is to discard string keys entirely, that is to use the “renumbering” behavior for all keys, rather than just integer keys:

$array1 = ["a" => 1];
$array2 = ["a" => 2];
$array = [...$array1, ...$array2];
var_dump($array); // [1, 2]

The argument that has been made in favor of this, is that the + operator already provides a pure dictionary merge, array_merge() already provides a hybrid vector/dictionary merge, and this would be an opportunity to make ... a pure vector merge.

There are a two primary reasons why the proposed array_merge() semantics are used instead of either of these alternatives.

Conceptually, the unpacking operator ... can be viewed as equivalent to an array literal into which the unpacked values have been placed:

$array = [...[1, 2, 3], ...[4, 5, 6]];
// Is supposed to behave approximately like:
$array = [1, 2, 3, 4, 5, 6];
 
// Similarly, for string keys:
$array = [...["a" => 1], ...["a" => 2]];
// Is supposed to behave approximately like:
$array = ["a" => 1, "a" => 2];
// Which evaluates to:
$array = ["a" => 2];

While the analogy is not perfect (it breaks down for arrays with explicit integer keys), this does provide a good intuition of the proposed behavior, and also appears to coincide with the general intuition of how such a feature should behave.

The second reason is that since PHP 8.0 the argument unpacking syntax supports string keys and does not ignore them. Instead, string keys are mapped onto named arguments:

call(...["a" => 1]);
// Is equivalent to:
call(a: 1);
// Not:
call(1);

As such, a behavior that simply ignores string keys through renumbering would not be consistent with argument unpacking.

Backward Incompatible Changes

Unpacking of string keys in arrays no longer throws.

Vote

Voting started on 2021-01-25 and ended on 2021-02-08.

Allow string keys in array unpacking?
Real name Yes No
alcaeus (alcaeus)  
as (as)  
asgrim (asgrim)  
ashnazg (ashnazg)  
beberlei (beberlei)  
bishop (bishop)  
bmajdak (bmajdak)  
bwoebi (bwoebi)  
crell (crell)  
cschneid (cschneid)  
derick (derick)  
dharman (dharman)  
duncan3dc (duncan3dc)  
duodraco (duodraco)  
galvao (galvao)  
gasolwu (gasolwu)  
geekcom (geekcom)  
girgias (girgias)  
heiglandreas (heiglandreas)  
ilutov (ilutov)  
jasny (jasny)  
jhdxr (jhdxr)  
kalle (kalle)  
kguest (kguest)  
klaussilveira (klaussilveira)  
kocsismate (kocsismate)  
krakjoe (krakjoe)  
lcobucci (lcobucci)  
marandall (marandall)  
mariano (mariano)  
mauricio (mauricio)  
mbeccati (mbeccati)  
mcmic (mcmic)  
narf (narf)  
nicolasgrekas (nicolasgrekas)  
nikic (nikic)  
ocramius (ocramius)  
petk (petk)  
ramsey (ramsey)  
reywob (reywob)  
sebastian (sebastian)  
sergey (sergey)  
sirsnyder (sirsnyder)  
svpernova09 (svpernova09)  
tandre (tandre)  
thekid (thekid)  
theodorejb (theodorejb)  
trowski (trowski)  
villfa (villfa)  
wyrihaximus (wyrihaximus)  
Final result: 50 0
This poll has been closed.
rfc/array_unpacking_string_keys.txt · Last modified: 2021/02/09 09:55 by nikic