PHP RFC: List\unique() and Assoc\unique()
The array_unique() function allows getting all unique values of a given array. Unfortunately, PHP has multiple
definitions of equality and thus uniqueness. The most obvious one (i.e. $a === $b) is not supported by array_unique().
This RFC proposes adding two new functions, List\unique() and Assoc\unique(), as alternatives that use
strict equality (===) semantics, the former discarding and the latter preserving keys.
List\unique([1, 2, 3, 1, '2', 3.0, new Foo, ['bar']]) // > [1, 2, 3, '2', 3.0, Foo, ['bar']]
Assoc\unique(['foo' => 'foo', 'bar' => 'bar', 'baz' => 'foo']) // > ['foo' => 'foo', 'bar' => 'bar']
Two new functions are added to PHP:
List\unique(array $array): array
Assoc\unique(array $array): array
Both functions return a new array containing the unique values of the $array parameter. List\unique()
will return a list, meaning the array will have continuous keys, starting from 0. Assoc\unique() will reuse
the original array's keys instead.
Uniqueness is based on the strict equality operator (===). Any two values that are strictly equal are
considered duplicates and thus added only once to the resulting array. References are preserved.
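The semantics described above can be approximated in userland. The following is a minimal sketch, not the proposed implementation: the function names unique_assoc_strict() and unique_list_strict() are hypothetical stand-ins for Assoc\unique() and List\unique(), the lookup is a simple linear scan via in_array() in strict mode, and references are not preserved.

```php
<?php
// Hypothetical userland approximation of Assoc\unique():
// keeps the first occurrence of each value together with its original key.
function unique_assoc_strict(array $array): array
{
    $result = [];
    foreach ($array as $key => $value) {
        // Third argument true makes in_array() compare with ===.
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// Hypothetical userland approximation of List\unique():
// same uniqueness rule, but the keys are discarded and renumbered from 0.
function unique_list_strict(array $array): array
{
    return array_values(unique_assoc_strict($array));
}
```

With these helpers, unique_list_strict([1, 2, 3, 1, '2', 3.0]) keeps '2' and 3.0 because neither is strictly equal to 2 or 3, while unique_assoc_strict(['foo' => 'foo', 'bar' => 'bar', 'baz' => 'foo']) drops only the 'baz' entry.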
Removing duplicates from arrays is a common use case provided by many programming languages. PHP's array_unique()
has been there for ~23 years.
However, PHP has multiple definitions of equality, four in particular supported by array_unique():
SORT_STRING - Converts values to strings and compares with
SORT_REGULAR - Compares values directly with
SORT_NUMERIC - Converts values to doubles
SORT_LOCALE_STRING - Converts values to strings and compares with
array_unique() sorts the array to avoid comparing each value with every other value, which would
scale badly. For this reason, the second parameter $flags accepts the same SORT_* options as
the sort() function and friends.
None of these options support arrays and objects, and other primitive types are subject to subtle coercion issues. Additionally, coercion can lead to warnings that are likely undesirable, as the fact that these values are coerced for comparison is an implementation detail.
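The coercion described above can be seen with a few small calls. This is an illustrative sketch using only the existing array_unique() function; the variable names are arbitrary.

```php
<?php
// The default flag is SORT_STRING: every value is cast to string before comparing.
// 1, '1', 1.0 and true all become "1", so only the first value is kept.
$mixed = array_unique([1, '1', 1.0, true]);
var_dump($mixed === [0 => 1]); // bool(true)

// null, '' and false all become the empty string.
$empty = array_unique([null, '', false]);
var_dump($empty === [0 => null]); // bool(true)

// SORT_NUMERIC compares values as doubles, so '10', '1e1' and 10 are all 10.0.
$numeric = array_unique(['10', '1e1', 10], SORT_NUMERIC);
var_dump($numeric === [0 => '10']); // bool(true)
```

In each case values that are clearly distinct under === are merged, which is the behavior the proposed functions are meant to avoid.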
A common issue with many array functions is that they make no distinction between lists and associative arrays. Thus, it is often unclear whether a function should discard or preserve keys. This is made evident by how the functions are used in user code, where keys are only considered once some issue arises. This RFC proposes adding two separate functions specifically to force users to make a deliberate choice between the two, rather than doing so only after encountering issues with array keys.
The new functions use a temporary hashmap internally. The array is iterated and each value is looked up in the hashmap. If
the value has not been seen before, it is recorded in the hashmap and added to the resulting array. If it has been seen, the value is
skipped. These new functions have a time complexity of O(n), whereas array_unique() has a time complexity of O(n log n)
(with the exception of SORT_STRING which also has O(n)).
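The hashmap approach can be sketched in userland as follows. This is an assumption-laden illustration, not the internal implementation: identity_key() and unique_list_hashed() are hypothetical names, the sketch requires PHP 8.0 or later, it supports only null, booleans, integers, floats, strings and objects (arrays are rejected), and it treats all NAN values as one key even though NAN === NAN is false.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: maps a value to a string key so that, for the
// supported types, two values share a key when they are identical (===).
function identity_key(mixed $value): string
{
    return match (true) {
        is_object($value) => 'object:' . spl_object_id($value),
        // Adding 0.0 turns -0.0 into 0.0, matching 0.0 === -0.0.
        is_float($value)  => 'double:' . var_export($value + 0.0, true),
        is_string($value) => 'string:' . $value,
        is_int($value)    => 'int:' . $value,
        is_bool($value)   => 'bool:' . ($value ? '1' : '0'),
        is_null($value)   => 'null',
        default => throw new InvalidArgumentException('Type not supported by this sketch'),
    };
}

// Hypothetical list-returning variant: one pass over the input,
// with a lookup table of keys that have already been seen.
function unique_list_hashed(array $array): array
{
    $seen = [];
    $result = [];
    foreach ($array as $value) {
        $key = identity_key($value);
        if (!isset($seen[$key])) {
            $seen[$key] = true;   // remember the value
            $result[] = $value;   // first occurrence goes to the result
        }
    }
    return $result;
}
```

Each value costs one key computation and one hash lookup, so the loop is linear in the number of elements on average, in contrast to the sort-then-compare strategy.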
Backward Incompatible Changes
There are no backwards-incompatible changes in this RFC.
Previously, adding a new ARRAY_UNIQUE_IDENTICAL constant that can be passed to the $flags
parameter of array_unique() was discussed. The discussion revealed that most people would prefer a new function over extending
array_unique() with a flag that might be more difficult to discover.
Voting opened on xxxx-xx-xx and closes on xxxx-xx-xx.