
PHP RFC: Add is_list()

Introduction

PHP's array data type is rare in that it supports both integer and string keys and guarantees iteration order. While it is possible to efficiently check that something is an array, that array may be an associative array, have missing array offsets, or contain out-of-order keys. It can be useful to verify the assumption that array keys are consecutive integers starting at 0, both for data being passed into a module and for data being returned by a module. In serializers, it may also be useful to have an efficient check to distinguish lists from associative arrays. For example, json_encode() does this when deciding whether to serialize a value as an array such as [0,1,2] or as an object such as {"0":0,"2":1,"1":1} for an array whose keys are out of order.
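As a small illustration of the json_encode() behavior described above, the following sketch encodes three arrays. The variable names are only for illustration.

```php
<?php
// Keys are 0, 1, 2 in order, so json_encode() emits a JSON array.
$list = [0, 1, 2];
echo json_encode($list), "\n";        // [0,1,2]

// Integer keys that are out of order: emitted as a JSON object.
$outOfOrder = [0 => 0, 2 => 1, 1 => 1];
echo json_encode($outOfOrder), "\n";  // {"0":0,"2":1,"1":1}

// A missing offset (no key 1) also produces a JSON object.
$withGap = [0 => 'a', 2 => 'c'];
echo json_encode($withGap), "\n";     // {"0":"a","2":"c"}
```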

Proposal

Add a new function is_list(mixed $value): bool that will return true if the type of $value is array and the array keys are 0 .. count($value)-1 in that order. Otherwise, it returns false.

This RFC doesn't change PHP's type system and doesn't add new type hints.

The functionality is equivalent to the below polyfill, but a native implementation can quickly return true by checking HT_IS_PACKED(array) && HT_IS_WITHOUT_HOLES(array), which is what json_encode() already uses.

function is_list(mixed $value): bool {
    if (!is_array($value)) { return false; }
 
    $j = 0;
    foreach ($value as $i => $_) {
        if ($i !== $j) { return false; }
        $j++;
    }
    return true;
}
 
$x = [1 => 'a', 0 => 'b'];
var_export(is_list($x));  // false because keys are out of order
unset($x[1]);
var_export(is_list($x));  // true

Note that there are pitfalls in writing a shorter polyfill. For example, array_values($array) === $array would be false for some ways of building the array [0 => NAN] because NAN !== NAN (Not A Number), and array_keys($array) === range(0, count($array) - 1) is wrong for the empty array because range(0, -1) returns [0, -1] rather than an empty array.
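The two pitfalls can be sketched as follows. The helper names is_list_via_values() and is_list_via_range() are hypothetical and exist only for this comparison. The polyfill is repeated so that the snippet runs on its own. Whether the array_values() shortcut fails for a NAN element depends on how the array was built, so no output is claimed for that call.

```php
<?php
// Polyfill from this RFC, repeated here so the example is self-contained.
if (!function_exists('is_list')) {
    function is_list(mixed $value): bool {
        if (!is_array($value)) { return false; }
        $j = 0;
        foreach ($value as $i => $_) {
            if ($i !== $j) { return false; }
            $j++;
        }
        return true;
    }
}

// Hypothetical shortcut 1: compare the array against array_values().
function is_list_via_values(array $a): bool {
    return array_values($a) === $a;
}

// Hypothetical shortcut 2: compare the keys against range().
function is_list_via_range(array $a): bool {
    return array_keys($a) === range(0, count($a) - 1);
}

// One way of ending up with [0 => NAN] without using a plain list literal.
$nan = [1 => 'x', 0 => NAN];
unset($nan[1]);

var_dump(is_list($nan));            // bool(true): the only key is 0
var_dump(is_list_via_values($nan)); // may be false, since NAN !== NAN

var_dump(is_list([]));              // bool(true)
var_dump(is_list_via_range([]));    // bool(false): range(0, -1) is [0, -1]
```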

Example Use Cases

  1. Throwing or warning in a library, framework, or API if the passed in value is not a list with elements in order. For example, array_filter($list) preserves the original keys, so its result may have gaps and no longer be a list.
  2. Serializers written in PHP, or other use cases that benefit from validating that data conforms to an expected format.
  3. Warning about the use of named arguments with varargs (function example(...$args) {}) in code that does not expect named arguments.
  4. Having an efficient, correct, and understandable way to check that an array is actually a list that doesn't have the pitfalls mentioned earlier.
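The first and third use cases might look like the sketch below. The functions require_list() and collect_args() are hypothetical names used only for illustration, and the polyfill is repeated so that the snippet runs on its own.

```php
<?php
// Polyfill from this RFC, repeated here so the example is self-contained.
if (!function_exists('is_list')) {
    function is_list(mixed $value): bool {
        if (!is_array($value)) { return false; }
        $j = 0;
        foreach ($value as $i => $_) {
            if ($i !== $j) { return false; }
            $j++;
        }
        return true;
    }
}

// Hypothetical library guard: reject anything that is not a list.
function require_list(mixed $value): array {
    if (!is_list($value)) {
        throw new InvalidArgumentException('Expected a list with keys 0..n-1');
    }
    return $value;
}

// array_filter() preserves keys, so removing the 0 leaves a gap at key 1.
$filtered = array_filter([1, 0, 2]);          // [0 => 1, 2 => 2]
var_dump(is_list($filtered));                 // bool(false)
var_dump(is_list(array_values($filtered)));   // bool(true) after reindexing

// Hypothetical variadic function: named arguments become string keys.
function collect_args(...$args): array {
    return $args;
}
var_dump(is_list(collect_args(1, 2)));        // bool(true)
var_dump(is_list(collect_args(1, name: 2)));  // bool(false): key 'name'
```

Calling require_list($filtered) would throw an InvalidArgumentException, while require_list(array_values($filtered)) would return the reindexed list.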

Proposed PHP Version(s)

8.1

RFC Impact

To Opcache

Opcache's architecture does not change because the type system is unchanged; optimizations of is_list() can easily be added or removed.

In the RFC's implementation, opcache evaluates the call is_list(arg) to a constant if the argument is a constant value.

Long-term, if this sees wide enough adoption to affect performance on widely used apps or frameworks, opcache's contributors will have the option of adding additional checks to make opcache infer that is_list() being true implies that the argument is an array, and that the keys of the array are integers.

(Currently, Opcache only optimizes type checks that are converted to type check opcodes, such as is_resource() and is_array(). Opcache doesn't do anything similar for functions that are compiled to regular function calls, such as is_numeric(), so the implementation of is_list() included with this RFC does not do this.)

Proposed Voting Choices

Yes/No, requiring 2/3 majority

References

Rejected Features

Making the signature array_is_list(array $value): bool was rejected because it would lead to much more verbose code such as is_array($value) && array_is_list($value) and more frequent TypeErrors for null/false. Similar to is_numeric() and is_callable(), is_list() returns false instead of throwing an error for types that can't possibly be lists.

This deliberately only returns true for arrays with sequential keys and a start offset of 0. It returns false for [1=>'first', 2=>'second'].
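The difference between the two signatures can be sketched as follows. Here strict_is_list() is a hypothetical stand-in for the rejected array-typed signature, not a function from this RFC, and the polyfill is repeated so that the snippet runs on its own.

```php
<?php
// Polyfill from this RFC, repeated here so the example is self-contained.
if (!function_exists('is_list')) {
    function is_list(mixed $value): bool {
        if (!is_array($value)) { return false; }
        $j = 0;
        foreach ($value as $i => $_) {
            if ($i !== $j) { return false; }
            $j++;
        }
        return true;
    }
}

// Hypothetical stand-in for the rejected signature taking array only.
function strict_is_list(array $value): bool {
    return is_list($value);
}

// The proposed signature returns false for non-arrays.
var_dump(is_list(null));                             // bool(false)
var_dump(is_list('abc'));                            // bool(false)

// The array-typed variant throws for the same input.
try {
    strict_is_list(null);
} catch (TypeError $e) {
    echo "TypeError\n";
}

// Keys must start at 0, so a 1-based array is not a list.
var_dump(is_list([1 => 'first', 2 => 'second']));    // bool(false)
var_dump(is_list(['first', 'second']));              // bool(true)
```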

This does not attempt to change PHP's type system. Static analyzers may benefit from inferring key types from is_list() conditionals seen in code, giving more accurate information about array keys. https://docs.hhvm.com/hack/built-in-types/arrays may be of interest to anyone looking for stricter alternatives to PHP's arrays.

rfc/is_list.1608419234.txt.gz · Last modified: 2020/12/19 23:07 by tandre