rfc:is_list

Differences

This shows you the differences between two versions of the page.

rfc:is_list [2020/12/20 01:33] tandre
rfc:is_list [2021/01/20 23:51] (current) – end vote tandre
Line 1: Line 1:
-====== PHP RFC: Add is_list(mixed $value): bool ======
 +====== PHP RFC: Add array_is_list(array $array): bool ======
-  * Version: 0.1
 +  * Version: 0.3
   * Date: 2020-12-19
   * Author: Tyson Andre <tandre@php.net>
-  * Status: Under Discussion
 +  * Status: Implemented
   * Implementation: https://github.com/php/php-src/pull/6070
   * First Published at: https://wiki.php.net/rfc/is_list
Line 13: Line 13:
 ===== Proposal =====

-Add a new function ''is_list(mixed $value): bool'' that will return true if the type of ''$value'' is ''array'' and the array keys are ''0 .. count($value)-1'' in that order. Otherwise, it returns false.
 +Add a new function ''array_is_list(array $array): bool'' that will return true if the array keys are ''0 .. count($array)-1'' in that order. For other arrays, it returns false. For non-arrays, it throws a ''TypeError''.

 This RFC doesn't change PHP's type system and doesn't add new type hints.
Line 20: Line 20:
  
 <code php>
-function is_list(mixed $value): bool {
-    if (!is_array($value)) { return false; }
-
 +function array_is_list(array $array): bool {
     $expectedKey = 0;
-    foreach ($value as $i => $_) {
 +    foreach ($array as $i => $_) {
         if ($i !== $expectedKey) { return false; }
         $expectedKey++;
Line 32: Line 30:
  
 $x = [1 => 'a', 0 => 'b'];
-var_export(is_list($x));  // false because keys are out of order
 +var_export(array_is_list($x));  // false because keys are out of order
 unset($x[1]);
-var_export(is_list($x));  // true
 +var_export(array_is_list($x));  // true

 // Pitfalls of simpler polyfills - NAN !== NAN
Line 41: Line 39:
 var_export($x === array_values($x));  // false because NAN !== NAN
 var_export($x);  // array (0 => NAN)
-var_export(is_list($x));  // true because keys are consecutive integers starting from 0
 +var_export(array_is_list($x));  // true because keys are consecutive integers starting from 0
 +
 +array_is_list(new stdClass());  // throws a TypeError
 +array_is_list(null);  // throws a TypeError
 </code>
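
Another shortcut sometimes used in place of a key-by-key loop compares ''array_keys()'' against ''range()''. The sketch below is an illustration rather than part of the RFC; the helper names ''list_check_by_keys'' and ''list_check_by_range'' are hypothetical. It suggests that the ''range()'' comparison disagrees with the key loop for the empty array, because ''range(0, -1)'' counts downwards instead of returning an empty array.

<code php>
// Hypothetical helper: same loop as the reference implementation in this RFC.
function list_check_by_keys(array $array): bool {
    $expectedKey = 0;
    foreach ($array as $key => $_) {
        if ($key !== $expectedKey) { return false; }
        $expectedKey++;
    }
    return true;
}

// Hypothetical helper: a shortcut that compares the keys against range().
function list_check_by_range(array $array): bool {
    return array_keys($array) === range(0, count($array) - 1);
}

var_export(list_check_by_keys([]));           // true
var_export(list_check_by_range([]));          // false because range(0, -1) is [0, -1]
var_export(list_check_by_keys(['a', 'b']));   // true
var_export(list_check_by_range(['a', 'b']));  // true
</code>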
  
Line 57: Line 58:
  
  
-===== Proposed PHP Version(s) =====
 +===== Proposed PHP Version =====
 8.1
  
Line 64: Line 65:
 ==== To Opcache ====

-Opcache's architecture does not change because the type system is unchanged; optimizations of ''is_list()'' can easily be added or removed.
 +Opcache's architecture does not change because the type system is unchanged; optimizations of ''array_is_list()'' can easily be added or removed.

-In the RFC's implementation, opcache evaluates the call ''is_list(arg)'' to a constant if the argument is a constant value.
 +In the RFC's implementation, opcache evaluates the call ''array_is_list(arg)'' to a constant if the argument is a constant value and doesn't throw (same mechanism currently used for ''array_keys'', etc.).

-Long-term, if this sees wide enough adoption to affect performance on widely used apps or frameworks, opcache's contributors will have the option of adding additional checks to make opcache infer that ''is_list()'' being true implies that the argument is an array, and that the keys of the array are integers.
 +Long-term, if this sees wide enough adoption to affect performance on widely used apps or frameworks, opcache's contributors will have the option of adding additional checks to make opcache infer that ''array_is_list()'' being true implies that the keys of the array are integers.

-(Currently, Opcache only optimizes type checks that are converted to type check opcodes such as ''is_resource()'' and ''is_array()''. Opcache doesn't do anything similar for opcodes that become regular function calls such as ''is_numeric()'', so the implementation for ''is_list()'' included with this RFC does not do this.)
 +(Currently, Opcache only optimizes type checks that are converted to type check opcodes such as ''is_resource()'' and ''is_array()''. Opcache doesn't do anything similar for opcodes that become regular function calls such as ''is_numeric()'', so the implementation for ''array_is_list()'' included with this RFC does not do this.)
  
-===== Proposed Voting Choices =====
-Yes/No, requiring 2/3 majority
 +===== Discussion =====
 +
 +==== Possibility of naming conflicts with future vector-like types ==== 
 + 
 +Originally, this was called ''is_list'', but it was renamed because of possible naming conflicts with a future list type.
 + 
 +https://externals.io/message/112560#112565 
 + 
 +<blockquote> 
 +If we do eventually end up with list/vec types, would the naming here conflict at all?  Or would it cause confusion and name collision?  (Insert name bikeshedding here.) 
 +</blockquote> 
 + 
 +There's definitely the potential for naming conflicts if the type is called ''list'' 
 +but not if it's called ''vec''/''vector''/''varray'' similar to https://docs.hhvm.com/hack/built-in-types/arrays - I'd strongly prefer the latter if there was a viable implementation and it used sequential memory instead of a linked list. 
 + 
 +If the type is named ''list'' instead of ''vec'' and ends up incompatible with arrays, 
 +there'd need to be an ''is_list_type($val)'' or ''$val is list'' 
 +or some other new type check with a less preferable name. 
 +If it's compatible with arrays/lists 
 +(e.g. only checked during property assignment, passing in arguments, and returning values), then it wouldn't be an issue. 
 + 
 +- ''array_is_list(array $array)'' is consistent with many other ''array_*'' functions, which only accept arrays.
 +- It is very possible that we may end up using the word ''list'' anyway despite those objections, because it's already a reserved keyword in PHP for unrelated syntax (''list($first, $second) = $values''). Recently added types such as ''object'', ''void'', and ''iterable'' (and scalar types) were added in previous PHP versions despite not being reserved in the past. 
 +- The name ''vector'' may conflict with the php-ds PECL depending on how functionality is implemented. 
 + 
 +Providing objects with APIs similar to the external PECL https://www.php.net/manual/en/class.ds-vector.php and the SPL may be easier to adopt because it can be polyfilled, 
 +but there's the drawback that there aren't the memory savings from copy-on-write and that there's the performance overhead of method calls to offsetGet(), etc. 
 + 
 +As mentioned in [[https://wiki.php.net/rfc/is_list#changes_to_php_s_type_system|Changes to PHP's type system]], I'd expect the addition of a separate/incompatible vector type to be a massive undertaking, and possibly unpopular if it splits the language. 
 +In Hack/HHVM, it was practical for users to adopt because HHVM is bundled with a typechecker that checks that the uses 
 +are correct at compile time - because PHP has no bundled type checker, a new type would potentially cause a lot of unintuitive behaviors. 
 + 
 +Additionally, a name of ''is_list'' may cause confusion with built-in list types such as ''SplDoublyLinkedList''.
 + 
 +===== Vote ===== 
 + 
 +Voting started on 2021-01-06 and ended on 2021-01-20.
 +
 +This is a Yes/No vote, requiring a 2/3 majority.
 + 
 +<doodle title="Add the function array_is_list(array $array): bool to PHP?" auth="tandre" voteType="single" closed="true"> 
 +   * Yes 
 +   * No 
 +</doodle>
  
 ===== References =====
Line 79: Line 122:
   * https://externals.io/message/109760 "Any interest in a list type?"
   * https://externals.io/message/111744 "Request for couple memory optimized array improvements"
 +  * https://github.com/php/php-src/pull/4886 "Add is_list function" (outdated PR)
 +  * https://github.com/php/php-src/pull/6070 "Add is_list function (rebased)" (implementation PR)
 +  * https://externals.io/message/112612 https://externals.io/message/112584 https://externals.io/message/112560 https://externals.io/message/112613 "[RFC] Add is_list(mixed $value): bool to check for list-like arrays"
 +
 ===== Rejected Features =====
 +
 +==== Alternate names ====
 +
 +''is_sequential_array''/''array_is_sequential'' was rejected because ''[2=>'a', 3=>'b']'' is also sequential.
 +
 +''is_zero_indexed_array''/''array_is_zero_indexed'' was rejected because that term is much less commonly used.
  
 ==== Alternate implementations ====
-Making the signature ''array_is_list(array $value): bool'' was rejected because it would lead to much more verbose code such as ''is_array($value) && array_is_list($value)'' and more frequent TypeErrors for null/false.
-Similar to ''is_numeric()'' and ''is_callable()'', ''is_list()'' returns false instead of throwing an error for types that can't possibly be lists.
 +The signature ''is_array_and_list(mixed $value): bool'' was considered, but rejected because silently returning false for objects would be surprising,
 +and the behavior for future list-like types might be misunderstood (''SplDoublyLinkedList'', ''ArrayObject'', etc.).

 This deliberately only returns true for arrays with sequential keys and a start offset of 0. It returns false for ''[1=>'first', 2=>'second']''.

-This deliberately always returns false for objects, e.g. ''ArrayObject'' or ''SplFixedArray''.
 +This deliberately throws a TypeError for non-arrays.
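
Callers that receive values of unknown type can combine ''is_array()'' with ''array_is_list()'' to get a check that returns false instead of throwing. A minimal sketch, assuming PHP 8.1 or newer; ''is_list_value'' is a hypothetical helper name and is not proposed by this RFC:

<code php>
// Hypothetical helper, not part of the RFC.
function is_list_value(mixed $value): bool {
    // && short-circuits, so array_is_list() never sees a non-array.
    return is_array($value) && array_is_list($value);
}

var_export(is_list_value(['a', 'b']));                     // true
var_export(is_list_value([1 => 'first', 2 => 'second']));  // false, keys do not start at 0
var_export(is_list_value(null));                           // false, no TypeError
var_export(is_list_value(new ArrayObject(['a'])));         // false, objects are not arrays

try {
    array_is_list(null);
} catch (TypeError $e) {
    echo "TypeError\n";  // the unguarded call throws
}
</code>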
 + 
 +==== Adding flags to is_array() ==== 
 + 
 +https://externals.io/message/112612#112612 
 +<blockquote> 
 +I actually like the idea of flags added to is_array() for this. 
 + 
 +Something like: 
 + 
 +<code php> 
 +is_array($value, ZERO_INDEXED | ASSOCIATIVE | INTEGER_INDEXED) 
 +</code> 
 + 
 +I’m not suggesting these names; they’re for illustration only. 
 +</blockquote> 
 + 
 +I'm strongly opposed to adding any flags to ''is_array'' - keeping basic type checks simple would help in learning/reading/remembering the language. 
 +The addition of flags has a small impact on performance for calls that aren't unambiguously qualified (especially if using both), and it makes it harder to see issues like 
 +''is_array(really_long_multiline_call(arg1, arg2, ZERO_INDEXED))'' where ZERO_INDEXED is passed to another function instead of is_array.
  
 ==== Changes to PHP's type system ====

-**This RFC does not attempt to change php's type system.** External static analyzers may still benefit from inferring key types from ''is_list()'' conditionals seen in code - ''is_list()'' conditionals would give more accurate information about array keys that can be used to detect issues or avoid false positives. (Phan, Psalm, and PHPStan are all static analyzers that support the unofficial phpdoc type ''list<T>'', which is used for arrays that would satisfy ''is_list()'').
 +**This RFC does not attempt to change php's type system.** External static analyzers may still benefit from inferring key types from ''array_is_list()'' conditionals seen in code - ''array_is_list()'' conditionals would give more accurate information about array keys that can be used to detect issues or avoid false positives. (Phan, Psalm, and PHPStan are all static analyzers that support the unofficial phpdoc type ''list<T>'', which is used for arrays that would satisfy ''array_is_list()'').

-Any attempt to change php's type system would need to deal with references and the global scope - e.g. what would happen if an array was passed to `list &$val` but modified to become a non-list from a different callback or through ''asort()''.
 +Any attempt to change php's type system would need to deal with references and the global scope - e.g. what would happen if an array was passed to ''list &$val'' but modified to become a non-list from a different callback or through ''asort()''.
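
The ''asort()'' case can be illustrated with ordinary arrays. No ''list'' type exists, so this only sketches the underlying problem, assuming PHP 8.1 or newer: sorting by value while preserving keys turns a list into a non-list.

<code php>
$values = [3, 1, 2];
var_export(array_is_list($values));  // true

asort($values);  // sorts by value and keeps the keys: [1 => 1, 2 => 2, 0 => 3]
var_export(array_is_list($values));  // false, the keys are now in the order 1, 2, 0

$values = array_values($values);  // reindexes to [1, 2, 3]
var_export(array_is_list($values));  // true
</code>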
  
 Additionally, I'd personally expect that changes to the type system that were backwards incompatible would be possible, but unpopular and difficult to implement. HHVM is a project that was initially compatible with php, but has recently dropped compatibility with PHP. https://docs.hhvm.com/hack/built-in-types/arrays may be of interest to anyone who is interested in ways to migrate to stricter alternatives to php's arrays, but that required an entirely different language mode to use (''<?hh''), which doesn't seem viable for PHP itself (for reasons such as splitting the ecosystem and being incompatible with older php versions).
Line 128: Line 200:
   * Does it make sense to add them without type enforcement via generics? Lists + Generics would be lovely, but as we've seen Generics are Hard(tm) and Not Imminent(tm). But would adding them now make a generic version harder in the future? (I've no idea.)
   * Besides add/remove/iterate, what other baked-in functionality should they have? Eg, can they be mapped/filtered/reduced? It would really suck to revisit lists and not fix that disconnect in the API. (Insert me talking about comprehensions and stuff here.) Ideally this would happen as part of a larger review of how collections work at various levels, which are currently highly clunky.
-  * Those are all solvable problems (and I've likely forgotten several), but they would have to be thought through extensively before an implementation could be viable.
 +
 +Those are all solvable problems (and I've likely forgotten several), but they would have to be thought through extensively before an implementation could be viable.
 </blockquote>
 +
 +===== Changelog =====
 +
 +  * 0.3: Change name and signature from ''is_array_and_list(mixed $value)'' to ''array_is_list(array $array)''
 +  * 0.2: Rename from ''is_list()'' to ''is_array_and_list()'', add references and more rejected features
 +
rfc/is_list.1608427996.txt.gz · Last modified: 2020/12/20 01:33 by tandre