rfc:proper-range-semantics

Differences

This shows you the differences between two versions of the page.

rfc:proper-range-semantics [2023/03/22 15:22] – Created girgias
rfc:proper-range-semantics [2023/06/19 13:41] (current) – Implemented girgias
Line 1: Line 1:
 ====== PHP RFC: Define proper semantics for range() function  ======
 
-  * Version: 0.1
+  * Version: 0.3
   * Date: 2023-03-13
   * Author: George Peter Banyard, <girgias@php.net>
-  * Status: Under Discussion
+  * Status: Implemented (https://github.com/php/php-src/commit/798c40a739e8f1081a516679a367d76c3d0aabb9)
   * Target Version: PHP 8.3
   * Implementation: [[https://github.com/php/php-src/pull/10826]]
Line 11: Line 11:
 ===== Introduction =====
 
-PHP's standard library implements the <php>range()</php> function, which generates an array of values going from a <php>$start</php> value to an <php>$end</php> value, by default it advances in steps of ''1'' but this behaviour can be changed by passing the <php>$step</php> parameter.
+PHP's standard library implements the <php>range()</php> function, which generates an array of values going from a <php>$start</php> value to an <php>$end</php> value.
+By default values are generated by using a step of ''1'' but this behaviour can be changed by passing the <php>$step</php> parameter.
 In principle, the <php>range()</php> function only works with integer, float, and string <php>$start</php> and <php>$end</php> values, but in reality this is not the case. Moreover, even within those expected types the behaviour can be quite strange.
  
Line 18: Line 19:
 The current behaviour is quite complex, and it might be easier to just read the implementation, but it roughly goes as follows:
 
-First, check if the <php>$step</php> argument is negative, if it is multiply by ''-1''.
+First, check if the <php>$step</php> argument is negative; if it is multiply by ''-1''.
 
 Then check the boundary arguments:
 
   * If both start and end values are strings with at least one byte (e.g. <php>range('A', 'Z');</php>, <php>range('AA', 'BB');</php>, or <php>range('15', '25');</php>):
-    * If one of the inputs is a float numeric string, or the <php>$step</php> parameter is a float: go to the handle float input branch
-    * If one of the inputs is an integer numeric string: go to the generic handling branch
+    * If one of the inputs is a float numeric string, or the <php>$step</php> parameter is a float: go to the handle float input branch.
+    * If one of the inputs is an integer numeric string: go to the generic handling branch.
     * Otherwise: discard every byte after the first one and return an array of ASCII characters going from the start ASCII code point to the end ASCII code point.
-  * If one of the start or end value is a float or the <php>$step</php> parameter is a float  (e.g. <php>range(10.5, 12);</php>, <php>range(1, 3, 1.5);</php>, or <php>range(1, 3, 1.0);</php>): cast start and end values to float and return an array of floats
-  * Otherwise (generic handling): cast start and end values to int and return an array of int.
+  * If the start or end value is a float or the <php>$step</php> parameter is a float  (e.g. <php>range(10.5, 12);</php>, <php>range(1, 3, 1.5);</php>, or <php>range(1, 3, 1.0);</php>): cast start and end values to float and return an array of floats.
+  * Otherwise (generic handling): cast start and end values to int and return an array of integers.
 
 The generic case will accept //any// type.
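
The branch selection described above can be sketched roughly as follows. This only illustrates the prose description in this section and is not the engine's C implementation; the helper name ''legacy_range_branch()'' and its return labels are hypothetical.

```php
<?php
// Hypothetical helper: names the branch that the pre-8.3 range() would take,
// following the prose description above (not the actual C implementation).
function legacy_range_branch(mixed $start, mixed $end, mixed $step = 1): string
{
    // True when $s is a numeric string that evaluates to a float (e.g. "1.5").
    $isFloatString = static fn (string $s): bool => is_numeric($s) && is_float($s + 0);
    // True when $s is a numeric string that evaluates to an integer (e.g. "15").
    $isIntString = static fn (string $s): bool => is_numeric($s) && is_int($s + 0);

    if (is_string($start) && is_string($end) && $start !== '' && $end !== '') {
        if ($isFloatString($start) || $isFloatString($end) || is_float($step)) {
            return 'float';
        }
        if ($isIntString($start) || $isIntString($end)) {
            return 'int';
        }
        return 'char'; // only the first byte of each string is used
    }

    if (is_float($start) || is_float($end) || is_float($step)) {
        return 'float';
    }

    return 'int'; // generic handling: any remaining type is cast to int
}
```

Under these assumptions, ''legacy_range_branch('A', 'Z')'' yields ''char'', ''legacy_range_branch('15', '25')'' yields ''int'', and ''legacy_range_branch(1, 3, 1.0)'' yields ''float'', matching the examples in the list above.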
Line 132: Line 133:
   string(1) "E"
 }
-</PHP>
 
-Example showing the ASCII code point range:
-<PHP>
-var_dump(range('a', 'Z'));
-/*
-array(8) {
+
+var_dump(range('1', '3'));
+array(3) {
   [0]=>
-  string(1) "a"
+  int(1)
   [1]=>
-  string(1) "`"
+  int(2)
   [2]=>
-  string(1) "_"
-  [3]=>
-  string(1) "^"
-  [4]=>
-  string(1) "]"
-  [5]=>
-  string(1) "\"
-  [6]=>
-  string(1) "["
-  [7]=>
-  string(1) "Z"
+  int(3)
 }
-*/
 </PHP>
  
Line 176: Line 163:
 </PHP>
 
-Example showing how negative steps are multiplied by ''-1'':
+Example showing how negative steps for increasing ranges are multiplied by ''-1'':
 <PHP>
 var_dump(range(0, 10, -2));
Line 193: Line 180:
   int(10)
 }
 +</PHP>
 +
 +
 +Example showing the ASCII code point range:
 +<PHP>
 +var_dump( range("!", "/") );
 +/*
 +array(15) {
 +  [0]=>
 +  string(1) "!"
 +  [1]=>
 +  string(1) """
 +  [2]=>
 +  string(1) "#"
 +  [3]=>
 +  string(1) "$"
 +  [4]=>
 +  string(1) "%"
 +  [5]=>
 +  string(1) "&"
 +  [6]=>
 +  string(1) "'"
 +  [7]=>
 +  string(1) "("
 +  [8]=>
 +  string(1) ")"
 +  [9]=>
 +  string(1) "*"
 +  [10]=>
 +  string(1) "+"
 +  [11]=>
 +  string(1) ","
 +  [12]=>
 +  string(1) "-"
 +  [13]=>
 +  string(1) "."
 +  [14]=>
 +  string(1) "/"
 +}
 +*/
 +
 +var_dump(range('a', 'Z'));
 +/*
 +array(8) {
 +  [0]=>
 +  string(1) "a"
 +  [1]=>
 +  string(1) "`"
 +  [2]=>
 +  string(1) "_"
 +  [3]=>
 +  string(1) "^"
 +  [4]=>
 +  string(1) "]"
 +  [5]=>
 +  string(1) "\"
 +  [6]=>
 +  string(1) "["
 +  [7]=>
 +  string(1) "Z"
 +}
 +*/
 </PHP>
 
Line 212: Line 261:
 Examples with unexpected types:
 <PHP>
 +/* null */
 +var_dump(range(null, 2));
 +array(3) {
 +  [0]=>
 +  int(0)
 +  [1]=>
 +  int(1)
 +  [2]=>
 +  int(2)
 +}
 +
 +var_dump(range(null, 'e'));
 +array(1) {
 +  [0]=>
 +  int(1)
 +}
 +
 /* Array */
 var_dump(range([5], [8]));
Line 267: Line 333:
 ==== Issues surrounding usage of INF and NAN values ====
 
-Infinite values are handles as part of the range boundary checks, or for the <php>$step</php> parameter when checking that the step is less than the range being requested, and will throw ValueErrors.
+Infinite values are handled as part of the range boundary checks, or for the <php>$step</php> parameter when checking that the step is less than the range being requested, and will throw ValueErrors.
 
 However, NAN values are not specifically handled and result in nonsensical ranges:
Line 284: Line 350:
 </PHP>
 
-Where using a NAN values as a step even breaks the expectation that <php>range()</php> will return a non empty list.
+Where using a NAN value as a step even breaks the expectation that <php>range()</php> will return a non empty list.
  
 +==== Issues surrounding usage of string digits ====
 +
 +If one of the boundary inputs is a string digit (e.g. ''"1"'') both inputs will be interpreted as numbers.
 +This doesn't pose too much of an issue if both inputs are string digits as it will generate a list of integers.
 +
 +However, if the other input is a non-numeric string the expected behaviour of generating a list of ASCII characters is not upheld anymore:
 +<PHP>
 +var_dump( range("9", "A") );
 +array(10) {
 +  [0]=>
 +  int(9)
 +  [1]=>
 +  int(8)
 +  [2]=>
 +  int(7)
 +  [3]=>
 +  int(6)
 +  [4]=>
 +  int(5)
 +  [5]=>
 +  int(4)
 +  [6]=>
 +  int(3)
 +  [7]=>
 +  int(2)
 +  [8]=>
 +  int(1)
 +  [9]=>
 +  int(0)
 +}
 +</PHP>
 +instead of the expected:
 +<PHP>
 +var_dump( range("9", "A") );
 +array(9) {
 +  [0]=>
 +  string(1) "9"
 +  [1]=>
 +  string(1) ":"
 +  [2]=>
 +  string(1) ";"
 +  [3]=>
 +  string(1) "<"
 +  [4]=>
 +  string(1) "="
 +  [5]=>
 +  string(1) ">"
 +  [6]=>
 +  string(1) "?"
 +  [7]=>
 +  string(1) "@"
 +  [8]=>
 +  string(1) "A"
 +}
 +</PHP>
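
For code that needs a character list regardless of how <php>range()</php> interprets digit strings, one possible workaround is to build the list from code points explicitly. This is a minimal sketch; the helper name ''char_range()'' is hypothetical and not part of PHP or of this RFC.

```php
<?php
// Hypothetical helper: builds a list of single-byte characters from code points,
// so digit characters such as "9" are never reinterpreted as numbers.
function char_range(string $start, string $end): array
{
    if ($start === '' || $end === '') {
        throw new ValueError('char_range() expects non-empty strings');
    }
    // ord() only looks at the first byte; range() on integers handles direction.
    return array_map('chr', range(ord($start), ord($end)));
}

// Code points 57 ("9") through 65 ("A"), i.e. the nine characters listed above.
var_dump(char_range('9', 'A') === ['9', ':', ';', '<', '=', '>', '?', '@', 'A']); // bool(true)
```

Because the helper only passes integers to <php>range()</php>, it should behave the same before and after the changes discussed in this RFC.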
  
 ===== Proposal =====
Line 293: Line 414:
 The changes are as follows:
 
-  * If <php>$step</php> is a float but is compatible with ''int'' interpret it as an integer.
-  * Introduce and use a proper ZPP check for ''int|float|string'' <php>$start</php> and <php>$end</php> parameters, this will cause <php>TypeError</php>s to be thrown when passing objects, resources, and arrays to <php>range()</php>. It will also cause a deprecation warning to be emitted when passing ''null''.
+  * If <php>$step</php> is a float but is compatible with ''int'' (i.e. <php>(float)(int)$step === $step</php>) interpret it as an integer.
+  * Introduce and use a proper ZPP check for ''int|float|string'' <php>$start</php> and <php>$end</php> parameters; this will cause <php>TypeError</php>s to be thrown when passing objects, resources, and arrays to <php>range()</php>. It will also cause a deprecation warning to be emitted when passing ''null''.
   * Throw value errors if <php>$start</php>, <php>$end</php>, or <php>$step</php> is a non-finite float (-INF, INF, NAN).
   * Throw a more descriptive <php>ValueError</php> when <php>$step</php> is zero.
-  * Emit an <php>E_WARNING</php> when passing a negative <php>$step</php>
-  * Throw a <php>ValueError</php> when <php>$start</php> or <php>$end</php> is the empty string
-  * Emit an <php>E_WARNING</php> when <php>$start</php> or <php>$end</php> has more than one byte.
-  * Emit an <php>E_WARNING</php> when <php>$start</php> or <php>$end</php> is cast to an integer because the other boundary input is a number or numeric string. (e.g. <php>range('5', 'z');</php> or <php>range(5, 'z');</php>)
-  * Emit an <php>E_WARNING</php> when <php>$step</php> is a float when trying to generate a range of characters.
+  * Throw a <php>ValueError</php> when passing a negative <php>$step</php> for increasing ranges.
+  * Emit an <php>E_WARNING</php> when <php>$start</php> or <php>$end</php> is the empty string, and cast the value to ''0''.
+  * Emit an <php>E_WARNING</php> when <php>$start</php> or <php>$end</php> has more than one byte if it is a non-numeric string.
+  * Emit an <php>E_WARNING</php> when <php>$start</php> or <php>$end</php> is cast to an integer because the other boundary input is a number. (e.g. <php>range(5, 'z');</php>)
+  * Produce a list of characters if one of the boundary inputs is a string digit instead of casting the other input to int (e.g. <php>range('5', 'z');</php>)
+  * Emit an <php>E_WARNING</php> when <php>$step</php> is a float when trying to generate a range of characters, except if both boundary inputs are numeric strings (e.g. <php>range('5', '9', 0.5);</php> does not produce a warning).
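
The "compatible with ''int''" condition from the first item can be illustrated in userland as below. This is only a sketch of the stated condition <php>(float)(int)$step === $step</php>; the helper name ''normalize_step()'' is hypothetical, and floats outside the integer range are not given any special treatment here.

```php
<?php
// Hypothetical helper mirroring the first bullet: a float step that survives
// an int round trip unchanged is treated as an integer step.
function normalize_step(int|float $step): int|float
{
    if (is_float($step) && is_finite($step) && (float)(int)$step === $step) {
        return (int)$step;
    }
    // Integers and fractional (or non-finite) floats are returned untouched.
    return $step;
}
```

With this check, ''2.0'' becomes the integer ''2'' while ''1.5'' stays a float, which is why a call such as <php>range(1, 5, 2.0)</php> can produce integers under the proposal.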
  
  
-Therefore, the behaviour of some of the previous would result in the following behaviour:
+Therefore, the behaviour of some of the previous examples would result in the following behaviour:
 
 
 <PHP>
-var_dump(range('', 'Z')); 
-/* 
-Warning: range(): Argument #1 ($start) must not be empty, casted to 0 in %s on line %d 
- 
-Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0 in %s on line %d 
-array(1) { 
-  [0]=> 
-  int(0) 
-} 
-*/ 
- 
 var_dump(range('A', 'E', 1.0));
 array(5) {
Line 332: Line 443:
   string(1) "E"
 }
 +
 +var_dump( range("9", "A") );
 +array(9) {
 +  [0]=>
 +  string(1) "9"
 +  [1]=>
 +  string(1) ":"
 +  [2]=>
 +  string(1) ";"
 +  [3]=>
 +  string(1) "<"
 +  [4]=>
 +  string(1) "="
 +  [5]=>
 +  string(1) ">"
 +  [6]=>
 +  string(1) "?"
 +  [7]=>
 +  string(1) "@"
 +  [8]=>
 +  string(1) "A"
 +}
 +
 +var_dump(range('', 'Z'));
 +/*
 +Warning: range(): Argument #1 ($start) must not be empty, casted to 0
 +
 +Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0
 +*/
 +
 +
 +var_dump(range(null, 2));
 +/*
 +Deprecated: range(): Passing null to parameter #1 ($start) of type string|int|float is deprecated
 +array(3) {
 +  [0]=>
 +  int(0)
 +  [1]=>
 +  int(1)
 +  [2]=>
 +  int(2)
 +}
 +*/
 +
 +var_dump(range(null, 'e'));
 +/*
 +Deprecated: range(): Passing null to parameter #1 ($start) of type string|int|float is deprecated in %s on line %d
 +
 +Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0 in %s on line %d
 +array(1) {
 +  [0]=>
 +  int(1)
 +}
 +*/
 +
 +var_dump(range(0, 10, -2));
 +/*
 +range(): Argument #3 ($step) must be greater than 0 for increasing ranges
 +*/
 </PHP>
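
Since a negative step on an increasing range throws under the new semantics, callers that compute the step dynamically may prefer to pass only its magnitude and let <php>range()</php> infer the direction from the boundaries. A minimal sketch, assuming integer inputs; the helper name ''directed_range()'' is hypothetical.

```php
<?php
// Hypothetical helper: range() derives the direction from $start and $end,
// so only the magnitude of the step needs to be passed.
function directed_range(int $start, int $end, int $step = 1): array
{
    // abs() drops the sign; a step of 0 is still rejected by range() itself.
    return range($start, $end, abs($step));
}

var_dump(directed_range(0, 10, -2) === [0, 2, 4, 6, 8, 10]); // bool(true)
var_dump(directed_range(10, 0, -2) === [10, 8, 6, 4, 2, 0]); // bool(true)
```

This avoids depending on whether a negative step is silently flipped or rejected.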
  
Line 338: Line 508:
 Using Nikita Popov's [[https://github.com/nikic/popular-package-analysis|''popular-package-analysis'']] project and running a [[https://github.com/Girgias/popular-package-analysis/pull/1|rough analysis]] of the usage of <php>range()</php> on the top 1000 composer projects we get that out of around 450 calls to <php>range()</php>
 
-  - 154 calls are made out with literal number arguments
+  - 154 calls are made with literal number arguments
   - 18 calls are made with literal string arguments
   - 140 calls have at least one argument be the result of a plus (''+''), minus (''-''), or times (''*'') operation.
Line 346: Line 516:
 
 The calls that are non-trivial were manually checked and seem all valid.
- 
-Only one example, a test case in Drupal, would have triggered an <php>E_WARNING</php> about using a negative step: 
-<PHP> 
-drupal/core/modules/views/tests/src/Functional/Handler/FieldWebTest.php:102 
-Negative step is pointless 
-range(5, 1, -1) 
-</PHP> 
  
 ===== Backward Incompatible Changes =====
Line 358: Line 521:
 <php>TypeError</php>s are thrown for incompatible types.
 
-<php>ValueError</php>s are thrown for INF, NAN, and empty string values.
+<php>ValueError</php>s are thrown for INF, NAN, and negative step values for increasing ranges.
 
 <php>E_WARNING</php>s are emitted for various issues.
 
-Calls to <php>range()</php> that have integer boundaries but a float step that is compatible as an integer will now return an array of integers instead of an array of float:
+Calls to <php>range()</php> that have integer boundaries but a float step that is compatible as an integer will now return an array of integers instead of an array of floats:
 <PHP>
 var_dump( range(1, 5, 2.0) );
Line 393: Line 556:
 As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted.
 
-Voting started on 2023-XX-XX and will end on 2023-XX-XX.
+Voting started on 2023-06-01 and will end on 2023-06-15.
 <doodle title="Accept Saner range() semantics RFC?" auth="girgias" voteType="single" closed="true">
    * Yes
Line 403: Line 566:
 GitHub pull request: https://github.com/php/php-src/pull/10826
 
-After the project is implemented, this section should contain
-
-  * the version(s) it was merged into
-  * a link to the git commit(s)
-  * a link to the PHP manual entry for the feature
+Implemented in PHP 8.3, as commit: https://github.com/php/php-src/commit/798c40a739e8f1081a516679a367d76c3d0aabb9
 
 ===== References =====
  
rfc/proper-range-semantics.1679498534.txt.gz · Last modified: 2023/03/22 15:22 by girgias