PHP RFC: Define proper semantics for range() function

Introduction

PHP's standard library implements the range() function, which generates an array of values going from a $start value to an $end value. By default values are generated by using a step of 1 but this behaviour can be changed by passing the $step parameter. In principle, the range() function only works with integer, float, and string $start and $end values, but in reality this is not the case. Moreover, even within those expected types the behaviour can be quite strange.

Current Behaviour of range()

The current behaviour is quite complex, and it might be easier to just read the implementation, but it roughly goes as follows:

First, check whether the $step argument is negative; if it is, multiply it by -1.

Then check the boundary arguments:

The generic case will accept any type.

Let us look at various examples to highlight the range of behaviour exhibited by range():

Examples

Example with expected values:

var_dump(range(1, 3));
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [2]=>
  int(3)
}
 
var_dump(range(1.0, 3.0));
array(3) {
  [0]=>
  float(1)
  [1]=>
  float(2)
  [2]=>
  float(3)
}

var_dump(range(1, 3, 1.5));
array(2) {
  [0]=>
  float(1)
  [1]=>
  float(2.5)
}

var_dump(range(1.0, 3.0, 1.5));
array(2) {
  [0]=>
  float(1)
  [1]=>
  float(2.5)
}

var_dump(range('10', '13'));
array(4) {
  [0]=>
  int(10)
  [1]=>
  int(11)
  [2]=>
  int(12)
  [3]=>
  int(13)
}

var_dump(range('10.0', '13.0'));
array(4) {
  [0]=>
  float(10)
  [1]=>
  float(11)
  [2]=>
  float(12)
  [3]=>
  float(13)
}

var_dump(range('10', '13', 1.5));
array(3) {
  [0]=>
  float(10)
  [1]=>
  float(11.5)
  [2]=>
  float(13)
}

var_dump(range('10.0', '13.0', 1.5));
array(3) {
  [0]=>
  float(10)
  [1]=>
  float(11.5)
  [2]=>
  float(13)
}

var_dump(range('A', 'E'));
array(5) {
  [0]=>
  string(1) "A"
  [1]=>
  string(1) "B"
  [2]=>
  string(1) "C"
  [3]=>
  string(1) "D"
  [4]=>
  string(1) "E"
}
 
var_dump(range('1', '3'));
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [2]=>
  int(3)
}

Example showing how to produce a decreasing range:

var_dump(range('E', 'A'));
array(5) {
  [0]=>
  string(1) "E"
  [1]=>
  string(1) "D"
  [2]=>
  string(1) "C"
  [3]=>
  string(1) "B"
  [4]=>
  string(1) "A"
}

Example showing how negative steps for increasing ranges are multiplied by -1:

var_dump(range(0, 10, -2));
array(6) {
  [0]=>
  int(0)
  [1]=>
  int(2)
  [2]=>
  int(4)
  [3]=>
  int(6)
  [4]=>
  int(8)
  [5]=>
  int(10)
}

Example showing the ASCII code point range:

var_dump( range("!", "/") );
/*
array(15) {
  [0]=>
  string(1) "!"
  [1]=>
  string(1) """
  [2]=>
  string(1) "#"
  [3]=>
  string(1) "$"
  [4]=>
  string(1) "%"
  [5]=>
  string(1) "&"
  [6]=>
  string(1) "'"
  [7]=>
  string(1) "("
  [8]=>
  string(1) ")"
  [9]=>
  string(1) "*"
  [10]=>
  string(1) "+"
  [11]=>
  string(1) ","
  [12]=>
  string(1) "-"
  [13]=>
  string(1) "."
  [14]=>
  string(1) "/"
}
*/
 
var_dump(range('a', 'Z'));
/*
array(8) {
  [0]=>
  string(1) "a"
  [1]=>
  string(1) "`"
  [2]=>
  string(1) "_"
  [3]=>
  string(1) "^"
  [4]=>
  string(1) "]"
  [5]=>
  string(1) "\"
  [6]=>
  string(1) "["
  [7]=>
  string(1) "Z"
}
*/
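
The character ranges above follow byte values rather than alphabetical order. A minimal sketch of that relationship, assuming single-byte, non-numeric strings, rebuilds the same list from ord() and chr(); the variable names are illustrative only:

```php
<?php
// The list produced by range() for two single-byte, non-numeric strings.
$viaRange = range('a', 'Z');

// 'a' is byte 97 and 'Z' is byte 90, so this is a decreasing integer range
// whose elements are mapped back to one-character strings.
$viaBytes = array_map('chr', range(ord('a'), ord('Z')));

var_dump($viaRange === $viaBytes); // bool(true)
```

This is why range('a', 'Z') passes through "`", "_", "^", "]", "\" and "[" on its way down to "Z".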

Example showing how string inputs can get cast to int/float:

var_dump(range('', 'Z'));
array(1) {
  [0]=>
  int(0)
}
 
var_dump(range('A', 'E', 1.0));
array(1) {
  [0]=>
  float(0)
}

Examples with unexpected types:

/* null */
var_dump(range(null, 2));
array(3) {
  [0]=>
  int(0)
  [1]=>
  int(1)
  [2]=>
  int(2)
}
 
var_dump(range(null, 'e'));
array(1) {
  [0]=>
  int(0)
}
 
/* Array */
var_dump(range([5], [8]));
array(1) {
  [0]=>
  int(1)
}
 
/* Resources */
var_dump(range(STDIN, STDERR));
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [2]=>
  int(3)
}
 
/* Int/Float castable object */
$o1 = gmp_init(15);
$o2 = gmp_init(20);
var_dump(range($o1, $o2));
array(6) {
  [0]=>
  int(15)
  [1]=>
  int(16)
  [2]=>
  int(17)
  [3]=>
  int(18)
  [4]=>
  int(19)
  [5]=>
  int(20)
}
 
/* Int/Float non-castable object */
$o1 = new stdClass();
$o2 = new stdClass();
var_dump(range($o1, $o2));
/*
 
Warning: Object of class stdClass could not be converted to int in /tmp/preview on line 13
 
Warning: Object of class stdClass could not be converted to int in /tmp/preview on line 13
array(1) {
  [0]=>
  int(1)
}
*/
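
Until range() rejects such inputs itself, a caller can guard against them. The following is only a sketch: checkedRange() is a hypothetical userland wrapper, not part of PHP, and it assumes PHP 8.0 or later for the mixed type and get_debug_type().

```php
<?php
/**
 * Hypothetical wrapper: accept only int, float, or string boundaries
 * and delegate everything else to range() unchanged.
 */
function checkedRange(mixed $start, mixed $end, int|float $step = 1): array
{
    foreach (['start' => $start, 'end' => $end] as $name => $value) {
        if (!is_int($value) && !is_float($value) && !is_string($value)) {
            throw new TypeError(
                "\$$name must be of type int|float|string, " . get_debug_type($value) . ' given'
            );
        }
    }

    return range($start, $end, $step);
}

try {
    checkedRange(null, 2);
} catch (TypeError $e) {
    echo $e->getMessage(), PHP_EOL; // $start must be of type int|float|string, null given
}
```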

Issues surrounding usage of INF and NAN values

Infinite values are handled as part of the range boundary checks, or for the $step parameter when checking that the step is less than the range being requested, and will throw ValueErrors.

However, NAN values are not specifically handled and result in nonsensical ranges:

$nan = fdiv(0,0);
 
var_dump(range($nan, 5));
array(1) {
  [0]=>
  float(NAN)
}
 
var_dump(range(1, 5, $nan));
array(0) {
}

Using a NAN value as a step even breaks the expectation that range() returns a non-empty list.
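
Callers can screen out non-finite values before they reach range(). A minimal sketch, assuming numeric arguments only and PHP 8.0 or later (for union types and fdiv()); finiteRange() is a hypothetical helper name:

```php
<?php
/**
 * Hypothetical helper: refuse NAN, INF and -INF before calling range().
 */
function finiteRange(int|float $start, int|float $end, int|float $step = 1): array
{
    foreach ([$start, $end, $step] as $value) {
        // is_finite() returns false for NAN, INF and -INF.
        if (is_float($value) && !is_finite($value)) {
            throw new InvalidArgumentException('range() arguments must be finite');
        }
    }

    return range($start, $end, $step);
}

try {
    finiteRange(1, 5, fdiv(0, 0)); // fdiv(0, 0) evaluates to NAN
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // range() arguments must be finite
}
```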

Issues surrounding usage of string digits

If one of the boundary inputs is a string digit (e.g. “1”) both inputs will be interpreted as numbers. This doesn't pose too much of an issue if both inputs are string digits as it will generate a list of integers.

However, if the other input is a non-numeric string the expected behaviour of generating a list of ASCII characters is not upheld anymore:

var_dump( range("9", "A") );
array(10) {
  [0]=>
  int(9)
  [1]=>
  int(8)
  [2]=>
  int(7)
  [3]=>
  int(6)
  [4]=>
  int(5)
  [5]=>
  int(4)
  [6]=>
  int(3)
  [7]=>
  int(2)
  [8]=>
  int(1)
  [9]=>
  int(0)
}

instead of the expected:

var_dump( range("9", "A") );
array(9) {
  [0]=>
  string(1) "9"
  [1]=>
  string(1) ":"
  [2]=>
  string(1) ";"
  [3]=>
  string(1) "<"
  [4]=>
  string(1) "="
  [5]=>
  string(1) ">"
  [6]=>
  string(1) "?"
  [7]=>
  string(1) "@"
  [8]=>
  string(1) "A"
}
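
A byte-wise list can be obtained without depending on how range() interprets digit strings. The sketch below assumes non-empty inputs and uses only the first byte of each; byteRange() is a hypothetical helper, not an existing function:

```php
<?php
/**
 * Hypothetical helper: character range by byte value that never
 * treats digit strings as numbers.
 */
function byteRange(string $start, string $end): array
{
    // ord() reads the first byte; chr() turns each byte value back into a string.
    return array_map('chr', range(ord($start), ord($end)));
}

var_dump(byteRange('9', 'A') === ['9', ':', ';', '<', '=', '>', '?', '@', 'A']); // bool(true)
```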

Proposal

The proposal is to adjust the semantics of range() in various ways to throw exceptions outright or at least warn when passing unusable arguments to range().

The changes are as follows:

Therefore, some of the previous examples would behave as follows:

var_dump(range('A', 'E', 1.0));
array(5) {
  [0]=>
  string(1) "A"
  [1]=>
  string(1) "B"
  [2]=>
  string(1) "C"
  [3]=>
  string(1) "D"
  [4]=>
  string(1) "E"
}
 
var_dump( range("9", "A") );
array(9) {
  [0]=>
  string(1) "9"
  [1]=>
  string(1) ":"
  [2]=>
  string(1) ";"
  [3]=>
  string(1) "<"
  [4]=>
  string(1) "="
  [5]=>
  string(1) ">"
  [6]=>
  string(1) "?"
  [7]=>
  string(1) "@"
  [8]=>
  string(1) "A"
}
 
var_dump(range('', 'Z'));
/*
Warning: range(): Argument #1 ($start) must not be empty, casted to 0
 
Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0
*/
 
 
var_dump(range(null, 2));
/*
Deprecated: range(): Passing null to parameter #1 ($start) of type string|int|float is deprecated
array(3) {
  [0]=>
  int(0)
  [1]=>
  int(1)
  [2]=>
  int(2)
}
*/
 
var_dump(range(null, 'e'));
/*
Deprecated: range(): Passing null to parameter #1 ($start) of type string|int|float is deprecated in %s on line %d
 
Warning: range(): Argument #1 ($start) must be a string if argument #2 ($end) is a string, argument #2 ($end) converted to 0 in %s on line %d
array(1) {
  [0]=>
  int(0)
}
*/
 
var_dump(range(0, 10, -2));
/*
range(): Argument #3 ($step) must be greater than 0 for increasing ranges
*/
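
Code that relies on a negative $step being flipped for an increasing range can make the flip explicit. A minimal sketch, assuming the caller wants the same increasing list whichever behaviour is in effect:

```php
<?php
$step = -2;

// Normalise the sign up front instead of depending on range() to do it.
$values = range(0, 10, abs($step));

var_dump($values === [0, 2, 4, 6, 8, 10]); // bool(true)
```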

Impact Analysis

Using Nikita Popov's popular-package-analysis project to run a rough analysis of range() usage in the top 1000 Composer projects, we find that, out of around 450 calls to range():

  1. 154 calls are made with literal number arguments
  2. 18 calls are made with literal string arguments
  3. 140 calls have at least one argument that is the result of a plus (+), minus (-), or times (*) operation
  4. 47 calls have at least one argument that is a variable
  5. 25 calls have an argument that comes from a function returning a number (count(), min(), or max())
  6. 66 calls have at least one argument that comes from a class property, class method, function, or array dimension

The non-trivial calls were checked manually and all appear to be valid.

Backward Incompatible Changes

TypeErrors are thrown for incompatible types.

ValueErrors are thrown for INF, NAN, and negative step values for increasing ranges.

E_WARNINGs are emitted for various issues.

Calls to range() that have integer boundaries but a float step that is compatible as an integer will now return an array of integers instead of an array of floats:

var_dump( range(1, 5, 2.0) );
/* New Behaviour */
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(3)
  [2]=>
  int(5)
}
/* Current Behaviour */
array(3) {
  [0]=>
  float(1)
  [1]=>
  float(3)
  [2]=>
  float(5)
}
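
Callers that need floats whichever behaviour is in effect can cast the elements explicitly. A sketch under that assumption:

```php
<?php
// Cast each element rather than relying on the type of $step to decide the result type.
$floats = array_map('floatval', range(1, 5, 2.0));

var_dump($floats === [1.0, 3.0, 5.0]); // bool(true)
```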

Proposed PHP Version

Next minor version, i.e. PHP 8.3.0.

Proposed Voting Choices

As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted.

Voting started on 2023-06-01 and will end on 2023-06-15.

Accept Saner range() semantics RFC?

Real name                            Yes  No
alcaeus (alcaeus)                    Yes
bwoebi (bwoebi)                      Yes
colinodell (colinodell)              Yes
crell (crell)                        Yes
cschneid (cschneid)                  Yes
dharman (dharman)                    Yes
galvao (galvao)                      Yes
girgias (girgias)                    Yes
heiglandreas (heiglandreas)          Yes
lufei (lufei)                        Yes
mcmic (mcmic)                        Yes
nicolasgrekas (nicolasgrekas)        Yes
nielsdos (nielsdos)                  Yes
ocramius (ocramius)                  Yes
petk (petk)                          Yes
pierrick (pierrick)                  Yes
santiagolizardo (santiagolizardo)    Yes
sergey (sergey)                      Yes
svpernova09 (svpernova09)            Yes
theodorejb (theodorejb)              Yes
Final result:                        20   0

This poll has been closed.

Implementation

GitHub pull request: https://github.com/php/php-src/pull/10826

Implemented in PHP 8.3, as commit: https://github.com/php/php-src/commit/798c40a739e8f1081a516679a367d76c3d0aabb9
