PHP RFC: Strict operators directive
- Version: 0.2
- Date: 2019-05-25
- Author: Arnold Daniels, jasny@php.net
- Status: Under Discussion
- First Published at: http://wiki.php.net/rfc/strict_operators
Introduction
PHP performs implicit type conversion for most operators. The rules of conversion are complex, depending on the operator as well as on the type and value of the operands. This can lead to surprising results, where a statement seemingly contradicts itself. This RFC proposes a new directive strict_operators, which limits the type juggling done by operators and makes them throw a TypeError for unsupported types.
Changing the behavior of operators has significant consequences for backward compatibility. Additionally, a sizable group of people is in favor of the current method of type juggling. Following the rationale of PHP RFC: Scalar Type Declarations, an optional directive ensures backward compatibility and allows people to choose the type checking model that suits them best.
Motivating examples
Mixed type comparison
Mathematics states that “if (a > b) and (b > c), then (a > c)”. This statement can be asserted in PHP:
if (($a > $b) && ($b > $c)) { assert($a > $c); }
This assertion fails for certain values of different types:
$a = '42';
$b = 10;
$c = '9 eur';
if (($a > $b) && ($b > $c)) { assert($a > $c); }
Numeric string comparison
Non-strict comparison uses a “smart” comparison method that treats strings as numbers if they are numeric. The meaning of the operator changes based on the value of both operands.
Using the <=> operator to order the values of an array can lead to different results based on the initial state of the array.
function sorted(array $arr) {
    usort($arr, function($x, $y) { return $x <=> $y; });
    return $arr;
}
sorted(['100', '5 eur', '62']); // ['100', '5 eur', '62']
sorted(['100', '62', '5 eur']); // ['5 eur', '62', '100']
sorted(['62', '100', '5 eur']); // ['62', '100', '5 eur']
Array comparison
Using the >, >=, <, <= and <=> operators on arrays or objects that don't have the same keys in the same order gives unexpected results.
In the following example $a is both greater than and less than $b:
$a = ['x' => 1, 'y' => 22];
$b = ['y' => 10, 'x' => 15];
$a > $b; // true
$a < $b; // true
The logic of relational operators other than ==, ===, != and !== has limited practical use. In case both arrays have the same keys (in the same order), a side-by-side comparison is done. If the size differs, the array with the most elements is always seen as the greatest:
[1] < [50]; // true
[1, 1] < [50]; // false
This is not a proper method to compare the size of arrays, as two operands of equal size but different values are not equal. Instead, count() should be used in this case.
If two arrays have the same number of items but not the same keys, the <, <=, > and >= operators will always return false.
[1] < ['bar' => 50]; // false
[1] > ['bar' => 50]; // false
In case the two arrays have the same number of items and the same keys but in a different order, an element-by-element comparison is done. The > and >= operators are implemented as the inverse of < and <=. This results in walking through the operand that's expected to be the smallest.
$a = ['x' => 1, 'y' => 22];
$b = ['y' => 10, 'x' => 15];
$a > $b; // true
$a < $b; // true
In the statement with the > operator, we walk through the elements of $b, so first comparing $b['y'] to $a['y']. In the statement with <, we walk through the elements of $a, so first comparing $a['x'] to $b['x']. As a result, both statements evaluate to true, although they seem to contradict each other.
Strict vs non-strict comparison of arrays
Strict comparison requires that arrays have keys occurring in the same order, while non-strict comparison allows out-of-order keys.
['a' => 'foo', 'b' => 'bar'] == ['b' => 'bar', 'a' => 0]; // true
Comparing the values of two arrays in a strict way, regardless of key order, requires sorting both arrays by key first.
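A minimal sketch of that approach in current PHP, using a hypothetical helper same_entries() that is not part of this RFC: both arrays are sorted by key with ksort() and then compared with ===. It only sorts the top level; nested arrays would need the same treatment.

```php
<?php
// Hypothetical helper, for illustration only: compares two arrays by key and
// value with strict (===) semantics, ignoring the order of the keys.
function same_entries(array $a, array $b): bool
{
    // Arrays are passed by value, so sorting here does not affect the caller.
    ksort($a);
    ksort($b);

    return $a === $b;
}

var_dump(same_entries(['a' => 'foo', 'b' => 'bar'], ['b' => 'bar', 'a' => 'foo'])); // bool(true)
var_dump(same_entries(['a' => 'foo', 'b' => 'bar'], ['b' => 'bar', 'a' => 0]));     // bool(false)
```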
Type juggling of arithmetic operators
The behavior of arithmetic operators for non-scalar types is inconsistent.
Most arithmetic operations throw an Error if one of the operands is an array. But modulo and exponentiation operations will cast (to 0 or 1) and thus succeed silently.
Objects and resources are always cast to integers or floats. In case of an object, this results in a notice. For resources, this will succeed silently using the resource id as a number.
| | $a + 10 | $a * 10 | $a % 10 |
|---|---|---|---|
| array | error | error | - |
| object | notice | notice | notice |
| resource | - | - | - |
Numeric strings and bitwise operators
Bitwise operators have an alternative operation if both operands are strings. This is regardless of the value of the strings.
If both operands for the &, | and ^ operators are strings, then the operation will be performed on the ASCII values of the characters that make up the strings and the result will be a string. In all other cases, both operands will be converted to integers and the result will be an integer.
Bitwise operators are therefore the only operators that don't treat numeric strings as numbers.
"22" & "12"; // "02"
22 & 12; // 4
Switch control structure
The switch statement does a non-strict comparison. This can lead to unexpected results:
function match($value) {
    switch ($value) {
        case 2: echo "double"; break;
        case 1: echo "single"; break;
        case 0: echo "none"; break;
        default: throw new Exception("Unexpected value");
    }
}
match("foo"); // "none"
All combinations
Operators can do any of the following for unsupported operands:
- Cast
- silent
- with notice
- with warning
- causing a catchable error (fatal)
- Notice + cast
- Warning + cast
- Throw Error
- No operation
Please take a look at this list of all combinations of operators and operands.
Proposal
By default, all PHP files are in weak type-checking mode for operators. A new declare() directive is added, strict_operators, which takes either 1 or 0. If 1, strict type-checking mode is used for operators in the file. If 0, weak type-checking mode is used.
In strict type-checking mode, operators may cast operands to the expected type. However:
- Typecasting is not based on the type of the other operand
- Typecasting is not based on the value of any of the operands
- Operators will throw a TypeError for unsupported types
In case an operator can work with several (or all) types, the operands need to match as no casting will be done by those operators.
The one exception is that widening primitive conversion is allowed for int to float. This means that operators that work with float operands also accept int.
declare(strict_operators=1);
1.2 + 2; // float(3.2)
In this case, an int operand is combined with a float operand. The int is converted (widened) to float.
Comparison operators
Comparison operators work on all scalar types. The types of both values need to match.
Non-scalar types only support the ==, ===, != and !== operators.
"foo" > "bar"; // true
"foo" > 10; // TypeError("Type mismatch string and int on greater than (>) operator")
"foo" == "bar"; // false
"foo" == 10; // TypeError("Type mismatch string and int on equals (==) operator")
"foo" == null; // TypeError("Type mismatch string and null on equals (==) operator")
true > false; // true
true != 0; // TypeError("Type mismatch bool and int on not equals (!=) operator")
[10] > []; // TypeError("Unsupported type array on greater than (>) operator")
[10] == []; // false
The function of the === and !== operators remains unchanged.
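The proposed behavior of == for scalar operands can be approximated in current PHP with a small helper. This is only a sketch: strict_equals() is a hypothetical function, the message text uses gettype() names rather than the exact wording shown above, and the int to float widening that the RFC allows as an exception is left out for brevity.

```php
<?php
// Hypothetical helper approximating the proposed == for scalar operands:
// a type mismatch throws a TypeError instead of triggering type juggling.
// Assumption: operands must have exactly the same type (no int/float widening).
function strict_equals($a, $b): bool
{
    if (gettype($a) !== gettype($b)) {
        throw new TypeError(sprintf(
            'Type mismatch %s and %s on equals (==) operator',
            gettype($a),
            gettype($b)
        ));
    }

    // Same type: compare by value without treating numeric strings as numbers.
    return $a === $b;
}

var_dump(strict_equals("foo", "bar")); // bool(false)

try {
    strict_equals("foo", 10);
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}
```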
Numeric string comparison
Numeric strings are compared the same way as non-numeric strings. To compare two numeric strings as numbers, they need to be cast to floats.
"120" > "99.9"; // false
(float)"120" > (float)"99.9"; // true
"120" <=> "99.9"; // -1
Array comparison
Comparing two arrays will never throw a TypeError.
The difference between using the == and === operator is the order of the keys of the array. This feature remains unchanged.
Scalar values in the array are compared using both type and value, thus similar to the === operator. Objects of the same class will be compared similarly to the == operator, while objects of different classes are always seen as not equal.
['a' => 'foo', 'b' => 'bar'] == ['b' => 'bar', 'a' => 'foo']; // true
['a' => 'foo', 'b' => 'bar'] === ['b' => 'bar', 'a' => 'foo']; // false
['a' => 'foo', 'b' => 'bar'] == ['b' => 'bar', 'a' => 0]; // false (no type juggling)
Object comparison
Comparing two objects of different classes using the == or != operator will throw a TypeError.
class Foo {
    public $x;
    public function __construct($x) { $this->x = $x; }
}
class FooBar extends Foo {}
(new Foo(10)) == (new Foo(10)); // true
(new Foo(10)) == (new Foo(99)); // false
(new Foo(10)) === (new Foo(10)); // false
(new Foo(10)) == (new FooBar(11)); // TypeError("Type mismatch Foo object and FooBar object on equals (==) operator")
(new Foo(10)) === (new FooBar(11)); // false
Comparing two objects of the same class with these operators checks the properties of the objects. By default, properties are compared in a similar fashion to the === operator. If a property of both objects contains arrays or objects of the same class, they're compared as if using the == operator.
Arithmetic operators
Arithmetic operators will only work with integers and floats. Using operands of any other type will result in a TypeError.
In strict type-checking mode, the behavior of the operator is not determined by the value of the operands. Thus for any string, including numeric strings, a TypeError is thrown, so strings need to be explicitly cast.
The + operator is still available for arrays as union operator, requiring both values to be arrays.
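As a rough illustration, the following runs in current PHP and is written the way strict mode would require: a numeric string is cast explicitly before arithmetic, and + is applied to two arrays as the union operator. The variable names are made up for this example.

```php
<?php
$quantity = "10"; // e.g. a value read from user input

// Explicit cast instead of relying on type juggling.
$total = (int)$quantity + 5;
var_dump($total); // int(15)

// Array union: for duplicate keys, the value from the left operand is kept.
$defaults = ['color' => 'red', 'size' => 'M'];
$options  = ['color' => 'blue'];
$merged   = $options + $defaults; // ['color' => 'blue', 'size' => 'M']
var_dump($merged);
```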
Incrementing/Decrementing operators
The incrementing/decrementing operators will throw a TypeError when the operand is a boolean, null, array, object or resource. The decrementing operator will also throw a TypeError if the operand is a string.
The function of these operators for integers and floats remains unchanged.
Incrementing strings
The ++ operator for strings will largely remain unchanged, but it will behave consistently for any string, including numeric strings. This means that using ++ on a string will always result in a string.
$a = "00";
++$a; // "01"
++$a; // "02"
++$a; // "03"
Bitwise Operators
Bitwise operators expect both operands to be integers. The &, |, ^ and ~ operators also accept strings as operands.
Using strings for >> or <<, mixing strings with integers or using any other type will throw a TypeError.
String Operators
The concatenation operator . will throw a TypeError if any of the operands is a boolean, array or resource. It will also throw a TypeError if an operand is an object that doesn't implement the __toString() method.
Integers, floats, null and objects (with the __toString() method) are cast to a string.
Logical Operators
The function of logical operators remains unchanged. All operands are cast to booleans.
Switch control structure
When strict type-checking for operators is enabled, the switch statement will do a comparison similar to the comparison of array elements: scalar values are compared using both type and value, thus similar to the === operator. For arrays, the key order does not matter. Objects of the same class will be compared similarly to the == operator, while objects of different classes are always seen as not equal. It will never throw a TypeError.
function match($value) {
    switch ($value) {
        case ["foo" => 42, "bar" => 1]: echo "foobar"; break;
        case null: echo "null"; break;
        case 0: echo "zero"; break;
        default: throw new Exception("Unexpected value");
    }
}
match(["bar" => 1, "foo" => 42]); // "foobar"
match(0); // "zero"
match("foo"); // Exception("Unexpected value")
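For comparison, type-strict matching of scalar values can already be written in current PHP without the directive by switching on true and using === in each case. This is a workaround sketch, not part of the proposal; describe() is a hypothetical function name.

```php
<?php
// Hypothetical example: each case is a boolean expression using ===,
// so no type juggling takes place on $value itself.
function describe($value): string
{
    switch (true) {
        case $value === null:
            return "null";
        case $value === 0:
            return "zero";
        default:
            return "other";
    }
}

var_dump(describe(0));     // string(4) "zero"
var_dump(describe(null));  // string(4) "null"
var_dump(describe("foo")); // string(5) "other"
```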
Backward Incompatible Changes
Since the strict type-checking for operators is off by default and must be explicitly used, it does not break backward-compatibility.
Proposed PHP Version
This is proposed for the next minor version of PHP, currently PHP 7.4.
Discussion Points
Narrowing the scope (strict_comparison)
This directive would only affect comparison operators. The behavior of other operators remains unchanged.
Argument: There's a difference between making a change to solve a clear problem (type-juggled comparison operators), and making a change just to be consistent.
Current position: We should not introduce a new directive for a very narrow solution. The directive is intended to cater to those who dislike how type juggling for operators currently works.
Widening the scope (strong_types)
This directive would disable type juggling altogether. This includes control statements like if and while.
Using non-scalar values for array keys would throw a TypeError rather than issuing a warning. Floats and booleans used as array keys, as in $array[12.34], are silently cast to an integer; these might also throw a TypeError.
Argument: Weak dynamic typing is criticized by a majority of software developers. Issues related to weak typing extend beyond operators. PHP would benefit from an option to disable this.
Should == and != always return a boolean?
We could consider letting == and != do a type check and always return false in case the type is different. This would make them work similarly to some other languages like Python.
Current position: PHP already has the === and !== operators to accomplish this, so == and != can be used for strict type comparison. Using != and == adds the check that you have the expected types, so you don't get an unexpected (silent) false on statements like 10 == "10".
Can relational operators be allowed for arrays?
There are cases where using < or > on two arrays can be useful, especially if both arrays have the same keys in the same order. We might allow these operators to be used in those cases.
Current position: The RFC is about limiting the logic around operand types. An operator should either support the types or not and should not throw an error based on the values of the operands. Arrays don't yield a logical result for the > and < operator, except if... We should not add rules for such exceptions.
Can we use the existing strict_types directive?
Rather than using a new directive, we could also use strict_types with different values, e.g. declare(strict_types=functions,operators) or declare(strict_types=2).
Switch statement should not be affected
The switch statement does a non-strict comparison operation in a way that should be eliminated when using this directive. On the other hand, switch is not an operator.
Functions that do weak comparison
Functions like in_array apply weak comparison (by default). Should these functions also be affected by this directive?
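For reference, a short sketch of the difference in current PHP: in_array() compares with == unless its third argument is true, in which case it compares with ===.

```php
<?php
// Default: weak (==) comparison, so the string "1" matches the integer 1.
$weak = in_array("1", [1, 2, 3]);
var_dump($weak);   // bool(true)

// Third argument true: strict (===) comparison, type and value must match.
$strict = in_array("1", [1, 2, 3], true);
var_dump($strict); // bool(false)
```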
Unaffected PHP Functionality
This RFC
- does not affect any functionality concerning explicit type casting.
- does not affect variable casting that occurs in (double-quoted) strings.
- is largely unaffected by other proposals like PHP RFC: Saner string to number comparisons that focus on improving type juggling at the cost of breaking BC.
Proposed Implementation
- add a flag to CG(active_op_array)->fn_flags (similar to strict_types).
- split function get_binary_op into get_binary_op_standard and a new function get_binary_op_strict, where get_binary_op calls either based on the op flag.
- add new functions for strict operators to zend_operators.c
As a proof of concept, I've created a test where the strict_types directive affects the == and != operators, making them do an 'identical' and 'not identical' operation, respectively. To test it, build the strict_types-affect-operators-test branch of the jasny fork and run a test script.
Proposed Voting Choices
The vote will require a 2/3 majority.