Operator overloading in PHP is currently supported within the engine for C code, and has well engineered designs built into the structures for objects and classes to facilitate it. These structures are currently unavailable outside of extensions or core, and this RFC aims to provide that feature to PHP developers directly for a defined set of operators.
PHP allows classes written in C to implement operator overloading. This is already used in the DateTime library for easy comparison of DateTime objects:
function checkDateIsOverOneHourInFuture(DateTimeInterface $date) { $one_hour_in_future = new DateTime("+1 hour"); if ($date <= $one_hour_in_future) { return false; } return true; }
Operator overloading is currently used by GMP to allow easier to read code. These two pieces of code are equivalent:
// without operator overloading $result = gmp_mod( gmp_add( gmp_mul($c0, gmp_mul($ms0, gmp_invert($ms0, $n0))), gmp_add( gmp_mul($c1, gmp_mul($ms1, gmp_invert($ms1, $n1))), gmp_mul($c2, gmp_mul($ms2, gmp_invert($ms2, $n2))) ) ), gmp_mul($n0, gmp_mul($n1, $n2)) );
// with operator overloading $result = ( $c0 * $ms0 * gmp_invert($ms0, $n0) + $c1 * $ms1 * gmp_invert($ms1, $n1) + $c2 * $ms2 * gmp_invert($ms2, $n2) ) % ($n0 * $n1 * $n2);
Even without understanding what the above code does (it's an excerpt from a Coppersmith attack on RSA), it should be obvious that the second code is a lot clearer. It makes the structure of the code immediately clear (three multiplications are summed up and the modulus is taken), whereas the function-based code actively hides any structure in the code. For mathematical operations infix notation just comes a lot more naturally.
This RFC proposes allowing classes written in PHP (aka userland) to implement operator overloading, for a limited set of operators, similar to how it is allowed for classes written in C.
This RFC uses a new keyword operator
instead of using function
with something like magic methods. This keyword is only valid inside of a class definition, and will result in a parser error otherwise. Internally, the function pointers are stored on the class entry as magic methods, however using a different keyword has multiple advantages:
While the common names of the operators are consistent with math for real numbers, the same operators may not have the same name or even the same meaning in different domains. Using * will mean 'multiplication' in many contexts, but there are domains such as linear algebra where this may not be a 'mul' operation at all.
By representing the implementations by the symbols themselves, this RFC avoids forcing implementations to be mislabeled with words or names which do not match the semantic meaning of that symbol in the program context.
class Collection { // This function unions, it does not add function __add(Collection $other, OperandPosition $operandPos) {} // Now the implementation doesn't suggest something incorrect operator +(Collection $other, OperandPosition $operandPos) {} }
Updating functions to allow symbols as function names could be done, however it would be difficult with the function
keyword to allow it for operator overloads without allowing it in any arbitrary function. Further, supporting the “bitwise and” operator (&) would be particularly difficult and inefficient, as the parser would need to disambiguate whether this was intended as the function name or to mark the function as returning by-reference.
Operator implementations all share the same set of restrictions on method modifiers, such as visibility or marking them as static. Using a different keyword makes it less confusing to PHP developers that these particular method implementations have specific restrictions on which modifiers can be used. It also allows future changes to the method modifiers for operators independently of class methods.
For example, suppose in the future it was desired to allow operators to be “greedy” and fire their implementation first from the left or right operand, instead of the left operand always having precedence. We could add a greedy
modifier that only affects operators and is not available for functions.
class Number { public greedy operator +(Number $other, OperandPosition $operandPos): Number {} }
This is given as an easy to understand example, not intended future scope.
The method implementations for operator overloads will change the behavior of the class as it interacts with the engine. Separating these visually with a different keyword makes it easier for developers to see visually how an object will behave.
Though it is not part of this RFC, one of the considerations made was whether this implementation would restrict future scope of operator overloads. This RFC allows for a defined set of operators, but it could be possible in the future that support for arbitrary infixes is desired. In such a situation, using magic methods would not be desired, as any combination of symbols may be used for the new infix. The restrictions on function names, such as needing to reserve the & to mark a function as being by-reference, would place limitations on such future scope.
This is avoided by allowing the restrictions on operator names to be separated from the restrictions on function names.
Operand implementations can be called on an instance of an object the way normal methods can.
// These all will work normally $op = '+'; $callable = [$obj, '+']; // Calls on the object variable $obj->{'+'}(1, OperandPosition::LeftSide); $obj->$op(1, OperandPosition::LeftSide); $callable(1, OperandPosition::LeftSide); // Calls using call_user_func call_user_func([$obj, '+'], 1, OperandPosition::LeftSide); call_user_func($callable, 1, OperandPosition::LeftSide); // This will error since + is not static call_user_func('ObjClass::+', 1, OperandPosition::LeftSide);
They can be also be directly invoked with a Closure however. This fully supports Reflection, and allows direct calls.
// Manually creating a closure allows a direct function call $closure = Closure::fromCallable([$obj, '+']); $closure(1, OperandPosition::LeftSide); // You can also retrieve the closure through Reflection $closure = (new ReflectionMethod($obj, '+'))->getClosure($obj); $closure(1, OperandPosition::LeftSide); $closure = (new ReflectionObject($obj))->getOperator('+')->getClosure($obj); $closure(1, OperandPosition::LeftSide);
A new throwable, InvalidOperatorError
, which extends TypeError
is provided. This error is thrown when no operator overload can be found for an object which is used with one of the operators supported by this RFC.
$obj = new stdClass(); $value = $obj + 3; // Uncaught InvalidOperatorError: Operator '+' unsupported by class stdClass
Currently, this throws a TypeError.
As detailed later in the RFC, letting the operator overload know which operand it is, the left operand or right operand, is critical for intelligent commutativity breaking. This is a requirement for operands such a division. To enable this a new enum has been provided, OperandPosition
.
enum OperandPosition { case LeftSide; case RightSide; }
For most operators, the methods have the form operator op($other, OperandPosition $operandPos)
and so an implementation to support the multiply (*) operator might look like:
class ComplexNumber { public function __construct(public float $realPart, public float $imaginaryPart) {} operator *(int|float|ComplexNumber $other, OperandPosition $operandPos): ComplexNumber { if ($other instanceof ComplexNumber) { $newRealPart = ($this->realPart * $other->realPart) + ($this->imaginaryPart * $other->imaginaryPart * -1); $newImaginaryPart = ($this->realPart * $other->imaginaryPart) + ($this->imaginaryPart * $other->realPart); } else { $newRealPart = $this->realPart * $other; $newImaginaryPart = $this->imaginaryPart * $other; } return new ComplexNumber($newRealPart, $newImaginaryPart); } } $cnum1 = new ComplexNumber(1, 2); $cnum2 = new ComplexNumber(3, 4); $cnum3 = $cnum1 * $cnum2; echo $cnum3->realPart.' + '.$cnum3->imaginaryPart.'i'; // Prints: -5 + 10i
The operator implementations are resolved left-to-right. If the left operand doesn't support the operation (i.e. it doesn't implement the relevant operator method) then the engine will retry the operation using the right operand.
<?php class Number { function __construct(int $value) {} operator +(Number|int $other, OperandPosition $operandPos): Number { if (is_int($other)) { return new Number($this->value + $other); } else { return new Number($this->value + $other->value); } } } $num = new Number(5); $val1 = $num + 1; // this is equivalent to // +(1, OperandPosition::LeftSide) $val2 = 1 + $num; // this is equivalent to // +(1, OperandPosition::RightSide)
If the called object is the left operand, then $operandPos is OperandPosition::LeftSide
. If the called object is the right operand, then $operandPos is OperandPosition::RightSide
.
class Number { public __construct(readonly public float $value) {} public operator /(int|float $other, OperandPosition $operandPos): Number { if ($operandPos == OperandPosition::LeftSide) { $numerator = $this->value; $denominator = $other; } else { $numerator = $other; $denominator = $this->value; } if ($denominator == 0) { throw new Exception("Cannot divide by zero."); } return new Number($numerator / $denominator); } }
If the left operand produces a TypeError
due to the parameter types listed in the implementation, the operation is not retried with the right operand and the error is instead returned immediately. This is to help developers encounter errors in their program logic as early as possible.
This RFC proposes only a subset of the operators in PHP be supported for operator overloading. See also the sections below on the excluded logic and equality operators, and the implied operators.
The list of supported operations and their signatures are:
Operator | Signature |
---|---|
+ | operator +($other, OperandPosition $operandPos): mixed |
- | operator -($other, OperandPosition $operandPos): mixed |
* | operator *($other, OperandPosition $operandPos): mixed |
/ | operator /($other, OperandPosition $operandPos): mixed |
% | operator %($other, OperandPosition $operandPos): mixed |
** | operator **($other, OperandPosition $operandPos): mixed |
& | operator &($other, OperandPosition $operandPos): mixed |
| | operator |($other, OperandPosition $operandPos): mixed |
^ | operator ^($other, OperandPosition $operandPos): mixed |
~ | operator ~(): mixed |
<< | operator <<($other, OperandPosition $operandPos): mixed |
>> | operator >>($other, OperandPosition $operandPos): mixed |
== | operator ==($other): bool |
<=> | operator <=>($other): int |
The operator methods can be implemented with parameter and return types to narrow the type accepted e.g.
class BigNumber { readonly public string $value; public function __construct(int|float|string $value) { if (is_numeric($value)) { $this->value = (string)$value; } else { throw new Exception("Only initialize BigNumber with numeric values"); } } public operator +(int|float|string|BigNumber $other, OperandPosition $operandPos): self { if ($other instanceof BigNumber) { $otherVal = $other->value; } elseif (is_numeric($other)) { $otherVal = (string)$other; } else { throw new InvalidOperatorError("Only add BigNumber to numeric strings"); } return new BigNumber(bc_add($this->value, $otherVal)); } }
Many expressions in PHP can be reduced to simpler forms.
Operator | Implied As |
---|---|
$a += $b | $a = $a + $b |
$a -= $b | $a = $a - $b |
$a *= $b | $a = $a * $b |
$a /= $b | $a = $a / $b |
$a %= $b | $a = $a % $b |
$a **= $b | $a = $a ** $b |
$a &= $b | $a = $a & $b |
$a |= $b | $a = $a | $b |
$a ^= $b | $a = $a ^ $b |
$a <<= $b | $a = $a << $b |
$a >>= $b | $a = $a >> $b |
$a != $b | !($a == $b) |
$a < $b | ($a <=> $b) == -1 |
$a <= $b | ($a <=> $b) < 1 |
$a > $b | ($a <=> $b) == 1 |
$a >= $b | ($a <=> $b) > -1 |
++$a | $a = $a + 1 |
$a++ | $a = $a + 1 |
--$a | $a = $a - 1 |
$a-- | $a = $a - 1 |
-$a | $a = -1 * $a |
All those expressions that are reduced to implied forms will work, but don't have individual operator overloads. This ensures consistency with the expected behavior of the various operators and reduces backwards compatibility concerns.
Much like how enums are “classes”, operators are “methods”. However, there are several method modifiers that can be used with functions that cannot be used with operators. This is because the modifiers in question do not make sense, or suggest behavior that isn't true.
The following modifiers cannot be used with the operator
keyword, and will produce a CompileError
.
protected
private
static
The following modifiers can be used and behave as expected.
public
abstract
final
Since the implementations for a given operator will always be called when an operator is encountered, regardless of visibility, no visibility level other than public makes sense. Because of this, visibility with the operator
keyword can always be omitted.
Most of the operators follow the form operator op($other, OperandPosition $operandPos): mixed
and those all behave in the same way. There are a few operators that will have a different signature, and/or behave differently.
The bitwise not operator is unary (acts on a single operand) and so has the signature operator ~(): mixed
expecting no parameters.
Because comparisons have a reflection relationship instead of a commutative one, the $operandPos argument is omitted as it could only be used for evil (making $obj == 5
have a different result than 5 == $obj
).
Equality and comparison operators do not throw the InvalidOperatorError
when unimplemented. Instead, the PHP engine falls back to existing comparison logic in the absence of an override for a given class.
The signature for the equals operator has the additional restriction of returning bool
instead of mixed
.
The comparison operator requires an int
return rather than mixed
:
operator <=>(mixed $other): int
Any return value larger than 0 will be normalized to 1, and any return value smaller than 0 will be normalized to -1.
The $operandPos argument is omitted as it could only be used for evil e.g. implementing different comparison logic depending on which side it's on. Instead of passing $operandPos the engine will multiply the result of the call by (-1) where appropriate:
class Number { function __construct(readonly int $value) {} operator <=>(Number|int $other): int { if ($other instanceof Number) { return $this->value <=> $other->value; } return $this->value <=> $other; } } $obj = new Number(5); $less_than = $obj < 5; // is equivalent to // (<=>(5) === -1); $greater_than = 5 > $obj; // is equivalent to // ( (<=>(5) * - 1) === -1 );
By doing * -1, no matter what the implementation is from the PHP code, the >, >=, ==, <=, < comparisons are all guaranteed to be consistent regardless of whether the operand is on the left or right side. This avoids situations where both $obj1 > 5 and 5 > $obj1 can return true.
Comparison operators do not throw the InvalidOperatorError
when unimplemented. Instead, the PHP engine falls back to existing comparison logic in the absence of an override for a given class.
This means that if the equals operator is unimplemented but comparison operator is implemented, the expression $obj == 6
would be evaluated as $obj <=> 6 === 0
.
Because of this, the comparison operator by itself can provide support for all inequality and equality comparisons if the values of the object support that kind of logic. For example, a Number class that represents an arbitrary precision real number could safely implement only the comparison operator.
However, objects which can be equal to another value but which cannot be compared for sorting/ordering could implement only the equals operator. Examples of objects such as this would be ones that represent complex numbers, or objects which represent non-numeric values such as collections.
The definitions for operators are required to explicitly type their parameters. That is, an omitted type is not assumed to be mixed. This is done because nearly every legitimate use of this feature would have a limitation about which types the operation can be combined with. As the engine cannot know what these limitations are, the PHP developer must provide the list of accepted types as part of the signature. This helps PHP developers avoid errors that may be deferred or difficult to track down.
class Matrix { operator +($other, OperandPosition $operandPos): Matrix {} } // Fatal Error: Matrix::+(): Parameter #1 ($other) must explicitly define a type
The mixed
type can still be used for the $other
parameter, but it must do so by explicitly typing it as such.
A new target Attribute::TARGET_OPERATOR
is added to allow attributes to specifically target operator implementations.
Several changes to reflection must be made to support this feature.
These methods need to be updated to ignore the operator methods. Since these are stored internally like any other function on the class entry, they need to be filtered from the results.
The reason for removing the operators from this result is because the operator methods are not callable with string literals on the object. Since they cannot be called like a method is, they should not be returned with the other methods on a class.
These methods must be added to interact with the function handlers for the operator implementations. They will act as an inverse to the changes above.
Operator methods will be represented by an instance of ReflectionMethod
, since in most respects they can be treated like a normal method for the purposes of reflection.
Returns true if the method being reflected uses the operator
keyword. Returns false otherwise.
Yes.
It is a common pattern when developers (particularly juniors who have not learnt an appropriate level of fear yet) learn about a new feature, they will use it in ways that more senior/experienced developers would consider as 'bad'.
Most of the argument against operator overloading (https://james-iry.blogspot.com/2009/03/operator-overloading-ad-absurdum.html) boils down to:
The problem is abuse. Somebody will name something '+' when it has nothing to do with the common notion of '+'. The resulting confusion is a bigger downside than the benefits of allowing programmers to be flexible in naming.
But this is true of any feature in PHP. People are free to write 'getter' methods that mutate an object state rather than just returning a value.
For all valid names X that evoke a common conception, somebody will name something 'X' when it has nothing to do with the common notion of 'X'.
Language design shouldn't focus on preventing people from doing things you disagree with, at the expense of blocking appropriate usage of a feature.
Not all operators are commutative. The most trivial example of this is with subtraction:
class Number { public function __construct(readonly public int|float $value) {} public operator -(int|float $other, OperandPosition $operandPos): Number { if ($operandPos == OperandPosition::LeftSide) { return new Number($this->value - $other); } else { return new Number($other - $this->value); } } }
However, there are more complicated examples with some possible usages. For example, with matrix multiplication:
class Matrix { public function __construct(readonly public array $value) {} public operator *(Matrix $other, OperandPosition $operandPos): Number { if ($operandPos == OperandPosition::LeftSide) { // Count of my columns needs to match // count of $other rows } else { // Count of my rows needs to match // count of $other columns } } }
Interfaces wouldn't help write correct code.
For example, a Vector2d and a Money class could both implement an Addable
interface, but using Addable
as a type check wouldn't prevent users from attempting to add incompatible objects.
interface Addable { operator +(mixed $other, OperandPosition $operandPos): mixed } class Money implements Addable {} class Vector2d implements Addable {} function processValues(Addable $left, Addable $right) { return $left + $right; } processValues( new Money(5, 'USD'), new Vector2d(5, 10) );
Despite both of the parameters to processValues
implementing the Addable
interface, it's not trivial to check if it's possible to actually add the two types together.
Instead users are recommended to use specific types:
class Money { operator +(Money $other, OperandPosition $operandPos): Money {} } $result = new Money(5, 'USD') + new Vector2d(5, 10); // Type error, Vector2d can't be used as Money
This can also be caught by typing the arguments to a helper function:
function processMoneyValues(Money $left, Money $right) { return $left + $right; } processMoneyValues( new Money(5, 'USD'), new Vector2d(5, 10) ); // Type error, Vector2d can't be used as Money
For more details please see this (https://news-web.php.net/php.internals/115719) and this (https://news-web.php.net/php.internals/115752).
The identity operator === is used to check whether two variables contain the same object, or whether two non-objects have the same type and value.
The position of this RFC is that allowing the identity operator to be overloaded isn't a useful thing to do, as it wouldn't enable any new functionality but could introduce potentially terrible bugs in PHP programs where it is used.
The logical operators &&, ||, and, or, and xor refer to a specific kind of math operation, boolean algebra, and their usage should reserved for that purpose only.
Most behavior that PHP developers would want to control with overloads to these operators can in fact be accomplished by allowing an object to control its casting to a boolean value. That is not part of this RFC, but the RFC author views that as a better way to address these operators than allowing arbitrary overloads.
It is possible to provide this feature in PHP without the operator
keyword, but practically speaking this means using names instead of symbols which has a few disadvantages. The details of this are given at the top of this RFC.
Using a new keyword is far more flexible in the future, allowing PHP to grow and adapt with new contributions much easier. This is because future configuration changes to how operator overloads behave can be provided by making modifications to the signature for the operator overload.
Suppose in a few years there was an RFC to allow an operator overload to mark itself to not be inherited. With magic methods, the options would be:
By having a different keyword, this hypothetical RFC could instead be implemented by adding a new modifier, say local
, only for operators. This would result in:
// Parses just fine and implements behavior class SomeClass { public local operator +(mixed $other, OperandPosition $operandPos): SomeClass {} } // Generates an error at parse time class SomeClass { public local function someMethod() {} }
There are other benefits to using a keyword which are detailed in the beginning of this RFC.
Objects used with operators will no longer result in a thrown TypeError
, but instead a thrown InvalidOperatorError
which extends TypeError
.
This change is proposed for PHP 8.2
The introduction of the new keyword operator
will require updates to tooling used by PHP developers. This includes IDEs as well as static analysis tools.
None
Existing extensions can continue to define their own operator overloads by providing a do_operation
call for their classes, however classes which are open to be extended may benefit from being updated so that their overloads can be extended by implementing the necessary methods. In order to accomplish this, the extension class would need to call zend_std_call_op_override
at the start of its do_operation
handler. The operator override handler will not throw the InvalidOperatorError
if the class entry has the type ZEND_INTERNAL_CLASS
. Instead, it will return the value FAILURE
of type zend_result
.
Thus, the following code at the start of any extension's do_operation
handler would be the minimal sufficient addition:
if (zend_std_call_op_override(opcode, &result, &op1, &op2) == SUCCESS) { return SUCCESS; } // The rest of the extension's do_operation handler
To further help extensions support this feature, there are two helper functions:
int has_overload = zend_std_has_op_overload(opcode, &zval); zend_function overload_method = zend_std_get_op_overload(opcode, &ce);
It is safe to pass any zval pointer to zend_std_has_op_overload()
, as it first checks whether or not the Z_TYPE_P(zval) == IS_OBJECT
and returns 0 if it doesn't.
The addition of the opcodes ZEND_IS_LARGER and ZEND_IS_LARGER_OR_EQUAL to the ZEND_AST_BINARY_OP requires several updates to the optimizer and to Opcache. This includes optimizations which are present in JIT for the other binary opcodes.
None
None
Currently the operator overload handler is resolved from left to right, however there are situations involving inheritance where it would be preferable to resolve the handler according to the class inheritance structure. Python's operator overloads behave in this way in fact, where if $a
is a subclass of $b
, the handler for $a
takes precedence regardless of whether it is the right or left operand.
Such handling would involve checking inheritance during the handling of the opline, and is left as future scope.
As mentioned in this RFC, there are some objects within core that implement their own limited operator overloads. Deciding whether to update these objects and open their overloads for extension is left as future scope.
This RFC does not support R-style operator overloading, which allows users to define custom operators outside the symbol set supported by core. Such a feature would be part of a separate RFC. However, the use of the operator
keyword makes this something that could be done seamlessly in future. This is not something that this RFC author intends to do in the future.
This RFC deals with allowing each class to define its own interaction with operators. However, if overloading the operator itself were desired for the entire application, a different approach would be needed. This is also something that the operator
keyword future proofs against, but is not an intended proposal of this RFC author.
Having functions for operators may be beneficial when objects which use operator overloads are used in conjunction with functions like array_reduce
. For example:
array_reduce($arrOfObjs, +(...));
These could be polyfilled in PHP currently:
array_reduce($arrOfObjs, fn ($a, $b) => ($a + $b));
With some additional improvements, it's possible that operator overloads could provide some very useful tools for things such as query builders:
$qb->select(Product::class)->where(Price::class < 50);
Returning an enum for the <=> would be preferable for two reasons.
This is listed as future scope because there is a separate RFC which covers this feature: https://wiki.php.net/rfc/sorting_enum
It is listed as a separate RFC because it is something that could be delivered whether or not this RFC passes.
The draft PR for this RFC can be found here: https://github.com/php/php-src/pull/7388
After the project is implemented, this section should contain
Add limited user-defined operator overloads as described: yes/no. A 2/3 vote is required to pass.
Voting started 2022-01-03 at 00:15 UTC and will end 2022-01-17 at 00:15 UTC.
operator op()
format