The discussion around user-defined comparison dates back to 2010, when the Comparable RFC was first introduced; it was later updated in 2015.
This RFC builds on the motivation of previous discussions with a new proposal for the ability to override the default behaviour of object comparison that was introduced in PHP 5.
For reference, some major languages that also support user-defined comparison:
Currently, objects are compared by recursively comparing their properties in the order that they are defined, which includes public, protected, and private properties. Object comparison does not differentiate between equality and ordering, so the same internal function is called for all comparison operators regardless of the context in which they are used.
Comparing for equality uses the internal equivalent of == to compare each property, stopping at the first unequal property found. If two objects are not instances of the same class, equality is immediately FALSE.
Comparing by natural ordering (<, >, <=, >=, <=>) stops and returns at the first unequal property found. If two objects are not instances of the same class, the comparison returns 1 when using <=>, or FALSE otherwise. While this is not necessarily a problem that PHP developers encounter frequently, it is still a logical inconsistency, for example:
    new stdClass() < new DateTime();    // false
    new DateTime() < new stdClass();    // false
    new stdClass() > new DateTime();    // false
    new DateTime() > new stdClass();    // false
    new stdClass() == new DateTime();   // false
    new DateTime() == new stdClass();   // false
    new stdClass() <=> new DateTime();  // 1
    new DateTime() <=> new stdClass();  // 1
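As a rough illustration of the default rules described above, the following sketch can be run on a current PHP version. The Point class is hypothetical and exists only for this example.

```php
<?php
// Hypothetical class used only to illustrate the default comparison rules.
class Point
{
    public $x;
    public $y;

    public function __construct(int $x, int $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
}

// Same class: properties are compared in declaration order using ==.
$equal = new Point(1, 2) == new Point(1, 2);

// Ordering stops at the first unequal property, so $x decides here
// even though the second point has the smaller $y.
$less = new Point(1, 5) < new Point(2, 0);

// Different classes: <=> reports 1 in both directions.
$forward  = new Point(1, 2) <=> new stdClass();
$backward = new stdClass() <=> new Point(1, 2);

var_dump($equal, $less, $forward, $backward);
// bool(true), bool(true), int(1), int(1)
```

The last two results show the inconsistency from the listing above: each operand claims to be greater than the other.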
There are some problems with the existing behaviour:
- The default comparison uses == to compare properties, which was more appropriate in PHP 5 than it is now, since PHP has become increasingly type-safe since PHP 7.
This RFC introduces two new magic methods: __compareTo and __equals.
These methods can be used to define equality and natural ordering of objects, which will override the default behaviour when compared to other values, including scalars. They are both optional and may be implemented without the other.
    class Example {
        /**
         * Returns: < 0 if $this is less than $other,
         *          > 0 if $this is greater than $other,
         *         == 0 if $this is equal to $other.
         */
        public function __compareTo($other): int { ... }

        /**
         * @return bool TRUE if $this is equal to $other, FALSE otherwise.
         */
        public function __equals($other): bool { ... }
    }
Note: Typehints are optional.
When an object is compared in a context that considers natural ordering, such as sort, the __compareTo method is called to determine the object's ordering relative to another value. Objects that do not implement __compareTo will continue to use the existing rules.
This method should return a value < 0 if smaller, 0 if equal, and > 0 if greater.
A class may explicitly disallow ordering by throwing an exception in __compareTo.
An implementor of __compareTo should ensure that:
- $x->__compareTo($y) is inversely equivalent to $y->__compareTo($x).
- $x->__compareTo($y) should throw an exception if $y->__compareTo($x) does.
- $x->__compareTo($y) > 0 && $y->__compareTo($z) > 0 requires that $x->__compareTo($z) > 0.
- $x->__compareTo($y) < 0 && $y->__compareTo($z) < 0 requires that $x->__compareTo($z) < 0.
- $x->__compareTo($y) == 0 && $y->__compareTo($z) == 0 requires that $x->__compareTo($z) == 0.

While this might seem like a lot of responsibility and complexity, these requirements are easily met if the implementor applies the same logic sensibly and consistently.
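One way to gain confidence that an implementation meets these requirements is to check them against a set of sample values. The sketch below is only an illustration: the Version class and the satisfiesCompareContract and orderingSign helpers are hypothetical names, "inversely equivalent" is read as "opposite sign" (an assumption), and __compareTo is called explicitly as an ordinary method. The exception rule is not checked.

```php
<?php
// Hypothetical value class; __compareTo is called explicitly in this sketch.
final class Version
{
    private $major;
    private $minor;

    public function __construct(int $major, int $minor)
    {
        $this->major = $major;
        $this->minor = $minor;
    }

    public function __compareTo($other): int
    {
        if (!$other instanceof Version) {
            throw new DomainException('Can only order against another Version');
        }
        // Arrays are compared element by element, so major decides before minor.
        return [$this->major, $this->minor] <=> [$other->major, $other->minor];
    }
}

// Reduce any integer to -1, 0 or 1.
function orderingSign(int $value): int
{
    return $value <=> 0;
}

/**
 * Hypothetical helper: checks the sign-reversal and transitivity rules
 * for every combination of the given sample values.
 */
function satisfiesCompareContract(array $samples): bool
{
    foreach ($samples as $x) {
        foreach ($samples as $y) {
            $xy = orderingSign($x->__compareTo($y));
            // Swapping the operands must flip the sign.
            if ($xy !== -orderingSign($y->__compareTo($x))) {
                return false;
            }
            foreach ($samples as $z) {
                // If x vs y and y vs z agree, x vs z must give the same answer.
                if ($xy === orderingSign($y->__compareTo($z))
                    && $xy !== orderingSign($x->__compareTo($z))) {
                    return false;
                }
            }
        }
    }
    return true;
}

$samples = [new Version(1, 0), new Version(1, 2), new Version(2, 0), new Version(1, 2)];
var_dump(satisfiesCompareContract($samples)); // bool(true)
```

A check like this only covers the sampled values, so it complements rather than replaces consistent logic in the implementation.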
When an object is compared in a context that considers equality, such as in_array, the __equals method is called to determine if the object is equal to another value. It should return TRUE if equal, or FALSE otherwise. Objects that do not implement __equals will continue to use the existing rules, unless __compareTo is defined, in which case equal ordering implies equal value. This aligns with the current rules, where ordering and equality are based on the same logic. This method is therefore only necessary when equal ordering does not imply equal value.
An implementor of __equals should ensure that:
- $x->__equals($y) is equal to $y->__equals($x).
- $x->__equals($y) should throw an exception if $y->__equals($x) does.
- $x->__equals($y) && $y->__equals($z) requires that $x->__equals($z) be TRUE.

Using either or both of these methods covers three common use cases:
Note: Objects that only implement __compareTo will be equal to any value for which __compareTo returns a value equal to 0, including NULL. This is mostly consistent with existing behaviour:
new stdClass() == 1; // true, with a warning about integer conversion
Expression | Attempted calling order |
---|---|
$a < $b, $a > $b, $a <= $b, $a >= $b, $a <=> $b | $a->__compareTo($b), $b->__compareTo($a) |
$a == $b, $a != $b | $a->__equals($b), $b->__equals($a), $a->__compareTo($b), $b->__compareTo($a) |
These methods also overload operators:
Operator | Primary | Secondary | Notes |
---|---|---|---|
< | __compareTo | | TRUE if __compareTo returns < 0 |
> | __compareTo | | TRUE if __compareTo returns > 0 |
<= | __compareTo | | TRUE if __compareTo returns <= 0 |
>= | __compareTo | | TRUE if __compareTo returns >= 0 |
== | __equals | __compareTo | TRUE if __equals is not implemented and __compareTo returns 0 |
!= | __equals | __compareTo | TRUE if __equals is not implemented and __compareTo returns non-zero |
=== | No change | | |
!== | No change | | |
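The lookup order in the two tables above can be approximated in userland. The following is a minimal sketch, not the engine implementation: emulateEquals, emulateCompare and the Celsius class are hypothetical names, and flipping the sign when the secondary operand's __compareTo is used is an assumption about how the swapped call would be interpreted.

```php
<?php
/** Hypothetical emulation of the lookup order for $a == $b. */
function emulateEquals($a, $b): bool
{
    if (is_object($a) && method_exists($a, '__equals')) {
        return $a->__equals($b);
    }
    if (is_object($b) && method_exists($b, '__equals')) {
        return $b->__equals($a);
    }
    if (is_object($a) && method_exists($a, '__compareTo')) {
        return $a->__compareTo($b) == 0;
    }
    if (is_object($b) && method_exists($b, '__compareTo')) {
        return $b->__compareTo($a) == 0;
    }
    return $a == $b; // Neither operand customizes comparison: existing rules.
}

/** Hypothetical emulation of $a <=> $b, normalized to -1, 0 or 1. */
function emulateCompare($a, $b): int
{
    if (is_object($a) && method_exists($a, '__compareTo')) {
        return $a->__compareTo($b) <=> 0;
    }
    if (is_object($b) && method_exists($b, '__compareTo')) {
        // Assumption: the operands were swapped, so the sign is flipped back.
        return 0 <=> $b->__compareTo($a);
    }
    return $a <=> $b;
}

// Hypothetical class that defines ordering only.
final class Celsius
{
    private $degrees;

    public function __construct(float $degrees)
    {
        $this->degrees = $degrees;
    }

    public function __compareTo($other): int
    {
        $value = $other instanceof Celsius ? $other->degrees : (float) $other;
        return $this->degrees <=> $value;
    }
}

var_dump(emulateEquals(new Celsius(20.0), 20));                 // bool(true)
var_dump(emulateCompare(15, new Celsius(20.0)));                // int(-1)
var_dump(emulateCompare(new Celsius(30.0), new Celsius(20.0))); // int(1)
```

Under this reading, $a < $b corresponds to emulateCompare($a, $b) < 0, and $a != $b to the negation of emulateEquals($a, $b).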
Let's consider a simple Fraction, with a numerator and a denominator. For the sake of this example, fractions are not simplified and are equal if they have the same numerical value.
    class Fraction {
        private $num;
        private $den;

        public function __construct(int $num, int $den) {
            $this->num = $den < 0 ? $num * -1 : $num;
            $this->den = $den < 0 ? $den * -1 : $den;
            // Should guard against 0 here.
        }

        /**
         * @return bool Whether this fraction is equal to $other.
         */
        public function __equals($other): bool {
            if ($other instanceof Fraction) {
                return $this->num * $other->den == $other->num * $this->den;
            }

            return is_numeric($other) && $this->num == $other * $this->den;
        }

        /**
         * @return int Natural ordering of this fraction relative to $other.
         */
        public function __compareTo($other): int {
            if ($other instanceof Fraction) {
                return $this->num * $other->den <=> $other->num * $this->den;
            }

            if (!is_numeric($other)) {
                throw new DomainException(
                    "Natural ordering relative to non-numeric values is not defined");
            }

            return $this->num <=> $other * $this->den;
        }
    }

    $a = new Fraction(5, 2);  // 2.5
    $b = new Fraction(10, 4); // 2.5

    var_dump($a == $b, $a < $b);

    /**
     * The current behaviour will output:
     *
     * bool(false)
     * bool(true)
     *
     * The implementation of this RFC would output:
     *
     * bool(true)
     * bool(false)
     */
The output is different because the default behaviour will compare the object's properties without knowing that there's a relationship between the numerator and the denominator. The first instance has a numerator of 5, so it will be less than the second, which has a numerator of 10. It will also therefore not be equal, even though both fractions represent the same numeric value.
In this example, we support comparing for equality against any other object, regardless of type. This is because all objects can be compared for equality, i.e. we can say that an apple does not equal an orange. However, we are throwing an exception if ordering is attempted against a non-numeric value, because the behaviour is undefined: should an apple come before or after an orange? ¯\_(ツ)_/¯
We could also return 0 for undefined cases, which would be okay here because we are also defining equality. A return value of 0 would otherwise imply “equals”, so implementations should either throw for undefined behaviour in __compareTo or return 0 alongside __equals.
This is not good practice though, and will not be recommended in the documentation, because it may lead to definitions that are not commutative. For example, let's assume that a Fraction throws an exception for undefined natural ordering, and a Decimal returns 0. $fraction < $decimal will throw an exception, but $decimal > $fraction will be FALSE. In fact, returning 0 in __compareTo for undefined natural ordering leads to all kinds of strange behaviour:
    new Decimal('1.0') < new Fraction(1, 1)    // false
    new Decimal('1.0') > new Fraction(1, 1)    // false
    new Decimal('1.0') == new Fraction(1, 1)   // false
    new Decimal('1.0') <=> new Fraction(1, 1)  // 0
The documentation will therefore suggest that a DomainException be thrown in __compareTo when natural ordering is not defined, and return FALSE in __equals for the same case. The example implementation of Fraction follows this suggestion.
It's safe to say that user-defined objects are not frequently compared in PHP today. This might be because the user can't control the behaviour, or because use cases are not common. Most objects won't concern themselves with comparability and won't implement these methods, but those that do will find a lot of value in the ability to control the behaviour.
Cases that benefit from the ability to override comparison behaviour:
- "1" and 1.
- Decimal(1.00) <=> true; // 0.

There are a few user implementations out there, but there is no way for these to affect the internals of array functions and operators. The only way that we can dictate a standard of behaviour that affects internals is by implementing this at a language level.
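To make the limitation concrete, here is a minimal sketch of what a userland implementation has to do today. The Money class, its compareTo method and the containsMoney helper are hypothetical names chosen for this example; only usort and array_map are real PHP functions.

```php
<?php
// Hypothetical class with an ordinary compareTo() method (no magic involved).
final class Money
{
    private $cents;

    public function __construct(int $cents)
    {
        $this->cents = $cents;
    }

    public function compareTo(Money $other): int
    {
        return $this->cents <=> $other->cents;
    }

    public function cents(): int
    {
        return $this->cents;
    }
}

$prices = [new Money(500), new Money(120), new Money(300)];

// sort($prices) would not call compareTo(), so the callback is passed by hand.
usort($prices, function (Money $a, Money $b): int {
    return $a->compareTo($b);
});

// in_array() cannot be told to use compareTo(), so a manual search is needed.
function containsMoney(array $haystack, Money $needle): bool
{
    foreach ($haystack as $item) {
        if ($item->compareTo($needle) === 0) {
            return true;
        }
    }
    return false;
}

$sorted = array_map(function (Money $m): int {
    return $m->cents();
}, $prices);

var_dump($sorted);                                // 120, 300, 500
var_dump(containsMoney($prices, new Money(300))); // bool(true)
```

Every call site has to opt in like this, which is the gap a language-level hook is meant to close.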
This is not an easy decision to make because there are strong cases for both. So in order to make a good decision we have to consider why interfaces exist and what benefits they might have over magic methods for this feature.
Interfaces are used for two main reasons: enforcing a contract, and marking. We should use interfaces here if we want to enforce a specific signature for the comparison methods, but marking does not apply. This is because all objects in PHP are already both comparable and equatable. Marking a class as “Comparable” does not add any meaningful information because it would be comparable regardless. However, this might not always be the case: if PHP deprecates the default comparison behaviour in the future, this case falls through because not all objects are comparable anymore. There are other ways to determine whether a class is comparable or equatable, perhaps with a function like is_comparable or an internal interface like Traversable.
We don't have interfaces for __get and __set because all objects already support property access. ArrayAccess is a necessary interface because not all objects support array syntax by default. Comparison is already supported by all objects.
With this in mind, the only consideration remaining is whether we want to enforce method signatures for comparison methods. Because classes in PHP do not extend a root class, base classes have the freedom to typehint magic methods however they want (as long as they are public and non-static). This is useful because you can restrict comparison to a specific type to guard against unintended behaviour:
    class Fraction {
        ...

        /**
         * This will fail when attempting to compare to anything other than a Fraction.
         */
        public function __equals(Fraction $other): bool {
            return $this->num * $other->den === $other->num * $this->den;
        }
    }
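The effect of such a typehint can be observed today by calling the method directly. This is only a sketch: Ratio is a hypothetical stand-in for the Fraction class above, and how the engine would surface the failure during an actual == comparison is not shown here.

```php
<?php
// Hypothetical class; __equals is invoked explicitly for illustration.
final class Ratio
{
    private $num;
    private $den;

    public function __construct(int $num, int $den)
    {
        $this->num = $num;
        $this->den = $den;
    }

    public function __equals(Ratio $other): bool
    {
        return $this->num * $other->den === $other->num * $this->den;
    }
}

$half = new Ratio(1, 2);
$sameValue = $half->__equals(new Ratio(2, 4)); // 1 * 4 === 2 * 2

try {
    $half->__equals(0.5);
    $rejected = false;
} catch (TypeError $e) {
    // The parameter type rejects anything that is not a Ratio.
    $rejected = true;
}

var_dump($sameValue, $rejected); // bool(true), bool(true)
```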
Something that has come up in discussion is that “interfaces feel less magical” and that “PHP doesn't need more magic”. It's not my intention to discredit those that share these opinions, but there would technically not be any less magic if we decided to use interfaces instead. There would still be the same effects on internal functions and operators. If instead of “magic methods” we thought of them as “default methods”, it would be equivalent to PHP classes having a root class that has a default implementation of all the magic methods. From this perspective, all objects already implement both __equals and __compareTo, which is similar to Java because all classes automatically have an equals method.
PHP already uses magic methods to change default internal behaviour, so using magic methods here would not be something entirely unexpected. Opting for magic methods also removes the possibility of existing classes unintentionally affecting comparability by already having an equals or compareTo method. Method names that start with “__” are reserved, so there is no risk of this if we use magic methods.
If a class does not implement __compareTo but __equals returns TRUE, should that be equivalent to __compareTo returning 0?
This comes down to whether we want to isolate equality and ordering, even though the convention is that equal value should imply equal ordering. It might not be intuitive to call __equals for ordering operations like >, < etc.
No, equality does not automatically imply equal ordering. When comparing an object in a context that considers natural ordering, it should not consider equality when ordering isn't defined for that object.
Consider $a <= $b: what should happen when __equals returns FALSE? The only logical path would be to fall back to the default behaviour. The problem with this approach is that we are mixing contexts, and therefore responsibility. We should use either user-defined behaviour or default internal behaviour, not a mix of both. In this case, where __compareTo is not implemented, $a <= $b should use only the default behaviour and not consider __equals at all.
If a class does not implement __equals but __compareTo returns 0, should that be equivalent to __equals returning TRUE?
The current behaviour of the internal is_equal_function returns TRUE if the internal compare_function returns 0. PHP therefore does not differentiate between ordering and equality.
Currently, the only way to override comparison is by using the internal compare and compare_objects handlers, where the result is used for both equality and ordering.
Yes, equal ordering implies equal value, but only if equality is not defined. This aligns with the existing behaviour of objects and extensions. It also makes __equals optional in cases where it might be redundant, for example:
    class Fraction {
        ...

        public function __compareTo(Fraction $other): int {
            return $this->num * $other->den <=> $other->num * $this->den;
        }
    }
PHP's current definition of equality (when == is called on two objects) requires that both objects be instances of the same class. Even if one class extends the other, == will be FALSE unless they are the exact same class.
Only internal classes and extensions can currently override this behaviour. For example, GMP does not require the other value to be a GMP number, i.e. gmp_init(42) == 42 will be TRUE. This is because the compare object handler allows for an object to be compared to any other type.
An advantage of restricting operands to the same class is that equality would be symmetrical and transitive by design, i.e. $a == $b must mean that $b == $a, and $a < $b && $b < $c must mean that $a < $c.
This decision comes down to power and responsibility vs. restriction and consistency. We can enforce certain rules internally at the cost of flexibility, or we can put more trust in the user to give them more control.
No, the methods are not restricted to instances of the same class. It is the implementor's responsibility to consider the definition of equality and ordering against other types. This also removes the inconsistency between internal and user-defined classes by providing a hook to the compare handler.
Ordering an object that doesn't implement __compareTo is most likely unintended, but we don't want to break existing code by failing hard when ordering is not defined. The only viable situation to fail for is when __equals is implemented but __compareTo is not.
Ordering objects that don't implement __compareTo will use the existing rules and will not be affected by this patch.
Currently, the internal compare_function assigns either -1, 0 or 1 to the result, which is why <=> always evaluates to one of those three. When a class implements __compareTo in such a way that it returns something else, should <=> still normalize to return only -1, 0, and 1 based on the returned value?
This is an interesting consideration because calling __compareTo directly will return the raw value, but using the <=> operator will evaluate to the normalized value. This leads to a bigger discussion around operator overloading, which has been a contentious issue in PHP for a long time.
This comes down to whether we should overload the operator, or the behaviour of the operator. For example, is $a < $b shorthand for “is $a less than $b?” (and always evaluate to a boolean), or should $a < $b not have any semantics and be allowed to evaluate to any type? Similarly, does <=> semantically mean “how should $a be ordered relative to $b” and always return -1, 0 or 1, or will we someday want to overload it without restriction?
It's worth considering that some operations should be free to be overloaded without restriction.
Operators like +, -, *, /, |, &, etc. have semantics that perform an action, rather than ask a question. For example, adding two lists together with +, a set union with |, or a matrix multiplication with * all make perfectly clear semantic sense and it should be easy to guess what the operation might do. Comparison operators are different because they are defined by the questions that they ask. If $a < $b has the potential to be a number or an object, it would make reading and understanding code a lot more difficult: < would not mean “less than”, it would just be the left angle bracket operator that means “less than” by default.
This RFC is focused on overriding the behaviour behind the operators, not the operators themselves. It's providing the ability to define equality and relative natural ordering, while preserving the semantics of the operators. Operator overloading is an unavoidable side-effect of overriding the behaviour to which the operators are semantically attached. It would otherwise be very confusing if a class implements __compareTo but doesn't honour it for comparison operators.
Operator overloading in the future would introduce methods for each operator, such as __lt and __eq, that will be invoked for the operators, but comparison and equality would still honour __compareTo and __equals. This is entirely hypothetical, but it demonstrates that the intention here is not to overload the operators.
The behaviour of the operator is not equivalent to invoking the method, so <=> will normalize the return value of __compareTo to always be -1, 0 or 1. This is consistent with the current behaviour of <=> so that we do not break backward compatibility for cases that expect exactly -1, 0 or 1.
Non-integers will be converted to int, except for float which will be normalized using the sign function.
NULL and FALSE will be treated as 0. This is potentially dangerous because they would both indicate “equal to”. It will therefore be recommended in the documentation that users should specify int as the return type.
Objects that can't be converted to an integer will raise a warning and be considered equal to 1.
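The normalization rules above can be sketched as a small userland function. This is an illustration under stated assumptions rather than the engine code: normalizeOrderingResult is a hypothetical name, every object is treated as the "cannot be converted" case, and the warning is not reproduced.

```php
<?php
/**
 * Hypothetical helper mirroring the normalization rules described above.
 */
function normalizeOrderingResult($value): int
{
    if (is_float($value)) {
        // Sign function: casting 0.5 to int would wrongly give 0.
        return $value <=> 0.0;
    }
    if (is_object($value)) {
        // Simplification: all objects count as 1; no warning is raised here.
        return 1;
    }
    // NULL and FALSE become 0 here, which reads as "equal".
    return ((int) $value) <=> 0;
}

var_dump(normalizeOrderingResult(-42));   // int(-1)
var_dump(normalizeOrderingResult(0.5));   // int(1)
var_dump(normalizeOrderingResult(null));  // int(0)
var_dump(normalizeOrderingResult(false)); // int(0)
```

The NULL and FALSE cases show why declaring int as the return type is the safer choice: a forgotten return statement would otherwise silently mean "equal".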
Yes. In this case, $a->__compareTo($b) will be attempted first, then $b->__compareTo($a), and finally the default behaviour of $a.
Yes, comparison isn't guaranteed to be symmetrical or transitive. It's up to the implementor to ensure that cases like these do not exist. The engine honours the implementor's logic and does not attempt to make any assertions. Enforcing symmetry and transitivity would require multiple comparisons and assertions per operation, or type restriction. This RFC opts for neither.
Order of attempt:
$a->__equals($b)
$b->__equals($a)
$a->__compareTo($b)
$b->__compareTo($a)
Because this is comparison in the context of equality, __equals takes precedence over __compareTo even if that means using the secondary operand's implementation. If both $a and $b follow sensible logic (within their domain), $a->__compareTo($b) should return 0 anyway. An inconsistency here indicates a logic error in the implementation of either $a or $b.
This came up in discussion as something to consider. We should try to learn from the motivations behind decisions that changed direction in other programming languages so that we don't miss important concerns.
This is a relevant case because Python had a __cmp__ magic method that is similar to the proposed __compareTo, but it became apparent that allowing only three cases for the relative order of objects is too limiting. This led to the introduction of "rich comparison" methods in Python 2.1 on top of the existing __cmp__ method. This allowed for classes to overload individual comparison operators without restriction so that projects like NumPy can achieve expressions like these:
list(np.array([1, 3, 5, 7]) < np.array([2, 1, 6, 6])) # Outputs: [True, False, True, False]
The __cmp__ method was removed completely in Python 3 in favour of rich comparison. However, I've come across a discussion between Python core contributors where they question this decision. That's not to say that the decision was a bad one, but it's clearly a difficult problem to solve perfectly. PEP 207 is a good read for more details about the rich comparison methods.
How does this apply to the RFC?
There is no perfect solution. Python decided on a direction that suited their needs and philosophy. I don't believe that PHP should blindly follow whatever other languages are doing, so we should consider what feels right for PHP.
None.
Extensions that use the compare object handler may be affected by this patch because NULL values (not null pointers) are now passed to the handler. Previously, any object would automatically be considered greater than NULL.
The result of the compare handler will now only be used if the handler returned SUCCESS. Returning FAILURE will fall through to the compare_objects handler. Currently, an object comparison will stop at the compare handler if it is defined, regardless of success.
Extensions will now have the option of implementing the equals handler if they want to separate ordering from equality.
Any functions that use the internal comparison functions will be affected, which includes array functions like sort, array_search and in_array.
PHP 7.3
Since this is a language change, a 2/3 majority is required. The vote ends 2018-07-16.