PHP RFC: Internal operator overloading and GMP improvements
- Version: 0.1
- Date: 2013-05-12
- Author: Nikita Popov nikic@php.net
- Status: Draft
- Target version: PHP 5.6 (or whatever the next one is)
Introduction
PHP offers facilities for large number and decimal arithmetic (GMP and BCMath), but using them is currently cumbersome. This RFC proposes to improve the situation by adding support for operator overloading in internal classes. As an example, this RFC also implements the new API for GMP (and improves GMP in various ways along the way).
TODO: More motivation!
Proposal A: Operator overloading
Operator overloading is implemented by adding a new object handler, do_operation, with the following signature:
typedef int (*zend_object_do_operation_t)(zend_uchar opcode, zval *result, zval *op1, zval *op2 TSRMLS_DC);
Here opcode is the opcode of the operation (e.g. ZEND_ADD), result is the target zval, op1 is the first operand and op2 is the second operand. Binary operations use both operands; for unary operations the second operand is NULL. The return value is either SUCCESS or FAILURE. If FAILURE is returned, the code falls back to the default behavior for the respective operator.
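The fallback rule can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: toy_val, the TOY_* opcode numbers and toy_binary_op are hypothetical stand-ins rather than engine definitions, the TSRMLS_DC parameter is left out, and trying the handler of the first operand before that of the second is an assumption about the dispatch order, not something this RFC specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for engine definitions; none of this is the real Zend API. */
enum { SUCCESS = 0, FAILURE = -1 };
enum { TOY_ADD = 1, TOY_CONCAT = 8 };  /* hypothetical opcode numbers */

typedef struct toy_val toy_val;
typedef int (*toy_do_operation_t)(unsigned char opcode, toy_val *result,
                                  toy_val *op1, toy_val *op2);

struct toy_val {
    long num;                         /* payload of the value */
    toy_do_operation_t do_operation;  /* NULL if the value has no handler */
};

/* Handler of a hypothetical number class that only supports addition. */
static int toy_number_do_operation(unsigned char opcode, toy_val *result,
                                   toy_val *op1, toy_val *op2)
{
    long sum;

    assert(result != NULL && op1 != NULL);
    if (opcode != TOY_ADD || op2 == NULL) {
        return FAILURE;  /* not handled: the caller uses its default behavior */
    }
    sum = op1->num + op2->num;  /* read operands before writing the result */
    result->num = sum;
    result->do_operation = toy_number_do_operation;
    return SUCCESS;
}

/* Engine-side dispatch (assumed order: first operand, then second operand).
 * Returns 1 if a handler produced the result, 0 if the default path ran. */
static int toy_binary_op(unsigned char opcode, toy_val *result,
                         toy_val *op1, toy_val *op2)
{
    if (op1->do_operation != NULL
            && op1->do_operation(opcode, result, op1, op2) == SUCCESS) {
        return 1;
    }
    if (op2->do_operation != NULL
            && op2->do_operation(opcode, result, op1, op2) == SUCCESS) {
        return 1;
    }
    /* Placeholder for the operator's default behavior. */
    result->num = 0;
    result->do_operation = NULL;
    return 0;
}
```

In this model an addition succeeds whenever either operand carries the handler, while an opcode the handler does not recognize ends up on the default path.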
The following opcode values are supported:
+     ZEND_ADD
-     ZEND_SUB
*     ZEND_MUL
/     ZEND_DIV
%     ZEND_MOD
<<    ZEND_SL
>>    ZEND_SR
.     ZEND_CONCAT
|     ZEND_BW_OR
&     ZEND_BW_AND
^     ZEND_BW_XOR
xor   ZEND_BOOL_XOR
~     ZEND_BW_NOT (unary)
!     ZEND_BOOL_NOT (unary)
==    ZEND_IS_EQUAL
!=    ZEND_IS_NOT_EQUAL
<     ZEND_IS_SMALLER
<=    ZEND_IS_SMALLER_OR_EQUAL
The operators >, >=, unary + and unary - are indirectly supported by the following compiler transformations:
$a > $b   ==>  $b < $a
$a >= $b  ==>  $b <= $a
+$a       ==>  0 + $a
-$a       ==>  0 - $a
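A consequence of these transformations is that a handler only ever sees the "smaller" comparisons, with the operands swapped, and sees unary plus and minus as binary operations with 0 as the first operand. The sketch below models this with plain long values; the xf_* function names are hypothetical and exist only for illustration.

```c
/* The only primitives the toy engine implements. */
static int  xf_is_smaller(long a, long b)          { return a < b; }
static int  xf_is_smaller_or_equal(long a, long b) { return a <= b; }
static long xf_add(long a, long b)                 { return a + b; }
static long xf_sub(long a, long b)                 { return a - b; }

/* $a > $b   ==>  $b < $a   (operands swapped) */
static int xf_is_greater(long a, long b)
{
    return xf_is_smaller(b, a);
}

/* $a >= $b  ==>  $b <= $a  (operands swapped) */
static int xf_is_greater_or_equal(long a, long b)
{
    return xf_is_smaller_or_equal(b, a);
}

/* +$a  ==>  0 + $a */
static long xf_unary_plus(long a)
{
    return xf_add(0, a);
}

/* -$a  ==>  0 - $a */
static long xf_unary_minus(long a)
{
    return xf_sub(0, a);
}
```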
The compound assignment operators +=, -=, *=, /=, %=, <<=, >>=, .=, |=, &= and ^= are supported by the runtime transformation $a op= $b => $a = $a op $b.
The prefix operators ++ and -- are supported by the runtime transformations ++$a => $a = $a + 1 and --$b => $b = $b - 1. The same applies to the corresponding postfix operators, with the difference that a copy of the old value is returned (rather than the newly computed value).
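The compound assignment and increment rules can be modeled on top of a single binary primitive. The following is a minimal sketch, assuming a hypothetical num_val value type and hypothetical num_* function names; it only illustrates the order of operations and is not the engine's code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long num; } num_val;  /* hypothetical value type */

/* The single primitive: binary addition producing a fresh value. */
static num_val num_add(num_val a, num_val b)
{
    num_val r;
    r.num = a.num + b.num;
    return r;
}

/* $a += $b  =>  $a = $a + $b */
static void num_assign_add(num_val *a, num_val b)
{
    assert(a != NULL);
    *a = num_add(*a, b);
}

/* ++$a  =>  $a = $a + 1; the expression yields the new value. */
static num_val num_pre_inc(num_val *a)
{
    num_val one;
    one.num = 1;
    num_assign_add(a, one);
    return *a;
}

/* $a++ performs the same update but yields a copy of the old value. */
static num_val num_post_inc(num_val *a)
{
    num_val old = *a;  /* copy taken before the update */
    num_val one;
    one.num = 1;
    num_assign_add(a, one);
    return old;
}
```

Starting from a value of 41, num_pre_inc yields 42, a following num_post_inc also yields 42 (the old value), and the variable then holds 43.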
The operators === and !== are explicitly not supported. For objects they have clearly defined semantics (the same object handle) and I see no reason why one should be allowed to break this behavior.
The ability to overload operators is not exposed to userland classes.
Proposal B: GMP Improvements
Currently GMP is based on resources. This has several disadvantages:
- Cannot be serialized
- Cannot be directly cast to int/float/string
- Cannot be (meaningfully) dumped using var_dump
- Coerces to an integer by returning the resource ID. This can easily lead to bugs if you accidentally use the resource in an arithmetic operation. For example, a GMP factorial test from our test suite has been computing the factorial of the resource ID, rather than the factorial of the number.
- Cannot make use of the new operator overloading APIs
This RFC proposes to make GMP use objects (of type GMP) as the underlying structure. Using this new structure, the RFC implements support for serialization, casting, dumping and overloaded operators.
The following examples show some of the new behaviors:
Casting
$n = gmp_init(42);
echo $n, "\n";            // 42
var_dump((string) $n);    // string(2) "42"
var_dump((int) $n);       // int(42)
var_dump((float) $n);     // float(42)
Serializing and dumping
var_dump($n = gmp_init(42));
var_dump($s = serialize($n));
var_dump(unserialize($s));

// outputs
object(GMP)#%d (1) {
  ["num"]=>
  string(2) "42"
}
string(33) "O:3:"GMP":1:{s:3:"num";s:2:"42";}"
object(GMP)#%d (1) {
  ["num"]=>
  string(2) "42"
}
Overloaded operators
$a = gmp_init(42);
$b = gmp_init(17);
var_dump($a + $b);
var_dump($a + 17);
var_dump(42 + $b);

// Outputs the following 3 times:
object(GMP)#%d (1) {
  ["num"]=>
  string(2) "59"
}
The following operators are supported: +, -, *, /, %, |, &, ^, ~, ==, !=, <, <=, >, >=. The operators << and >> are not yet supported, but support is planned. All operators work with two GMP values or one GMP value and one GMP-coercible value (e.g. strings and integers).