rfc:operator_overloading_gmp
Differences
This shows you the differences between two versions of the page.
rfc:operator_overloading_gmp [2013/05/12 14:11] – created nikic | rfc:operator_overloading_gmp [2013/06/17 15:20] – Close vote, RFC accepted nikic | ||
---|---|---|---|
Line 4: | Line 4: | ||
* Date: 2013-05-12 | * Date: 2013-05-12 | ||
* Author: Nikita Popov < | * Author: Nikita Popov < | ||
- | * Status: | + | * Status: |
* Patch: https:// | * Patch: https:// | ||
* Target version: PHP 5.6 (or whatever the next one is) | * Target version: PHP 5.6 (or whatever the next one is) | ||
- | PHP offers facilities for large number and decimal arithmetic (GMP and BCMath), but currently using those is a PITA. This RFC proposes to improve the situation by adding support for *operator overloading in internal classes*. Furthermore this RFC exemplarily implements the new API for GMP (and improves the GMP in various ways along the way). | + | ===== Introduction ===== |
- | ===== Proposal ===== | + | PHP offers facilities for large number and decimal arithmetic (GMP and BCMath), but using them is currently a PITA. This RFC proposes to improve the situation by adding support for **operator overloading in internal classes**. As an example, operator overloading is implemented for the GMP extension, which this RFC also improves in several other ways. |
+ | ===== Proposal A: Operator overloading ===== | ||
- | ===== Backward Incompatible Changes ===== | + | Note: This proposal is only about **internal** operator overloading and **not** about userland overloading. |
+ | ==== Why operator overloading? | ||
- | ===== Impact to Existing Extensions ===== | + | There are several reasons why overloaded operators are preferable over '' |
+ | The first is that code using overloaded operators is simply **more readable**. As an example, consider the following two code snippets, one using '' | ||
- | ===== Open Issues | + | <code php> |
+ | $result | ||
+ | gmp_add( | ||
+ | gmp_mul($c0, | ||
+ | gmp_add( | ||
+ | gmp_mul($c1, | ||
+ | gmp_mul($c2, | ||
+ | ) | ||
+ | ), | ||
+ | gmp_mul($n0, | ||
+ | ); | ||
+ | |||
+ | $result | ||
+ | $c0 * $ms0 * gmp_invert($ms0, | ||
+ | + $c1 * $ms1 * gmp_invert($ms1, | ||
+ | + $c2 * $ms2 * gmp_invert($ms2, | ||
+ | ) % ($n0 * $n1 * $n2); | ||
+ | </ | ||
+ | |||
+ | Even without understanding what the above code does (it's an excerpt from a Coppersmith attack on RSA), it should be obvious that the second snippet is a lot clearer. It makes the structure of the computation immediately visible (three products are summed and the result is reduced by the modulus), whereas the function-based code actively hides that structure. For mathematical operations infix notation just comes a lot more naturally. | ||
+ | |||
+ | Another advantage of overloaded operators is that they allow **polymorphism** for functions doing arithmetic operations. As an example, consider manually implementing a function like '' | ||
+ | |||
+ | <code php> | ||
+ | function powm($base, $exponent, $modulus) { | ||
+ |     // Square-and-multiply; the result starts at the multiplicative identity. | ||
+ |     $result = 1; | ||
+ |     while ($exponent > 0) { | ||
+ |         if ($exponent % 2 == 1) { | ||
+ |             $result = $result * $base % $modulus; | ||
+ |             $exponent--; | ||
+ |         } | ||
+ |         $exponent /= 2; | ||
+ |         $base = ($base * $base) % $modulus; | ||
+ |     } | ||
+ |     return $result; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | With operator overloading this function will work with **any** type of " | ||
+ | |||
+ | <code php> | ||
+ | var_dump(powm(123, | ||
+ | var_dump(powm(gmp_init(123), | ||
+ | var_dump(powm( | ||
+ | gmp_init(" | ||
+ | gmp_init(" | ||
+ | gmp_init(" | ||
+ | )); // GMP(36912902142130032810) | ||
+ | </ | ||
+ | |||
+ | Without operator overloading this would not be possible. Instead one would have to implement the same function once using the normal '' | ||
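To make the duplication concrete, here is a rough sketch of what a function-based GMP counterpart of `powm()` could look like. The name `powm_gmp` is hypothetical and not part of the RFC; the sketch assumes the GMP extension is loaded and only uses the existing `gmp_*` functions.

```php
<?php
// Hypothetical function-based counterpart of powm() for GMP numbers.
// Same square-and-multiply loop, but every operator becomes a function call.
function powm_gmp($base, $exponent, $modulus) {
    $result = gmp_init(1);
    while (gmp_cmp($exponent, 0) > 0) {
        if (gmp_cmp(gmp_mod($exponent, 2), 1) == 0) {
            $result = gmp_mod(gmp_mul($result, $base), $modulus);
            $exponent = gmp_sub($exponent, 1);
        }
        $exponent = gmp_div_q($exponent, 2);
        $base = gmp_mod(gmp_mul($base, $base), $modulus);
    }
    return $result;
}

// Only run the demonstration when the GMP extension is available.
if (extension_loaded('gmp')) {
    echo gmp_strval(powm_gmp(4, 13, 497)), "\n"; // 445
}
```

The loop is identical to the operator-based version, yet it cannot be shared with plain integers or floats, which is the maintenance cost the paragraph above describes.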
+ | |||
+ | ==== Applications of operator overloading ==== | ||
+ | |||
+ | Some examples of what the operator overloading capability can be used for, apart from the bignum arithmetic outlined in this RFC: | ||
+ | |||
+ | * Decimal arithmetic. This is particularly important in PHP as PHP commonly deals with monetary values which **cannot** be represented exactly as floating point numbers. | ||
+ | * Date calculations. Also very common in PHP ('' | ||
+ | * Ratio and complex arithmetic | ||
+ | * Unsigned arithmetic and arithmetic on other integral types PHP does not support (e.g. cross platform 64bit integers) | ||
+ | * Vector and matrix calculations | ||
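The point about monetary values can be illustrated with a short sketch. It relies only on PHP's floats being IEEE 754 doubles; the variable names are illustrative and not taken from the RFC.

```php
<?php
// Binary floating point cannot represent 0.1 or 0.2 exactly,
// so their sum differs from 0.3 in the last bits.
$floatSum = 0.1 + 0.2;
var_dump($floatSum == 0.3); // bool(false)

// Doing the same calculation in integer cents avoids the rounding error.
$centSum = 10 + 20;
var_dump($centSum === 30); // bool(true)
```

A decimal type with overloaded operators would allow the first form to be written naturally while behaving like the second.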
+ | |||
+ | Due to potential pitfalls of misusing operator overloading known from other languages (most notably C++), the use of this new feature should be limited to cases where there are clear definitions of the behavior of all overloaded operators. The application of this feature should be for mathematical use cases only (as noted above), and not ' | ||
+ | |||
+ | ==== Technical proposal ==== | ||
+ | |||
+ | The operator overloading is implemented using two new object handlers: | ||
+ | |||
+ | === do_operation === | ||
+ | |||
+ | The '' | ||
+ | |||
+ | < | ||
+ | typedef int (*zend_object_do_operation_t)(zend_uchar opcode, zval *result, zval *op1, zval *op2 TSRMLS_DC); | ||
+ | </ | ||
+ | |||
+ | Here '' | ||
+ | |||
+ | The following '' | ||
+ | |||
+ | < | ||
+ | + | ||
+ | - | ||
+ | * | ||
+ | / | ||
+ | % | ||
+ | << | ||
+ | >> | ||
+ | . | ||
+ | | | ||
+ | & | ||
+ | ^ | ||
+ | xor ZEND_BOOL_XOR | ||
+ | ~ | ||
+ | ! | ||
+ | </ | ||
+ | |||
+ | The unary '' | ||
+ | |||
+ | < | ||
+ | +$a ==> | ||
+ | -$a ==> | ||
+ | </ | ||
+ | |||
+ | The compound assignment operators '' | ||
+ | |||
+ | The prefix operators '' | ||
+ | |||
+ | === compare === | ||
+ | |||
+ | The '' | ||
+ | |||
+ | < | ||
+ | typedef int (*zend_object_compare_zvals_t)(zval *result, zval *op1, zval *op2 TSRMLS_DC); | ||
+ | </ | ||
+ | |||
+ | Here '' | ||
+ | |||
+ | The '' | ||
+ | |||
+ | The difference between the '' | ||
+ | |||
+ | ===== Proposal B: GMP Improvements ===== | ||
+ | |||
+ | Currently GMP is based on resources. This has several disadvantages: | ||
+ | |||
+ | * Cannot be serialized | ||
+ | * Cannot be directly cast to int/ | ||
+ | * Cannot be (meaningfully) dumped using '' | ||
+ | * Coerces to an integer by returning the resource ID. This can easily lead to bugs if you accidentally use the resource in an arithmetic operation. For example, a GMP factorial test from our test suite has been computing the factorial of the resource ID, rather than the factorial of the number. | ||
+ | * Cannot make use of the new operator overloading APIs | ||
+ | * Bad reporting on leaks. During the port I found that many functions leak resources, especially in error conditions. | ||
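The integer-coercion pitfall in the list above is not specific to GMP; it applies to any resource. As a hedged illustration that does not need the GMP extension, the sketch below uses a `php://memory` stream as an arbitrary stand-in for a GMP resource and shows that an integer cast yields the resource ID rather than anything related to the stored value.

```php
<?php
// Any resource will do; a memory stream stands in for a GMP resource here.
$stream = fopen('php://memory', 'w+');
fwrite($stream, '42');

// The cast returns the internal resource ID, not the stored value.
$id = (int) $stream;
var_dump(is_int($id)); // bool(true)

fclose($stream);
```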
+ | |||
+ | This RFC proposes to make GMP use objects (of type '' | ||
+ | |||
+ | === Casting === | ||
+ | |||
+ | <code php> | ||
+ | $n = gmp_init(42); | ||
+ | echo $n, " | ||
+ | var_dump((string) $n); // string(2) " | ||
+ | var_dump((int) $n); // int(42) | ||
+ | var_dump((float) $n); // float(42) | ||
+ | </ | ||
+ | |||
+ | === Serializing and dumping === | ||
+ | |||
+ | <code php> | ||
+ | var_dump($n = gmp_init(42)); | ||
+ | var_dump($s = serialize($n)); | ||
+ | var_dump(unserialize($s)); | ||
+ | |||
+ | // outputs | ||
+ | |||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | string(33) " | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | === Cloning === | ||
+ | |||
+ | <code php> | ||
+ | $a = gmp_init(3); | ||
+ | $b = clone $a; | ||
+ | gmp_clrbit($a, | ||
+ | var_dump($a, | ||
+ | |||
+ | // Output: (Note that $b is still 3) | ||
+ | |||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(1) " | ||
+ | } | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(1) " | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | === Overloaded operators === | ||
+ | |||
+ | <code php> | ||
+ | $a = gmp_init(42); | ||
+ | $b = gmp_init(17); | ||
+ | |||
+ | var_dump($a + $b); | ||
+ | var_dump($a + 17); | ||
+ | var_dump(42 + $b); | ||
+ | |||
+ | // Outputs the following 3 times: | ||
+ | |||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | The following operators are supported: '' | ||
+ | |||
+ | === Overloaded operators: Comparison === | ||
+ | |||
+ | <code php> | ||
+ | $a = gmp_init(42); | ||
+ | var_dump($a == 42, $a == 17, $a < 40, $a < 100); | ||
+ | // | ||
+ | </ | ||
+ | |||
+ | Comparison is supported via the '' | ||
+ | |||
+ | <code php> | ||
+ | $arr = [gmp_init(0), | ||
+ | sort($arr); | ||
+ | var_dump($arr); | ||
+ | |||
+ | // Outputs | ||
+ | |||
+ | array(4) { | ||
+ | [0]=> | ||
+ | int(-3) | ||
+ | [1]=> | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(1) " | ||
+ | } | ||
+ | [2]=> | ||
+ | int(1) | ||
+ | [3]=> | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(1) " | ||
+ | } | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | === Other minor changes === | ||
+ | |||
+ | During the refactoring of the implementation a few additional, small changes were made: | ||
+ | |||
+ | * If you pass a GMP instance to '' | ||
+ | * Previously some functions like '' | ||
+ | * Due to the previous change '' | ||
+ | * If you pass an invalid rounding mode to a function, you will now get a warning. | ||
+ | |||
+ | ===== Backward Incompatible Changes | ||
+ | The addition of operator overloading does not break backwards compatibility. | ||
+ | The switch from GMP resources to objects can break scripts that checked whether something is a GMP integer using code like '' | ||
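For code that has to run both before and after the switch, a type check can accept either representation. The following is only a sketch: the helper name `is_gmp_number` is hypothetical, and the resource type name `GMP integer` is assumed from the old resource-based extension.

```php
<?php
// Hypothetical helper: true for GMP numbers in both representations.
function is_gmp_number($value) {
    // Old representation: a resource of type "GMP integer" (assumed name).
    if (is_resource($value)) {
        return get_resource_type($value) === 'GMP integer';
    }
    // New representation: an instance of the GMP class.
    // instanceof is safe even when the GMP class is not defined.
    return $value instanceof GMP;
}
```

Calling `is_gmp_number(gmp_init(42))` should then return `true` under either implementation, while plain integers and unrelated resources yield `false`.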
- | ===== Unaffected PHP Functionality | + | ===== Performance |
+ | The addition of operator overloading does not affect performance (or at least I couldn' | ||
- | ===== Future Scope ===== | + | The changes to GMP improve performance in all scenarios I measured (4M runs each): |
+ | < | ||
+ | | ||
+ | a) gmp_add($a, $b) | ||
+ | b) gmp_add($a, 17) | ||
+ | c) gmp_add(42, $b) | ||
+ | d) $a + $b | ||
+ | </ | ||
- | ===== Proposed Voting Choices ===== | + | The difference between tests b) and c) is that the former makes use of an operator specialized on integers rather than creating a temporary GMP instance. |
+ | ===== Patch ===== | ||
- | ===== Patches and Tests ===== | + | The pull request for this RFC can be found here: https:// |
+ | ===== Vote ===== | ||
- | ===== References ===== | + | The vote started on 10.06.2013 and ended on 17.06.2013. Both proposals are accepted. |
+ | <doodle title=" | ||
+ | * Internal operator overloading | ||
+ | * GMP changes | ||
+ | * None | ||
+ | </ | ||
- | ===== Rejected Features | + | ===== Previous discussions |
+ | http:// |
rfc/operator_overloading_gmp.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1