rfc:saner-inc-dec-operators
Differences
This shows you the differences between two versions of the page.
rfc:saner-inc-dec-operators [2023/01/14 14:15] – Version 0.2 girgias | rfc:saner-inc-dec-operators [2023/07/17 14:52] (current) – Implemented girgias | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Path to Saner Increment/ | ====== PHP RFC: Path to Saner Increment/ | ||
- | * Version: 0.2 | + | * Version: 0.3 |
* Date: 2022-11-21 | * Date: 2022-11-21 | ||
* Author: George Peter Banyard, < | * Author: George Peter Banyard, < | ||
- | * Status: | + | * Status: |
- | * Target Version: PHP 8.3 and PHP 9.0 | + | * Target Version: PHP 8.3, PHP 8.(3+x), |
- | * Implementation: | + | * Implementation: |
* First Published at: [[http:// | * First Published at: [[http:// | ||
Line 14: | Line 14: | ||
(([[rfc: | (([[rfc: | ||
(([[rfc: | (([[rfc: | ||
- | have been made to improve the behaviour of these operators. | + | have been made to improve the behaviour of these operators, but none have been implemented. |
- | But none have been implemented. | + | The goal of this RFC is to normalize the behaviour of '' |
- | As a design principle, we will follow that the increment and decrement operators should behave like adding or subtracting 1 respectively. | + | Therefore, we will first look at the behaviour of arithmetic operators with various types, then detail the current behaviour of the increment and decrement operators, and finally propose various changes to fix the discrepancies. |
- | + | ||
- | Therefore, we will first look at the behaviour of arithmetic operators with various types, then detail the current behaviour of the increment and decrement operators, and finally propose various changes to fix the dependencies. | + | |
==== Behaviour of arithmetic operators ==== | ==== Behaviour of arithmetic operators ==== | ||
Line 26: | Line 24: | ||
< | < | ||
- | In this context if either operand is a float (or not interpretable as an int), both operands are interpreted as floats, and the result will be a float. Otherwise, the operands will be interpreted as ints, and the result will also be an int. As of PHP 8.0.0, if one of the operands cannot be interpreted a TypeError is thrown. | + | In this context if either operand is a float (or not interpretable as an int), both operands are interpreted as floats, and the result will be a float. Otherwise, the operands will be interpreted as ints, and the result will also be an int. As of PHP 8.0.0, if one of the operands cannot be interpreted a '' |
</ | </ | ||
Line 35: | Line 33: | ||
* '' | * '' | ||
* '' | * '' | ||
- | * '' | + | |
- | + | < | |
- | All other cases throw a '' | + | var_dump(null + 1); // int(1) |
+ | var_dump(null - 1); // int(-1) | ||
+ | |||
+ | var_dump(false + 1); // int(1) | ||
+ | var_dump(false - 1); // int(-1) | ||
+ | |||
+ | var_dump(true + 1); // int(2) | ||
+ | var_dump(true - 1); // int(0) | ||
+ | |||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | </ | ||
+ | |||
+ | Resources, non-numeric strings, arrays (except when adding two arrays together), and objects that are instances of userland classes throw a '' | ||
+ | |||
+ | Object values that are instances of an internal class that overload the arithmetic operator (by implementing the '' | ||
+ | If an internal class implements a custom '' | ||
+ | Otherwise, a '' | ||
+ | |||
+ | One example | ||
+ | <PHP> | ||
+ | $o = gmp_init(36); | ||
+ | var_dump($o + 1); | ||
+ | /* | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | */ | ||
+ | </ | ||
+ | |||
+ | The only examples of an internal class that does not implement | ||
+ | < | ||
+ | $o = tidy_parse_string("< | ||
+ | var_dump($o + 1); // int(1) | ||
+ | </ | ||
Note: the empty string has **// | Note: the empty string has **// | ||
+ | |||
+ | Note: If an internal class implements a custom '' | ||
+ | <PHP> | ||
+ | $o = curl_init(); | ||
+ | var_dump((int) $o); // e.g. int(1) | ||
+ | var_dump($o + 1); // Fatal error: Uncaught TypeError: Unsupported operand types: CurlHandle + int | ||
+ | </ | ||
==== Current behaviour of the increment and decrement operators ==== | ==== Current behaviour of the increment and decrement operators ==== | ||
Line 48: | Line 91: | ||
* the value is of type '' | * the value is of type '' | ||
* the value is of type '' | * the value is of type '' | ||
- | * the value is of type '' | ||
* the value is of type '' | * the value is of type '' | ||
+ | |||
+ | <PHP> | ||
+ | $int = 10; | ||
+ | var_dump(++$int); | ||
+ | $int = 10; | ||
+ | var_dump(--$int); | ||
+ | |||
+ | $float = 5.7; | ||
+ | var_dump(++$float); | ||
+ | $float = 5.7; | ||
+ | var_dump(--$float); | ||
+ | |||
+ | $false = false; | ||
+ | var_dump(++$false); | ||
+ | var_dump(--$false); | ||
+ | $true = true; | ||
+ | var_dump(++$true); | ||
+ | var_dump(--$true); | ||
+ | |||
+ | $stringInt = " | ||
+ | var_dump(++$stringInt); | ||
+ | var_dump(--$stringInt); | ||
+ | $stringFloat = " | ||
+ | var_dump(++$stringFloat); | ||
+ | var_dump(--$stringFloat); | ||
+ | </ | ||
+ | |||
+ | Object values that are instances of an internal class that overload the arithmetic operator (by implementing the '' | ||
+ | <PHP> | ||
+ | $o = gmp_init(36); | ||
+ | var_dump(++$o); | ||
+ | /* | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | */ | ||
+ | |||
+ | $o = tidy_parse_string("< | ||
+ | var_dump(++$o); | ||
+ | </ | ||
- | For non-numeric '' | + | For non-numeric '' |
=== Current behaviour of the decrement operator with values of type null and non-numeric string === | === Current behaviour of the decrement operator with values of type null and non-numeric string === | ||
- | No action is performed, except if the value is the empty string, in which case the value is set to the integer '' | + | If the value is of type '' |
+ | |||
+ | If the value is a non-numeric '' | ||
+ | |||
+ | < | ||
+ | $n = null; | ||
+ | --$n; | ||
+ | var_dump($n); | ||
+ | |||
+ | $s = " | ||
+ | --$s; | ||
+ | var_dump($s); | ||
+ | |||
+ | $e = ""; | ||
+ | --$e; | ||
+ | var_dump($e); | ||
+ | </ | ||
=== Current behaviour of the increment operator with values of type null and non-numeric string === | === Current behaviour of the increment operator with values of type null and non-numeric string === | ||
- | If the value is of type '' | + | If the value is of type '' |
If the value is a non-numeric '' | If the value is a non-numeric '' | ||
+ | |||
+ | <PHP> | ||
+ | $n = null; | ||
+ | ++$n; | ||
+ | var_dump($n); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; | ||
+ | var_dump($s); | ||
+ | |||
+ | $e = ""; | ||
+ | ++$e; | ||
+ | var_dump($e); | ||
+ | </ | ||
Note: this means that the behaviour around the empty string differs between both operators. Because for '' | Note: this means that the behaviour around the empty string differs between both operators. Because for '' | ||
Line 77: | Line 190: | ||
*/ | */ | ||
</ | </ | ||
+ | |||
+ | === Details about the PERL String increment feature === | ||
+ | |||
+ | If the string to increment is the empty string, return the string ''" | ||
+ | |||
+ | Otherwise, the last byte of the string is inspected: | ||
+ | * If it is in-between " | ||
+ | * If it is " | ||
+ | * Otherwise, do nothing. | ||
+ | |||
+ | If, and only if, a carry value is held after having inspected the first byte of the string, the string is prepended with the character " | ||
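
The byte-wise rules above can be sketched as a userland function. This is only an illustration: it assumes the three character classes are the ASCII ranges a-z, A-Z and 0-9, and that the prepended character matches the class of the last byte that wrapped around. The name ''perl_increment_sketch()'' is hypothetical, and unlike ''++'' the sketch never treats its input as a numeric string.

<PHP>
<?php

// Hypothetical helper (not part of PHP): mirrors the rules described above
// without any numeric-string type juggling.
function perl_increment_sketch(string $s): string
{
    if ($s === '') {
        return '1';
    }

    // [first byte, last byte] of each class that takes part in the increment.
    $ranges = [['a', 'z'], ['A', 'Z'], ['0', '9']];
    $prefix = '';

    for ($pos = strlen($s) - 1; $pos >= 0; $pos--) {
        $byte = $s[$pos];
        $range = null;
        foreach ($ranges as $candidate) {
            if (ord($byte) >= ord($candidate[0]) && ord($byte) <= ord($candidate[1])) {
                $range = $candidate;
                break;
            }
        }
        if ($range === null) {
            // Any other byte stops the operation and drops a pending carry.
            return $s;
        }
        if ($byte !== $range[1]) {
            // No wrap-around: bump this byte and stop.
            $s[$pos] = chr(ord($byte) + 1);
            return $s;
        }
        // Wrap around and carry into the byte to the left.
        $s[$pos] = $range[0];
        $prefix = $range[0] === '0' ? '1' : $range[0];
    }

    // Still carrying after the first byte: prepend one more character.
    return $prefix . $s;
}

var_dump(perl_increment_sketch('Az'));  // string(2) "Ba"
var_dump(perl_increment_sketch('zz'));  // string(3) "aaa"
var_dump(perl_increment_sketch('a9'));  // string(2) "b0"
var_dump(perl_increment_sketch('C Z')); // string(3) "C A"
</PHP>

Returning early on the first byte outside the three ranges is what makes whitespace or punctuation act as a barrier for the carry in this sketch.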
+ | |||
+ | Here are a couple examples demonstrating these rules: | ||
+ | <PHP> | ||
+ | <?php | ||
+ | |||
+ | // Empty string | ||
+ | $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With multiple period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | The behaviour is slightly different than that of [[https:// | ||
+ | |||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | # Empty string | ||
+ | my $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With multiple period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | However, the biggest problem is with strings that can be interpreted as a number in scientific notation, because they will never be interpreted as an alphanumeric string to be incremented using the PERL increment feature, but converted to float first: | ||
+ | <PHP> | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | While Raku also supports arithmetic operations with strings that represent numbers in scientific notation, it does not perform any type juggling at all for the increment and decrement operators (therefore having the same behaviour as currently for boolean and its corresponding '' | ||
+ | |||
+ | Therefore the above snippet in Raku gives a consistent result: | ||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | my $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | ===== Summary of behavioural differences ===== | ||
+ | |||
+ | | | ||
+ | ^ '' | ||
+ | ^ '' | ||
+ | ^ '' | ||
+ | ^ ''""'' | ||
+ | ^ ''" | ||
+ | ^ Tidy Object | '' | ||
===== Proposal ===== | ===== Proposal ===== | ||
- | The proposal is to create a path so that in the next major version of PHP the increment and decrement operators behave identically to adding/ | + | The proposal is to create a path so that in the next major version of PHP the increment and decrement operators behave identically to adding/ |
To achieve this, we propose the following changes to be made in the next minor version of PHP: | To achieve this, we propose the following changes to be made in the next minor version of PHP: | ||
+ | |||
+ | * Add the < | ||
* Add support to increment/ | * Add support to increment/ | ||
- | * to emit <php>E_WARNING</php>s when the operators currently do not have any behaviour when they would if replace with a proper addition/subtraction | + | <PHP> |
- | * Deprecate using those operators with non-numeric strings. | + | $o = tidy_parse_string("< |
+ | var_dump(++$o); | ||
+ | </ | ||
+ | * to emit < | ||
+ | <PHP> | ||
+ | $n = null; | ||
+ | --$n; // Warning: Decrement on type null has no effect, this will change in the next major version of PHP | ||
+ | var_dump($n); | ||
+ | |||
+ | $false = false; | ||
+ | --$false; // Warning: Decrement on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($false); | ||
+ | ++$false; // Warning: Increment on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($false); | ||
+ | |||
+ | $true = true; | ||
+ | --$true; // Warning: Decrement on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($true); | ||
+ | ++$true; // Warning: Increment on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($true); | ||
+ | </ | ||
+ | |||
+ | |||
+ | * Deprecate using the decrement operator with non-numeric strings. | ||
+ | <PHP> | ||
+ | $empty = ""; | ||
+ | --$empty; // Deprecated: Decrement on empty string is deprecated as non-numeric | ||
+ | var_dump($empty); | ||
+ | |||
+ | $s = " | ||
+ | --$s; // Deprecated: Decrement on non-numeric string has no effect and is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
+ | |||
+ | * Deprecate using the increment operator with strings that are not strictly alphanumeric. | ||
+ | <PHP> | ||
+ | $empty = ""; | ||
+ | ++$empty; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($empty); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // No Deprecation | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = "Z "; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " Z"; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
+ | |||
+ | In a follow-up minor version of PHP the following changes will take place: | ||
+ | * Deprecate using the increment operator with non-numeric strings. | ||
+ | <PHP> | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-numeric string is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
In the next major version of PHP the following changes will take place: | In the next major version of PHP the following changes will take place: | ||
Line 92: | Line 478: | ||
* Non-numeric string values throw a '' | * Non-numeric string values throw a '' | ||
+ | ==== Semantics of str_increment() and str_decrement() ==== | ||
+ | |||
+ | The signatures of the functions are: | ||
+ | <PHP> | ||
+ | function str_increment(string $string): string {} | ||
+ | function str_decrement(string $string): string {} | ||
+ | </ | ||
+ | |||
+ | If < | ||
+ | |||
+ | If decrementing < | ||
+ | |||
+ | As those functions would not perform any type juggling, strings that can be interpreted as numbers in scientific notation will not be implicitly converted to float. | ||
+ | |||
+ | <PHP> | ||
+ | $s = " | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | </ | ||
==== Cost/ | ==== Cost/ | ||
- | PHP currently has 6 main and 3 operation specific type juggling contexts. | + | PHP currently has 6 main and 4 operation-specific type juggling contexts. |
- | The main 6 are documented in the userland manual on the type juggling page and are as follows: | + | The main 6 are documented in the userland manual on the [[https:// |
* Numeric | * Numeric | ||
* String | * String | ||
Line 104: | Line 511: | ||
* Function | * Function | ||
- | The 3 operation specific | + | The 4 operation-specific |
* Increment/ | * Increment/ | ||
* String offsets | * String offsets | ||
* Array offsets | * Array offsets | ||
- | | + | |
With the semantics proposed in this RFC the increment/ | With the semantics proposed in this RFC the increment/ | ||
- | The drawback of this approach is the deprecation, | + | The drawback of this approach is the deprecation, |
+ | However, | ||
+ | and adding support for string decrements | ||
+ | makes us believe | ||
+ | |||
+ | Therefore, we consider the value of reducing the semantic complexity of PHP higher than keeping support for this feature | ||
+ | The introduction of the < | ||
+ | < | ||
+ | function str_increment_polyfill(string $s): string { | ||
+ | if (is_numeric($s)) { | ||
+ | $offset = stripos($s, ' | ||
+ | if ($offset !== false) { | ||
+ | /* Using increment operator would cast the string to float | ||
+ | * Therefore we manually increment it to convert it to an " | ||
+ | $c = $s[$offset]; | ||
+ | $c++; | ||
+ | $s[$offset] = $c; | ||
+ | $s++; | ||
+ | $s[$offset] = match ($s[$offset]) { | ||
+ | ' | ||
+ | ' | ||
+ | ' | ||
+ | ' | ||
+ | }; | ||
+ | return $s; | ||
+ | } | ||
+ | } | ||
+ | return ++$s; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | ==== Impact of deprecating the PERL string increment feature on userland ==== | ||
+ | |||
+ | To determine the impact of this RFC on userland, the static analysis tool [[https://www.exakat.io/en/|Exakat]] was used. We analyzed 2909 open source projects, including the top 1000 composer packages, plus various private enterprise code bases. ((Raw results of the analysis are available as a [[https:// | ||
+ | |||
+ | The only non-false-positive use cases using the PERL string increment feature are: | ||
+ | |||
+ | * Generating a list of valid Unicode (or ASCII) characters. The most popular project using this is HTMLPurifier, | ||
+ | * Generating sequential IDs. The main library doing this is amphp/amp; however, a lot of other projects depend on this library. | ||
+ | * Incrementing a spreadsheet column. | ||
+ | |||
+ | In any of these cases, no deprecation notices would be emitted | ||
+ | As the first stage of this RFC also provides the < | ||
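
For the spreadsheet column use case listed above, a migration could look roughly like the following. This is a sketch under stated assumptions: ''next_column()'' is a hypothetical helper name, column names are assumed to consist of uppercase ASCII letters only, and the ''function_exists()'' fallback is just one way of supporting PHP versions that predate ''str_increment()''.

<PHP>
<?php

// Hypothetical helper: returns the spreadsheet column that follows $column.
function next_column(string $column): string
{
    if (preg_match('/^[A-Z]+\z/', $column) !== 1) {
        throw new InvalidArgumentException('Expected a column name such as "A" or "AZ"');
    }
    if (function_exists('str_increment')) {
        // PHP 8.3+: explicit string increment, no type juggling involved.
        return str_increment($column);
    }
    // Older PHP: a purely alphabetic string is never numeric,
    // so ++ performs the string increment.
    return ++$column;
}

var_dump(next_column('A'));  // string(1) "B"
var_dump(next_column('Z'));  // string(2) "AA"
var_dump(next_column('AZ')); // string(2) "BA"
</PHP>

Validating the input first keeps the fallback branch away from strings that ''++'' could interpret as numbers.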
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
Line 124: | Line 574: | ||
One possible future scope is to add support to both arithmetic operations and the increment/ | One possible future scope is to add support to both arithmetic operations and the increment/ | ||
+ | |||
+ | One other possible extension is to add a < | ||
===== Proposed PHP Version ===== | ===== Proposed PHP Version ===== | ||
- | Next minor version, i.e. PHP 8.3.0, and next major version, i.e. PHP 9.0.0. | + | Next minor version, i.e. PHP 8.3.0, follow-up minor version, e.g. PHP 8.4.0, and next major version, i.e. PHP 9.0.0. |
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
Line 133: | Line 585: | ||
As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | ||
- | Voting started on 2023-XX-XX and will end on 2023-XX-XX. | + | Voting started on 2023-06-28 and will end on 2023-07-12. |
<doodle title=" | <doodle title=" | ||
* Yes | * Yes | ||
Line 141: | Line 593: | ||
===== Implementation ===== | ===== Implementation ===== | ||
- | GitHub pull request: https:// | + | GitHub pull request: |
After the project is implemented, | After the project is implemented, | ||
- | * the version(s) it was merged into | + | * Version: PHP 8.3 |
- | * a link to the git commit(s) | + | * Implementation : |
* a link to the PHP manual entry for the feature | * a link to the PHP manual entry for the feature | ||
===== References ===== | ===== References ===== | ||
rfc/saner-inc-dec-operators.1673705713.txt.gz · Last modified: 2023/01/14 14:15 by girgias