rfc:saner-inc-dec-operators
Differences
This shows you the differences between two versions of the page.
rfc:saner-inc-dec-operators [2022/12/02 10:36] – Some rewording and link fixes girgias | rfc:saner-inc-dec-operators [2023/07/17 14:52] (current) – Implemented girgias | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Path to Saner Increment/ | ====== PHP RFC: Path to Saner Increment/ | ||
- | * Version: 0.1 | + | * Version: 0.3 |
* Date: 2022-11-21 | * Date: 2022-11-21 | ||
* Author: George Peter Banyard, < | * Author: George Peter Banyard, < | ||
- | * Status: | + | * Status: |
- | * Target Version: PHP 8.3 | + | * Target Version: PHP 8.3, PHP 8.(3+x), and PHP 9.0 |
- | * Implementation: | + | * Implementation: |
* First Published at: [[http:// | * First Published at: [[http:// | ||
Line 14: | Line 14: | ||
(([[rfc: | (([[rfc: | ||
(([[rfc: | (([[rfc: | ||
- | have been made to improve the behaviour of these operators. | + | have been made to improve the behaviour of these operators, but none have been implemented. |
- | But none have been implemented. | + | The goal of this RFC is to normalize the behaviour of '' |
- | We will first detail the current behaviour of the two operators, | + | Therefore, we will first look at the behaviour of arithmetic operators with various types, then detail the current behaviour of the increment and decrement |
- | ==== Current behaviour ==== | + | ==== Behaviour of arithmetic operators |
- | If the value is of type '' | + | Arithmetic operators perform a numeric |
- | If the value is of type '' | + | < |
+ | In this context if either operand | ||
+ | </ | ||
- | If the value is of type '' | ||
- | If the value is of type '' | + | The following types (other than int and float) are considered interpretable as int/float: |
- | If the value is of type '' | + | * '' |
+ | * '' | ||
+ | * '' | ||
- | If the value is of type '' | + | < |
+ | var_dump(null + 1); // int(1) | ||
+ | var_dump(null - 1); // int(-1) | ||
- | * If the string is numeric, then a standard numeric type cast is performed, and the '' | + | var_dump(false + 1); // int(1) |
- | * Otherwise the string is a non-numeric string: | + | var_dump(false |
- | * If the decrement operator is used no action is performed, | + | |
- | * Else, the increment operator is used and a PERL alphanumeric string increment is performed. | + | |
+ | var_dump(true + 1); // int(2) | ||
+ | var_dump(true - 1); // int(0) | ||
+ | |||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | var_dump(" | ||
+ | </ | ||
+ | |||
+ | Resources, non-numeric strings, arrays (except when adding two arrays together), and objects that are instances of userland classes throw a '' | ||
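As a minimal sketch of the rule above, assuming PHP 8.0 or later, the following probes which operand types the ''+'' operator accepts. The helper name ''describe_addition()'' is hypothetical and only serves this illustration; it is not part of the RFC.

<PHP>
<?php
// Hypothetical helper: evaluate "$value + 1" and report either the type of
// the result or the fact that a TypeError was thrown.
function describe_addition(mixed $value): string
{
    try {
        return get_debug_type($value + 1);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

var_dump(describe_addition(null));           // null is interpreted as 0
var_dump(describe_addition(true));           // true is interpreted as 1
var_dump(describe_addition("5.5"));          // numeric string, float result
var_dump(describe_addition("abc"));          // non-numeric string is rejected
var_dump(describe_addition([]));             // array + int is rejected
var_dump(describe_addition(new stdClass())); // userland object is rejected
</PHP>

Catching the ''TypeError'' inside the helper keeps the probe side-effect free, so it can be called in a loop over arbitrary values.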
+ | |||
+ | Object values that are instances of an internal class that overloads the arithmetic operator (by implementing the '' | ||
+ | If an internal class implements a custom '' | ||
+ | Otherwise, a '' | ||
+ | |||
+ | One example of an internal class that implements a '' | ||
+ | <PHP> | ||
+ | $o = gmp_init(36); | ||
+ | var_dump($o + 1); | ||
+ | /* | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | */ | ||
+ | </ | ||
+ | |||
+ | The only examples of an internal class that does not implement a '' | ||
+ | <PHP> | ||
+ | $o = tidy_parse_string("< | ||
+ | var_dump($o + 1); // int(1) | ||
+ | </ | ||
+ | |||
+ | |||
+ | Note: the empty string has **// | ||
+ | |||
+ | Note: If an internal class implements a custom '' | ||
+ | <PHP> | ||
+ | $o = curl_init(); | ||
+ | var_dump((int) $o); // e.g. int(1) | ||
+ | var_dump($o + 1); // Fatal error: Uncaught TypeError: Unsupported operand types: CurlHandle + int | ||
+ | </ | ||
+ | |||
+ | ==== Current behaviour of the increment and decrement operators ==== | ||
+ | |||
+ | The current behaviour of these operators is rather complex and depends on which operator is used with which type. First, we will describe the common behaviour between both operators: | ||
+ | |||
+ | * the value is of type '' | ||
+ | * the value is of type '' | ||
+ | * the value is of type '' | ||
+ | * the value is of type '' | ||
+ | |||
+ | <PHP> | ||
+ | $int = 10; | ||
+ | var_dump(++$int); | ||
+ | $int = 10; | ||
+ | var_dump(--$int); | ||
+ | |||
+ | $float = 5.7; | ||
+ | var_dump(++$float); | ||
+ | $float = 5.7; | ||
+ | var_dump(--$float); | ||
+ | |||
+ | $false = false; | ||
+ | var_dump(++$false); | ||
+ | var_dump(--$false); | ||
+ | $true = true; | ||
+ | var_dump(++$true); | ||
+ | var_dump(--$true); | ||
+ | |||
+ | $stringInt = " | ||
+ | var_dump(++$stringInt); | ||
+ | var_dump(--$stringInt); | ||
+ | $stringFloat = " | ||
+ | var_dump(++$stringFloat); | ||
+ | var_dump(--$stringFloat); | ||
+ | </ | ||
+ | |||
+ | Object values that are instances of an internal class that overloads the arithmetic operator (by implementing the '' | ||
+ | <PHP> | ||
+ | $o = gmp_init(36); | ||
+ | var_dump(++$o); | ||
+ | /* | ||
+ | object(GMP)# | ||
+ | [" | ||
+ | string(2) " | ||
+ | } | ||
+ | */ | ||
+ | |||
+ | $o = tidy_parse_string("< | ||
+ | var_dump(++$o); | ||
+ | </ | ||
+ | |||
+ | For non-numeric '' | ||
+ | |||
+ | === Current behaviour of the decrement operator with values of type null and non-numeric string === | ||
+ | |||
+ | If the value is of type '' | ||
+ | |||
+ | If the value is a non-numeric '' | ||
+ | |||
+ | <PHP> | ||
+ | $n = null; | ||
+ | --$n; | ||
+ | var_dump($n); | ||
+ | |||
+ | $s = " | ||
+ | --$s; | ||
+ | var_dump($s); | ||
+ | |||
+ | $e = ""; | ||
+ | --$e; | ||
+ | var_dump($e); | ||
+ | </ | ||
+ | |||
+ | === Current behaviour of the increment operator with values of type null and non-numeric string === | ||
+ | |||
+ | If the value is of type '' | ||
+ | |||
+ | If the value is a non-numeric '' | ||
+ | |||
+ | <PHP> | ||
+ | $n = null; | ||
+ | ++$n; | ||
+ | var_dump($n); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; | ||
+ | var_dump($s); | ||
+ | |||
+ | $e = ""; | ||
+ | ++$e; | ||
+ | var_dump($e); | ||
+ | </ | ||
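The asymmetry described in the two subsections above can be condensed into one short sketch. It assumes the semantics of PHP 8.3 or earlier; on PHP 8.3 some of these statements additionally emit a warning or a deprecation notice. It contrasts the operators with the arithmetic expression ''null - 1''.

<PHP>
<?php
$n = null;
--$n;                  // no effect: $n stays null
$viaMinus = null - 1;  // arithmetic interprets null as 0, giving int(-1)

$m = null;
++$m;                  // the increment operator does act on null: int(1)

$e = "";
--$e;                  // the empty string is cast to 0, then decremented: int(-1)

$f = "";
++$f;                  // the empty string becomes the *string* "1"

var_dump($n, $viaMinus, $m, $e, $f);
</PHP>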
- | Note: this means that the behaviour around the empty string differs between both operators. Because for '' | + | Note: this means that the behaviour around the empty string differs between both operators. Because for '' |
<PHP> | <PHP> | ||
Line 53: | Line 190: | ||
*/ | */ | ||
</ | </ | ||
+ | |||
+ | === Details about the PERL String increment feature === | ||
+ | |||
+ | If the string to increment is the empty string, return the string ''" | ||
+ | |||
+ | Otherwise, the last byte of the string is inspected: | ||
+ | * If it is in-between " | ||
+ | * If it is " | ||
+ | * Otherwise, do nothing. | ||
+ | |||
+ | If, and only if, a carry value is still held after the first byte of the string has been inspected, the string is prepended with the character " | ||
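The rules above can be approximated in userland. The following is a hedged sketch, not the engine's implementation: the function name ''perl_increment_sketch()'' is hypothetical, and the sketch deliberately ignores numeric strings, which the ''++'' operator converts to int or float before this logic is ever reached.

<PHP>
<?php
// Hypothetical userland approximation of the PERL-style string increment.
function perl_increment_sketch(string $s): string
{
    if ($s === '') {
        return '1';
    }

    $carry = false;
    $prefix = '';
    // Walk the bytes from right to left, as long as a carry is pending.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $c = $s[$i];
        if ($c === 'z') {
            $s[$i] = 'a'; $prefix = 'a'; $carry = true;
        } elseif ($c === 'Z') {
            $s[$i] = 'A'; $prefix = 'A'; $carry = true;
        } elseif ($c === '9') {
            $s[$i] = '0'; $prefix = '1'; $carry = true;
        } elseif (preg_match('/\A[a-yA-Y0-8]\z/', $c) === 1) {
            // Plain case: bump the byte and stop.
            $s[$i] = chr(ord($c) + 1);
            $carry = false;
        } else {
            // Any other byte (whitespace, punctuation, non-ASCII): do nothing.
            $carry = false;
        }
        if (!$carry) {
            break;
        }
    }

    // A carry left over after the first byte grows the string by one byte.
    return $carry ? $prefix . $s : $s;
}

var_dump(perl_increment_sketch("Az")); // string(2) "Ba"
var_dump(perl_increment_sketch("Zz")); // string(3) "AAa"
var_dump(perl_increment_sketch("a9")); // string(2) "b0"
var_dump(perl_increment_sketch("Z ")); // string(2) "Z " (unchanged)
</PHP>

Tracking the prefix per byte reflects that the prepended character depends on the type of the first byte that overflowed last: a digit yields "1", a letter yields "a" or "A".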
+ | |||
+ | Here are a couple of examples demonstrating these rules: | ||
+ | <PHP> | ||
+ | <?php | ||
+ | |||
+ | // Empty string | ||
+ | $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With multiple period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | The behaviour is slightly different from that of [[https:// | ||
+ | |||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | # Empty string | ||
+ | my $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With multiple period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | However, the biggest problem is with strings that can be interpreted as a number in scientific notation, because they will never be interpreted as an alphanumeric string to be incremented using the PERL increment feature, but converted to float first: | ||
+ | <PHP> | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | While Raku also supports arithmetic operations with strings that represent numbers in scientific notation, it does not perform any type juggling at all for the increment and decrement operators (therefore having the same behaviour as currently for boolean and its corresponding '' | ||
+ | |||
+ | Therefore the above snippet in Raku gives a consistent result: | ||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | my $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | ===== Summary of behavioural differences ===== | ||
+ | |||
+ | | | ||
+ | ^ '' | ||
+ | ^ '' | ||
+ | ^ '' | ||
+ | ^ ''""'' | ||
+ | ^ ''" | ||
+ | ^ Tidy Object | '' | ||
===== Proposal ===== | ===== Proposal ===== | ||
- | The proposal is to create a path which creates awareness around | + | The proposal is to create a path so that in the next major version |
- | If the value is of type '' | + | To achieve this, we propose |
- | * Emit an <php>E_WARNING</ | + | * Add the <php>str_increment()</ |
+ | * Add support to increment/ | ||
+ | < | ||
+ | $o = tidy_parse_string("< | ||
+ | var_dump(++$o); | ||
+ | </ | ||
- | If the value is of type '' | + | * to emit < |
+ | < | ||
+ | $n = null; | ||
+ | --$n; // Warning: Decrement on type null has no effect, this will change in the next major version of PHP | ||
+ | var_dump($n); | ||
- | * Add support to the decrement operator. This would cast the value to the integer '' | + | $false = false; |
+ | --$false; // Warning: Decrement on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($false); | ||
+ | ++$false; // Warning: Increment on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($false); | ||
- | If the value is of type '' | + | $true = true; |
+ | --$true; // Warning: Decrement on type bool has no effect, this will change in the next major version | ||
+ | var_dump($true); | ||
+ | ++$true; // Warning: Increment on type bool has no effect, this will change in the next major version of PHP | ||
+ | var_dump($true); | ||
+ | </ | ||
- | * Emit an < | ||
- | * Deprecate PERL alphanumeric string increments by emitting an < | ||
- | * Deprecate using the decrement operator on an empty string, to align the behaviour with the PERL alphanumeric string increment deprecation, | ||
- | ==== Proposal Addendum ==== | + | * Deprecate using the decrement operator with non-numeric strings. |
+ | < | ||
+ | $empty | ||
+ | --$empty; // Deprecated: Decrement on empty string is deprecated as non-numeric | ||
+ | var_dump($empty); | ||
- | As the behaviour around these operators and values of type '' | + | $s = " |
- | text in the section describing the changes to the '' | + | --$s; // Deprecated: Decrement on non-numeric string has no effect and is deprecated |
+ | var_dump($s); | ||
+ | </ | ||
- | | + | * Deprecate using the increment operator |
- | | + | <PHP> |
+ | $empty = ""; | ||
+ | ++$empty; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($empty); | ||
+ | $s = " | ||
+ | ++$s; // No Deprecation | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = "Z "; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " Z"; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
+ | |||
+ | In a follow-up minor version of PHP the following changes will take place: | ||
+ | * Deprecate using the increment operator with non-numeric strings. | ||
+ | <PHP> | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-numeric string is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
+ | |||
+ | In the next major version of PHP the following changes will take place: | ||
+ | * Values of type '' | ||
+ | * Non-numeric string values throw a '' | ||
+ | |||
+ | ==== Semantics of str_increment() and str_decrement() ==== | ||
+ | |||
+ | The signatures of the functions are: | ||
+ | <PHP> | ||
+ | function str_increment(string $string): string {} | ||
+ | function str_decrement(string $string): string {} | ||
+ | </ | ||
+ | |||
+ | If < | ||
+ | |||
+ | If decrementing < | ||
+ | |||
+ | As those functions would not perform any type juggling, strings that can be interpreted as numbers in scientific notation will not be implicitly converted to float. | ||
+ | |||
+ | <PHP> | ||
+ | $s = " | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | </ | ||
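A hedged usage sketch of the two functions follows. It assumes PHP 8.3 or later, where ''str_increment()'' and ''str_decrement()'' are available; the wrapper ''try_str_op()'' is a hypothetical helper that turns a thrown ''ValueError'' into a marker string so that all cases fit in one listing.

<PHP>
<?php
// Hypothetical wrapper: apply the given string function, reporting a
// ValueError as the literal string 'ValueError' instead of letting it escape.
function try_str_op(callable $op, string $input): string
{
    try {
        return $op($input);
    } catch (ValueError $e) {
        return 'ValueError';
    }
}

$a = try_str_op('str_increment', 'Az');  // ordinary carry between cases
$b = try_str_op('str_increment', '5e0'); // no conversion to float
$c = try_str_op('str_decrement', 'Ba');  // borrow from the first byte
$d = try_str_op('str_increment', '');    // empty input is rejected
$e = try_str_op('str_decrement', '0');   // cannot be decremented further

var_dump($a, $b, $c, $d, $e);
</PHP>

In real code the ''ValueError'' would normally be left to propagate or be handled where the invalid input originates; the wrapper exists only to keep the demonstration compact.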
+ | |||
+ | ==== Cost/ | ||
+ | |||
+ | PHP currently has 6 main and 4 operation-specific type juggling contexts. | ||
+ | The main 6 are documented in the userland manual on the [[https:// | ||
+ | * Numeric | ||
+ | * String | ||
+ | * Logical | ||
+ | * Integral and string | ||
+ | * Comparative | ||
+ | * Function | ||
+ | |||
+ | The 4 operation-specific contexts are: | ||
+ | * Increment/ | ||
+ | * String offsets | ||
+ | * Array offsets | ||
+ | * < | ||
+ | |||
+ | With the semantics proposed in this RFC the increment/ | ||
+ | |||
+ | The drawback of this approach is the deprecation, | ||
+ | However, the issues around strings that can be interpreted in scientific notation, the fact it only properly supports strings that consist only of the ASCII alphanumeric characters ('' | ||
+ | and adding support for string decrements was previously [[rfc: | ||
+ | makes us believe the current semantics of the string increment feature are unsound. | ||
+ | |||
+ | Therefore, we consider the value of reducing the semantic complexity of PHP higher than keeping support for this feature in its current form. | ||
+ | The introduction of the < | ||
+ | <PHP> | ||
+ | function str_increment_polyfill(string $s): string { | ||
+ | if (is_numeric($s)) { | ||
+ | $offset = stripos($s, ' | ||
+ | if ($offset !== false) { | ||
+ | /* Using increment operator would cast the string to float | ||
+ | * Therefore we manually increment it to convert it to an " | ||
+ | $c = $s[$offset]; | ||
+ | $c++; | ||
+ | $s[$offset] = $c; | ||
+ | $s++; | ||
+ | $s[$offset] = match ($s[$offset]) { | ||
+ | ' | ||
+ | ' | ||
+ | ' | ||
+ | ' | ||
+ | }; | ||
+ | return $s; | ||
+ | } | ||
+ | } | ||
+ | return ++$s; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | ==== Impact of deprecating the PERL string increment feature on userland ==== | ||
+ | |||
+ | To determine the impact of this RFC on userland, the static analysis tool [[https:// | ||
+ | |||
+ | The only non-false-positive use cases using the PERL string increment feature are: | ||
+ | |||
+ | * Generating a list of valid Unicode (or ASCII) characters. The most popular project using this is HTMLPurifier, | ||
+ | * Generating sequential IDs. The main library doing this is amphp/amp; however, a lot of other projects depend on this library. | ||
+ | * Incrementing a spreadsheet column. | ||
+ | |||
+ | In none of these cases would a deprecation notice be emitted in the first stage of this RFC. | ||
+ | As the first stage of this RFC also provides the < | ||
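For the spreadsheet-column use case listed above, a migration could look like the following sketch. The helper ''next_column()'' is hypothetical; it assumes column names consist only of the upper-case letters A to Z, and it falls back to the ''++'' operator on PHP versions that do not provide ''str_increment()''.

<PHP>
<?php
// Hypothetical helper: return the spreadsheet column that follows $column.
function next_column(string $column): string
{
    if (preg_match('/\A[A-Z]+\z/', $column) !== 1) {
        throw new InvalidArgumentException('Column names must match [A-Z]+');
    }

    if (function_exists('str_increment')) {
        return str_increment($column); // PHP 8.3 and later
    }

    // Older versions: the operator is safe here because a purely
    // alphabetic string can never be a numeric string.
    return ++$column;
}

var_dump(next_column('A'));  // string(1) "B"
var_dump(next_column('Z'));  // string(2) "AA"
var_dump(next_column('AZ')); // string(2) "BA"
</PHP>

Validating the input up front means both code paths see the same set of strings, so the result does not depend on which PHP version runs the helper.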
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | The backwards incompatible | + | Using the increment/ |
+ | |||
+ | The string increment feature. | ||
+ | |||
+ | The changes | ||
+ | |||
+ | ===== Future Scope ===== | ||
+ | |||
+ | One possible future scope is to add support to both arithmetic operations and the increment/ | ||
- | The changes that introduce an <php>E_WARNING</ | + | One other possible extension is to add a <php>$step</ |
===== Proposed PHP Version ===== | ===== Proposed PHP Version ===== | ||
- | Next minor version, i.e. PHP 8.3. | + | Next minor version, i.e. PHP 8.3.0, follow-up minor version, e.g. PHP 8.4.0, and next major version, i.e. PHP 9.0.0. |
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
Line 95: | Line 585: | ||
As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | ||
- | Voting started on 2022-XX-XX and will end on 2022-XX-XX. | + | Voting started on 2023-06-28 and will end on 2023-07-12. |
<doodle title=" | <doodle title=" | ||
- | * Yes | ||
- | * No | ||
- | </ | ||
- | |||
- | The addendum to this proposal will also require a 2/3 majority to be accepted. | ||
- | |||
- | Voting started on 2022-XX-XX and will end on 2022-XX-XX. | ||
- | <doodle title=" | ||
* Yes | * Yes | ||
* No | * No | ||
Line 111: | Line 593: | ||
===== Implementation ===== | ===== Implementation ===== | ||
- | GitHub pull request: https:// | + | GitHub pull request: |
After the project is implemented, | After the project is implemented, | ||
- | * the version(s) it was merged into | + | * Version: PHP 8.3 |
- | * a link to the git commit(s) | + | * Implementation: | ||
* a link to the PHP manual entry for the feature | * a link to the PHP manual entry for the feature | ||
===== References ===== | ===== References ===== | ||
rfc/saner-inc-dec-operators.1669977388.txt.gz · Last modified: 2022/12/02 10:36 by girgias