rfc:saner-inc-dec-operators
Differences
This shows you the differences between two versions of the page.
rfc:saner-inc-dec-operators [2023/01/17 01:53] – Add summary table + PERL increment example girgias | rfc:saner-inc-dec-operators [2023/07/17 14:52] (current) – Implemented girgias | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Path to Saner Increment/ | ====== PHP RFC: Path to Saner Increment/ | ||
- | * Version: 0.2 | + | * Version: 0.3 |
* Date: 2022-11-21 | * Date: 2022-11-21 | ||
* Author: George Peter Banyard, < | * Author: George Peter Banyard, < | ||
- | * Status: | + | * Status: |
- | * Target Version: PHP 8.3 and PHP 9.0 | + | * Target Version: PHP 8.3, PHP 8.(3+x), |
- | * Implementation: | + | * Implementation: |
* First Published at: [[http:// | * First Published at: [[http:// | ||
Line 68: | Line 68: | ||
</ | </ | ||
- | The only examples of an internal class that does not implements | + | The only examples of an internal class that does not implement |
<PHP> | <PHP> | ||
$o = tidy_parse_string("< | $o = tidy_parse_string("< | ||
Line 134: | Line 134: | ||
</ | </ | ||
- | For non-numeric '' | + | For non-numeric '' |
=== Current behaviour of the decrement operator with values of type null and non-numeric string === | === Current behaviour of the decrement operator with values of type null and non-numeric string === | ||
Line 190: | Line 190: | ||
*/ | */ | ||
</ | </ | ||
+ | |||
+ | === Details about the PERL String increment feature === | ||
+ | |||
+ | If the string to increment is the empty string, return the string ''" | ||
+ | |||
+ | Otherwise, the last byte of the string is inspected: | ||
+ | * If it is in-between " | ||
+ | * If it is " | ||
+ | * Otherwise, do nothing. | ||
+ | |||
+ | If, and only if, a carry value is still held after the first byte of the string has been inspected, the string is prepended with the character " | ||
+ | |||
+ | Here are a couple of examples demonstrating these rules: | ||
+ | <PHP> | ||
+ | <?php | ||
+ | |||
+ | // Empty string | ||
+ | $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | // With multiple periods | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
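The byte-by-byte rules described above can be modelled in userland PHP. This is a minimal sketch, not the engine's implementation: the function name `perl_style_increment` is hypothetical, and it deliberately skips the numeric-string check that `++` performs first, so it only models the alphanumeric carry logic.

```php
<?php

/**
 * Hypothetical userland model of the PERL-style string increment rules.
 * Assumption: numeric strings are NOT special-cased here, unlike with ++.
 */
function perl_style_increment(string $s): string
{
    // The empty string becomes "1".
    if ($s === '') {
        return '1';
    }

    $carryPrefix = ''; // character to prepend if a carry survives the first byte

    // Walk the string from the last byte towards the first.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $byte = ord($s[$i]);

        if ($byte >= ord('a') && $byte <= ord('z')) {
            [$first, $last, $prefix] = ['a', 'z', 'a'];
        } elseif ($byte >= ord('A') && $byte <= ord('Z')) {
            [$first, $last, $prefix] = ['A', 'Z', 'A'];
        } elseif ($byte >= ord('0') && $byte <= ord('9')) {
            [$first, $last, $prefix] = ['0', '9', '1'];
        } else {
            // Any other byte: do nothing and drop any pending carry.
            return $s;
        }

        if ($s[$i] !== $last) {
            // No wrap-around needed: bump the byte and stop.
            $s[$i] = chr($byte + 1);
            return $s;
        }

        // Wrap around ("z" => "a", "Z" => "A", "9" => "0") and carry on.
        $s[$i] = $first;
        $carryPrefix = $prefix;
    }

    // A carry is still held after the first byte was inspected.
    return $carryPrefix . $s;
}

var_dump(perl_style_increment("Az")); // string(2) "Ba"
var_dump(perl_style_increment("Zz")); // string(3) "AAa"
var_dump(perl_style_increment(" Z")); // string(2) " A"
```

The early `return` in the `else` branch is what makes trailing whitespace a no-op, while leading or in-between whitespace merely swallows the carry.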
+ | |||
+ | The behaviour is slightly different from that of [[https:// | ||
+ | |||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | # Empty string | ||
+ | my $s = ""; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # String increments are unaware of being " | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values of different cases/types | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Carrying values until the beginning of the string | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Trailing whitespace | ||
+ | $s = "Z "; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Leading whitespace | ||
+ | $s = " Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Whitespace in-between | ||
+ | $s = "C Z"; | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # Non-ASCII characters | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With period | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | |||
+ | # With multiple periods | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | However, the biggest problem is with strings that can be interpreted as a number in scientific notation: they are never treated as alphanumeric strings to be incremented using the PERL increment feature, but are converted to float first: | ||
+ | <PHP> | ||
+ | $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
+ | |||
+ | While Raku also supports arithmetic operations with strings that represent numbers in scientific notation, it does not perform any type juggling at all for the increment and decrement operators (therefore having the same behaviour as currently for boolean and its corresponding '' | ||
+ | |||
+ | Therefore the above snippet in Raku gives a consistent result: | ||
+ | <code raku> | ||
+ | sub var_dump(Str $v) { | ||
+ | say ' | ||
+ | } | ||
+ | |||
+ | my $s = " | ||
+ | var_dump(++$s); | ||
+ | var_dump(++$s); | ||
+ | </ | ||
===== Summary of behavioural differences ===== | ===== Summary of behavioural differences ===== | ||
Line 203: | Line 383: | ||
===== Proposal ===== | ===== Proposal ===== | ||
- | The proposal is to create a path so that in the next major version of PHP the increment and decrement operators behave identically to adding/ | + | The proposal is to create a path so that in the next major version of PHP the increment and decrement operators behave identically to adding/ |
To achieve this, we propose the following changes to be made in the next minor version of PHP: | To achieve this, we propose the following changes to be made in the next minor version of PHP: | ||
+ | * Add the < | ||
* Add support to increment/ | * Add support to increment/ | ||
<PHP> | <PHP> | ||
Line 213: | Line 394: | ||
</ | </ | ||
- | * to emit < | + | * to emit < |
<PHP> | <PHP> | ||
$n = null; | $n = null; | ||
Line 233: | Line 414: | ||
- | * Deprecate using those operators | + | * Deprecate using the decrement operator |
<PHP> | <PHP> | ||
$empty = ""; | $empty = ""; | ||
Line 242: | Line 423: | ||
--$s; // Deprecated: Decrement on non-numeric string has no effect and is deprecated | --$s; // Deprecated: Decrement on non-numeric string has no effect and is deprecated | ||
var_dump($s); | var_dump($s); | ||
+ | </ | ||
+ | * Deprecate using the increment operator with strings that are not strictly alphanumeric. | ||
+ | <PHP> | ||
$empty = ""; | $empty = ""; | ||
- | ++$empty; // Deprecated: Increment on non-numeric | + | ++$empty; // Deprecated: Increment on non-alphanumeric | ||
var_dump($empty); | var_dump($empty); | ||
+ | $s = " | ||
+ | ++$s; // No Deprecation | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = "Z "; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " Z"; | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | // Non-ASCII characters | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | |||
+ | $s = " | ||
+ | ++$s; // Deprecated: Increment on non-alphanumeric string is deprecated | ||
+ | var_dump($s); | ||
+ | </ | ||
+ | |||
+ | In a follow-up minor version of PHP the following changes will take place: | ||
+ | * Deprecate using the increment operator with non-numeric strings. | ||
+ | <PHP> | ||
$s = " | $s = " | ||
++$s; // Deprecated: Increment on non-numeric string is deprecated | ++$s; // Deprecated: Increment on non-numeric string is deprecated | ||
Line 256: | Line 478: | ||
* Non-numeric string values throw a '' | * Non-numeric string values throw a '' | ||
+ | ==== Semantics of str_increment() and str_decrement() ==== | ||
+ | |||
+ | The signatures of the functions are: | ||
+ | <PHP> | ||
+ | function str_increment(string $string): string {} | ||
+ | function str_decrement(string $string): string {} | ||
+ | </ | ||
+ | |||
+ | If < | ||
+ | |||
+ | If decrementing < | ||
+ | |||
+ | As those functions do not perform any type juggling, strings that can be interpreted as numbers in scientific notation will not be implicitly converted to float. | ||
+ | |||
+ | <PHP> | ||
+ | $s = " | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | $s = str_increment($s); | ||
+ | var_dump($s); | ||
+ | </ | ||
==== Cost/ | ==== Cost/ | ||
- | PHP currently has 6 main and 3 operation-specific type juggling contexts. | + | PHP currently has 6 main and 4 operation-specific type juggling contexts. |
- | The main 6 are documented in the userland manual on the type juggling page and are as follows: | + | The main 6 are documented in the userland manual on the [[https:// |
* Numeric | * Numeric | ||
* String | * String | ||
Line 268: | Line 511: | ||
* Function | * Function | ||
- | The 3 operation-specific contexts are: | + | The 4 operation-specific contexts are: |
* Increment/ | * Increment/ | ||
* String offsets | * String offsets | ||
* Array offsets | * Array offsets | ||
+ | * < | ||
With the semantics proposed in this RFC the increment/ | With the semantics proposed in this RFC the increment/ | ||
- | The drawback of this approach is the deprecation, | + | The drawback of this approach is the deprecation, |
+ | However, the issues around strings that can be interpreted in scientific notation, the fact that it only properly supports strings composed solely of ASCII alphanumeric characters, | ||
+ | and adding support for string decrements was previously [[rfc: | ||
+ | make us believe the current semantics of the string increment feature are unsound. | ||
+ | Therefore, we consider the value of reducing the semantic complexity of PHP higher than keeping support for this feature in its current form. | ||
+ | The introduction of the < | ||
<PHP> | <PHP> | ||
- | $s = " | + | function str_increment_polyfill(string |
- | var_dump(++$s); // string(2) " | + | if (is_numeric($s)) { |
- | var_dump(++$s); // string(2) " | + | $offset |
- | $s = "a z"; | + | |
- | var_dump(++$s); // string(3) "a a" | + | |
- | var_dump(++$s); // string(3) "a b" | + | * Therefore we manually increment it to convert it to an "f"/"F" |
- | $s = "a9"; | + | $c = $s[$offset]; |
- | var_dump(++$s); // string(2) " | + | $c++; |
- | var_dump(++$s); // string(2) " | + | $s[$offset] |
- | $s = "a 9"; | + | $s++; |
- | var_dump(++$s); // string(3) "a 0" | + | |
- | var_dump(++$s); // string(3) "a 1" | + | ' |
- | $s = "a é"; | + | ' |
- | var_dump(++$s); // string(4) "a é" | + | ' |
- | var_dump(++$s); // string(4) "a é" | + | ' |
+ | }; | ||
+ | | ||
+ | } | ||
+ | } | ||
+ | return | ||
+ | } | ||
</ | </ | ||
- | Moreover, adding support for string | + | ==== Impact of deprecating the PERL string |
- | Therefore, we consider | + | To determine the impact of this RFC on userland, the static analysis tool [[https:// |
+ | |||
+ | The only non-false-positive use cases using the PERL string increment feature are: | ||
+ | |||
+ | * Generating a list of valid Unicode (or ASCII) characters. The most popular project using this is HTMLPurifier, which no longer does so as of [[https://github.com/ezyang/htmlpurifier/ | ||
+ | * Generating sequential IDs. The main library doing this is amphp/amp; however, many other projects depend on this library. | ||
+ | * Incrementing a spreadsheet column. | ||
+ | |||
+ | In any of these cases, no deprecation notices would be emitted | ||
+ | As the first stage of this RFC also provides the < | ||
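As a sketch of how the spreadsheet-column use case might migrate, the helper below prefers `str_increment()` where it exists (PHP 8.3 and later) and otherwise falls back to `++`. The name `next_column` and the strict A-Z validation are assumptions made for illustration; they are not part of the RFC.

```php
<?php

/**
 * Hypothetical helper: return the spreadsheet column label following $column.
 */
function next_column(string $column): string
{
    // Restrict input to upper-case ASCII letters so neither code path can
    // run into numeric-string juggling or non-alphanumeric input.
    if (preg_match('/^[A-Z]+\z/', $column) !== 1) {
        throw new InvalidArgumentException('Column labels must consist of A-Z only');
    }

    // Prefer the dedicated function when the running PHP version provides it.
    if (function_exists('str_increment')) {
        return str_increment($column);
    }

    // Older PHP versions: rely on the PERL-style string increment.
    ++$column;
    return $column;
}

var_dump(next_column("Z"));  // string(2) "AA"
var_dump(next_column("AZ")); // string(2) "BA"
```

Because the labels are purely alphabetic, both code paths are expected to return the same result, so the fallback can be dropped once the minimum supported PHP version provides `str_increment()`.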
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
Line 310: | Line 574: | ||
One possible future scope is to add support to both arithmetic operations and the increment/ | One possible future scope is to add support to both arithmetic operations and the increment/ | ||
+ | |||
+ | One other possible extension is to add a < | ||
===== Proposed PHP Version ===== | ===== Proposed PHP Version ===== | ||
- | Next minor version, i.e. PHP 8.3.0, and next major version, i.e. PHP 9.0.0. | + | Next minor version, i.e. PHP 8.3.0, follow-up minor version, e.g. PHP 8.4.0, and next major version, i.e. PHP 9.0.0. |
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
Line 319: | Line 585: | ||
As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted. | ||
- | Voting started on 2023-XX-XX and will end on 2023-XX-XX. | + | Voting started on 2023-06-28 and will end on 2023-07-12. |
<doodle title=" | <doodle title=" | ||
* Yes | * Yes | ||
Line 327: | Line 593: | ||
===== Implementation ===== | ===== Implementation ===== | ||
- | GitHub pull request: https:// | + | GitHub pull request: |
After the project is implemented, | After the project is implemented, | ||
- | * the version(s) it was merged into | + | * Version: PHP 8.3 |
- | * a link to the git commit(s) | + | * Implementation : |
* a link to the PHP manual entry for the feature | * a link to the PHP manual entry for the feature | ||
===== References ===== | ===== References ===== | ||
rfc/saner-inc-dec-operators.1673920410.txt.gz · Last modified: 2023/01/17 01:53 by girgias