rfc:binary_string_comparison
Differences
This shows you the differences between two versions of the page.
rfc:binary_string_comparison [2014/07/31 22:10] – mabe | rfc:binary_string_comparison [2017/09/22 13:28] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Binary String Comparison ====== | ====== PHP RFC: Binary String Comparison ====== | ||
* Version: 0.1 | * Version: 0.1 | ||
- | * Date: 2014-08-01 | + | * Date: 2014-08-01, internals on 2014-08-17 |
* Author: Marc Bennewitz, php@mabe.berlin | * Author: Marc Bennewitz, php@mabe.berlin | ||
- | * Status: | + | * Status: |
* First Published at: http:// | * First Published at: http:// | ||
- | This RFC proposes to change the behavior of non-strict string to string comparison | + | This RFC proposes to change the behavior of non-strict string to string comparison to be binary safe. |
===== Introduction ===== | ===== Introduction ===== | ||
- | In PHP on comparing two strings | + | In PHP on comparing two strings |
- | This behavior is documented but it' | + | The current |
- | This behavior leads to bugs that are very hard to find and makes it hard to tell what the code is doing. | + | The current behavior is largely unknown among PHP developers and newcomers because there is no numeric context. There is a note in the documentation, but it is not what most people would expect. |
+ | |||
+ | The current | ||
+ | |||
+ | Using strict string comparison works around this behavior, but it ends up requiring strict comparison everywhere, which makes non-strict comparison useless. Some structures, like the switch statement, cannot be used at all because they use non-strict comparison internally. | ||
+ | |||
+ | Since PHP 5.2.1 '' | ||
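The strict-comparison workaround described above can be sketched as follows. Because `switch` compares loosely, code that must tell numeric-looking strings apart falls back to chained `===` checks. This is only an illustration: the function name `classify` and its labels are hypothetical and not part of the RFC.

```php
<?php
// Hypothetical helper (not from the RFC): distinguishes strings that the
// loose == operator treats as the same number.
function classify(string $input): string
{
    // A switch ($input) with case '10' would use loose comparison, so the
    // input '1e1' would match case '10'. Chained === checks avoid that.
    if ($input === '10') {
        return 'ten';
    }
    if ($input === '1e1') {
        return 'scientific ten';
    }
    return 'other';
}

echo classify('1e1'), "\n"; // prints "scientific ten"
echo classify('10'), "\n";  // prints "ten"
```

The cost is visible even in this small case: every branch has to be spelled out by hand, which is the inconvenience the RFC argues against.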
===== Proposal ===== | ===== Proposal ===== | ||
- | This RFC proposes to change the behavior of non-strict string to string comparison | + | This RFC proposes to change the behavior of non-strict string to string comparison to be binary safe (as the strict comparison operator does). |
+ | |||
+ | When comparing two numeric strings, the operands are equal if their string representations are identical. | ||
+ | When comparing two numeric strings, the first operand is greater if its first non-matching byte is higher. | ||
+ | When comparing two numeric strings, the first operand is lower if its first non-matching byte is lower. | ||
+ | |||
+ | As a side effect, string comparison becomes much faster. Developers are forced to write what they really mean (no guessing) and to cast or filter input once, which also helps performance. | ||
- | On comparing two numeric strings both operands will be equal ONLY if the string representation | + | On C-Level |
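The byte-wise rules above can be approximated in userland with the existing `strcmp()` function, which is binary safe. The sketch below is an assumption about how the proposed semantics behave, not the RFC's actual C-level change; the name `binary_compare` is hypothetical.

```php
<?php
// Hypothetical userland approximation of the proposed comparison:
// returns -1, 0 or 1 based purely on the bytes of both strings.
function binary_compare(string $a, string $b): int
{
    // strcmp() compares byte by byte; only its sign is meaningful across
    // PHP versions, so it is normalized with the spaceship operator.
    return strcmp($a, $b) <=> 0;
}

var_dump(binary_compare('abc', 'abc')); // int(0): identical representation
var_dump(binary_compare('1e1', '10'));  // int(1): 'e' is a higher byte than '0'
var_dump(binary_compare('10', '9'));    // int(-1): '1' is a lower byte than '9'
```

The last case shows the practical consequence: under byte-wise ordering `'10'` sorts before `'9'`, so code that needs numeric ordering has to cast its operands first.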
- | Example | + | === string == string === |
+ | (http:// | ||
<?php | <?php | ||
- | echo "(' | + | echo (' |
- | echo " | + | echo ('2' == ' |
- | echo "('1e1' | + | echo ('0' == '0x0' ? ' |
- | echo "(' | + | echo ('0' == '00' ? ' |
- | echo "('1e-1' | + | echo (' |
- | echo "('1E-1' | + | echo (' |
- | echo "('+1' | + | echo (' |
- | echo " | + | echo ('1E-1' == '0.1' ? ' |
- | | + | echo (' |
- | echo " | + | echo ('+0' == '-0' ? ' |
- | echo " | + | echo ('0.99999999999999994' |
- | echo "(\"1\\n\" | + | echo ('0.99999999999999995' == '1' ? ' |
+ | echo (" | ||
+ | echo (" | ||
+ | |||
+ | Current Behavior (handle both strings as numbers): | ||
+ | |||
+ | true (' | ||
+ | false (' | ||
+ | true (' | ||
+ | true (' | ||
+ | true (' | ||
+ | true (' | ||
+ | true ('1e-1' == '0.1') | ||
+ | true (' | ||
+ | true (' | ||
+ | true (' | ||
+ | false ('0.99999999999999994' | ||
+ | true ('0.99999999999999995' | ||
+ | true ("\n1" | ||
+ | | ||
+ | |||
+ | Changed Behavior (handle both strings as binary): | ||
+ | |||
+ | true (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (" | ||
+ | false (" | ||
+ | |||
+ | === string > string | string >= string | string < string | string <= string === | ||
+ | (http:// | ||
+ | |||
+ | <?php | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo ('1E1' <= ' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | echo (' | ||
+ | |||
+ | Current Behavior (handle both strings as numbers): | ||
+ | |||
+ | false (' | ||
+ | true (' | ||
+ | true (' | ||
+ | true (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | false (' | ||
+ | true (' | ||
+ | true (' | ||
+ | |||
+ | Changed Behavior (handle both strings as binary): | ||
+ | |||
+ | false (' | ||
+ | true (' | ||
+ | false (' | ||
+ | false (' | ||
+ | true (' | ||
+ | true (' | ||
+ | true ('0.99999999999999995' | ||
+ | false (' | ||
+ | true (' | ||
+ | false (' | ||
+ | |||
+ | === binary marked strings (since PHP 5.2.1) === | ||
+ | (http:// | ||
+ | |||
+ | <?php | ||
+ | var_dump((binary)'1e1' | ||
+ | var_dump(b' | ||
+ | |||
+ | Current Behavior (binary marked strings will be handled numerically): | ||
+ | |||
+ | bool(true) | ||
+ | bool(true) | ||
+ | |||
+ | Changed Behavior (all strings will be handled binary without a context): | ||
+ | |||
+ | bool(false) | ||
+ | bool(false) | ||
+ | |||
+ | === sorting of strings === | ||
+ | (http:// | ||
+ | |||
+ | <?php | ||
+ | |||
+ | $arr = array(' | ||
+ | |||
+ | echo "Sort regular:\n"; | ||
+ | sort($arr); | ||
+ | var_dump($arr); | ||
+ | |||
+ | echo "Sort numeric:\n"; | ||
+ | sort($arr, SORT_NUMERIC); | ||
+ | var_dump($arr); | ||
+ | |||
+ | echo "Sort binary:\n"; | ||
+ | sort($arr, SORT_STRING); | ||
+ | var_dump($arr); | ||
Current Behavior: | Current Behavior: | ||
- | (' | + | |
- | ('2' | + | array(6) { |
- | (' | + | [0] => |
- | (' | + | |
- | (' | + | [1] => |
- | (' | + | string(1) " |
- | ('+1' | + | [2] => |
- | (' | + | |
- | (' | + | [3] => |
- | ('0.99999999999999995' | + | int(2) |
- | ("\n1") == ' | + | [4] => |
- | ("1\n") == ' | + | int(3) |
+ | [5] => | ||
+ | string(2) " | ||
+ | } | ||
+ | Sort numeric: | ||
+ | | ||
+ | [0] => | ||
+ | | ||
+ | [1] => | ||
+ | string(1) " | ||
+ | [2] => | ||
+ | int(2) | ||
+ | [3] => | ||
+ | | ||
+ | [4] => | ||
+ | string(2) " | ||
+ | [5] => | ||
+ | int(3) | ||
+ | | ||
+ | Sort binary: | ||
+ | array(6) { | ||
+ | [0] => | ||
+ | string(2) " | ||
+ | [1] => | ||
+ | | ||
+ | [2] => | ||
+ | string(2) " | ||
+ | [3] => | ||
+ | | ||
+ | [4] => | ||
+ | int(2) | ||
+ | [5] => | ||
+ | int(3) | ||
+ | } | ||
Changed Behavior: | Changed Behavior: | ||
- | (' | + | |
- | ('2' | + | array(6) { |
- | (' | + | [0]=> |
- | (' | + | string(2) " |
- | (' | + | [1]=> |
- | (' | + | |
- | ('+1' | + | [2]=> |
- | (' | + | string(1) " |
- | (' | + | [3]=> |
- | ('0.99999999999999995' | + | int(2) |
- | ("\n1") == ' | + | [4]=> |
- | ("1\n") == ' | + | int(3) |
+ | [5]=> | ||
+ | | ||
+ | | ||
+ | Sort numeric: | ||
+ | array(6) { | ||
+ | [0]=> | ||
+ | | ||
+ | [1]=> | ||
+ | string(1) " | ||
+ | [2]=> | ||
+ | | ||
+ | [3]=> | ||
+ | int(2) | ||
+ | [4]=> | ||
+ | string(2) " | ||
+ | [5]=> | ||
+ | int(3) | ||
+ | | ||
+ | Sort binary: | ||
+ | array(6) { | ||
+ | [0]=> | ||
+ | string(2) " | ||
+ | [1]=> | ||
+ | | ||
+ | [2]=> | ||
+ | string(2) " | ||
+ | [3]=> | ||
+ | | ||
+ | [4]=> | ||
+ | int(2) | ||
+ | [5]=> | ||
+ | int(3) | ||
+ | } | ||
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | Existing code that relies on the current behavior will only produce the originally expected result if the string representation is the same. This can be easily resolved by explicitly casting one of the operands to an integer or float. | + | Existing code that relies on the current behavior |
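The explicit-cast migration mentioned in the earlier revision might look like the sketch below. The helper name `numeric_equal` is hypothetical, and the fallback to `===` for non-numeric input is an assumption about what migrating code would want, not something the RFC prescribes.

```php
<?php
// Hypothetical migration helper: states the numeric intent explicitly
// instead of relying on == to detect numeric strings.
function numeric_equal(string $a, string $b): bool
{
    // Non-numeric input is compared by exact representation.
    if (!is_numeric($a) || !is_numeric($b)) {
        return $a === $b;
    }
    // Both operands are cast, so the comparison is numeric by construction.
    return (float) $a == (float) $b;
}

var_dump(numeric_equal('1e1', '10')); // bool(true)
var_dump(numeric_equal('+1', '1'));   // bool(true)
var_dump('1e1' === '10');             // bool(false)
```

Casting to float loses precision for very large integers, so code that handles such values would need an integer or arbitrary-precision comparison instead.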
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
As this is a backwards-incompatible change, this RFC targets PHP.next. | As this is a backwards-incompatible change, this RFC targets PHP.next. | ||
- | ===== Open Issues ===== | + | ===== Affected |
- | How to note behavior change? | + | |
- | ... Is it enough to note it in the change-log or is it possible to trigger a E_DEPRECATED/ | + | |
- | + | ||
- | ===== Unaffected | + | |
Only non-strict string to string comparison will be affected. | Only non-strict string to string comparison will be affected. | ||
- | + | Means the operators '' | |
- | ===== Future Scope ===== | + | |
- | This section details areas where the feature might be improved in the future but that are not currently proposed in this RFC. | + |
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
Line 89: | Line 275: | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
- | Coming soon | + | https:// |
===== Implementation ===== | ===== Implementation ===== | ||
Line 98: | Line 284: | ||
===== References ===== | ===== References ===== | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * http:// | ||
+ | * https:// | ||
===== Rejected Features ===== | ===== Rejected Features ===== | ||
None so far. | None so far. |
rfc/binary_string_comparison.1406844612.txt.gz · Last modified: 2017/09/22 13:28 (external edit)