rfc:string_to_number_comparison

Differences

This shows you the differences between two versions of the page.

rfc:string_to_number_comparison [2019/02/26 12:13] – Add PR link nikic
rfc:string_to_number_comparison [2020/07/31 12:55] (current) nikic
Line 2: Line 2:
   * Date: 2019-02-26   * Date: 2019-02-26
   * Author: Nikita Popov <nikic@php.net>   * Author: Nikita Popov <nikic@php.net>
-  * Status: Under Discussion+  * Status: Implemented
   * Target Version: PHP 8.0   * Target Version: PHP 8.0
   * Implementation: https://github.com/php/php-src/pull/3886   * Implementation: https://github.com/php/php-src/pull/3886
Line 8: Line 8:
 ===== Introduction ===== ===== Introduction =====
  
-Comparisons between strings and numbers using ''=='' and other non-strict comparison operators currently work by casting the string to a number, and subsequently performing a comparison on integers or floats. This results in many surprising comparison results, the most notable of which is that ''%%0 == "foobar"%%'' returns true. This RFC proposes to make non-strict comparisons more useful and less error prone, by using a number comparison only if the string is actually numeric. Otherwise the number is converted to a string, and a string comparison is performed.+Comparisons between strings and numbers using ''=='' and other non-strict comparison operators currently work by casting the string to a number, and subsequently performing a comparison on integers or floats. This results in many surprising comparison results, the most notable of which is that ''%%0 == "foobar"%%'' returns true. This RFC proposes to make non-strict comparisons more useful and less error prone, by using a number comparison only if the string is actually numeric. Otherwise the number is converted into a string, and a string comparison is performed.
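
The proposed rule can be sketched in userland roughly as follows. This is only an illustration: ''looseEqualsProposed()'' is a hypothetical helper rather than anything the RFC defines, it requires PHP 8.0 for the union type, it covers only ''=='', and it ignores the precision and special-value details discussed later. It also relies on ''is_numeric()'', whose handling of trailing whitespace differs between PHP versions.

<code php>
<?php
// Hypothetical helper, not part of PHP: emulates the proposed rule for ==.
function looseEqualsProposed(int|float $number, string $string): bool
{
    if (is_numeric($string)) {
        // Numeric string: convert it to int or float and compare as numbers.
        return $number == ($string + 0);
    }

    // Otherwise: convert the number to a string and compare as strings.
    return (string) $number === $string;
}

var_dump(looseEqualsProposed(0, "foobar"));  // bool(false)
var_dump(looseEqualsProposed(42, "   42"));  // bool(true)
var_dump(looseEqualsProposed(42, "42abc"));  // bool(false)
</code>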
  
 PHP supports two different types of comparison operators: The strict comparisons ''==='' and ''!=='', and the non-strict comparisons ''=='', ''!='', ''>'', ''>='', ''<'', ''%%<=%%'' and ''%%<=>%%''. The primary difference between them is that strict comparisons require both operands to be of the same type, and do not perform implicit type coercions. However, there are some additional differences: PHP supports two different types of comparison operators: The strict comparisons ''==='' and ''!=='', and the non-strict comparisons ''=='', ''!='', ''>'', ''>='', ''<'', ''%%<=%%'' and ''%%<=>%%''. The primary difference between them is that strict comparisons require both operands to be of the same type, and do not perform implicit type coercions. However, there are some additional differences:
Line 118: Line 118:
                          // Before | After | Type                          // Before | After | Type
 var_dump(42 == "   42"); // true   | true  | well-formed var_dump(42 == "   42"); // true   | true  | well-formed
-var_dump(42 == "42   "); // true   | false | non well-formed+var_dump(42 == "42   "); // true   | false | non well-formed (*)
 var_dump(42 == "42abc"); // true   | false | non well-formed var_dump(42 == "42abc"); // true   | false | non well-formed
 var_dump(42 == "abc42"); // false  | false | non-numeric var_dump(42 == "abc42"); // false  | false | non-numeric
 var_dump( 0 == "abc42"); // true   | false | non-numeric var_dump( 0 == "abc42"); // true   | false | non-numeric
 +// (*) Becomes well-formed if saner numeric strings RFC passes
 </code> </code>
  
-A notable asymmetry under the new semantics is that ''%%"   42"%%'' and ''%%"42   "%%'' compare differently. In my opinion both of these should behave the same and ''%%42 == "42   "%%'' should return true. There is a draft RFC [[rfc:trailing_whitespace_numerics|to allow trailing whitespace]] in numeric strings, which would resolve this issue.+A notable asymmetry under the new semantics is that ''%%"   42"%%'' and ''%%"42   "%%'' compare differently. This inconsistency is being addressed by the [[rfc:saner-numeric-strings|saner numeric strings RFC]].
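
The three categories in the table above can be approximated with a regular expression. The sketch below is an assumption-laden illustration, not the engine's actual algorithm: ''classifyNumericString()'' and its pattern are hypothetical, and the pattern mirrors the rules as they stood before trailing whitespace was allowed.

<code php>
<?php
// Hypothetical helper, not part of PHP: classifies a string the way the
// table above labels it. The pattern is an approximation of the numeric
// string grammar (leading whitespace, sign, digits, fraction, exponent).
function classifyNumericString(string $s): string
{
    $pattern = '/^[ \t\n\r\x0B\f]*[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?/';

    if (!preg_match($pattern, $s, $match)) {
        // No numeric prefix at all.
        return 'non-numeric';
    }

    // The whole string is a number, or only a prefix of it is.
    return strlen($match[0]) === strlen($s) ? 'well-formed' : 'non well-formed';
}

var_dump(classifyNumericString("   42")); // string(11) "well-formed"
var_dump(classifyNumericString("42   ")); // string(15) "non well-formed"
var_dump(classifyNumericString("abc42")); // string(11) "non-numeric"
</code>

Leading whitespace is absorbed by the pattern while trailing whitespace is left over as a non-numeric suffix, which is exactly the asymmetry described above.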
  
-==== Precision and locale ====+==== Precision ====
  
-The reason why the comparison semantics are not simply defined in terms of casting the number to string and performing a non-strict string comparison (even though that is a good way to think about it for most purposes), is that floating-point to string conversions in PHP are subject to two runtime settings: The ''precision'' ini directive, and the decimal separator specified by the active locale.+The reason why the comparison semantics are not simply defined in terms of casting the number to string and performing a non-strict string comparison (even though that is a good way to think about it for most purposes), is that floating-point to string conversions in PHP are subject to the ''precision'' ini directive.
  
-Comparisons with well-formed numeric strings are handled separately to be independent of these runtime settings. However, these settings do have an effect for the case where we fall back to binary string comparison. +Comparisons with well-formed numeric strings are handled separately to be independent of this runtime setting. However, it does have an effect if we fall back to binary string comparison. For example:
- +
-An example of the effect of the ''precision'' ini directive:+
  
 <code php> <code php>
Line 148: Line 147:
 </code> </code>
  
-An example of the effect of ''setlocale()'': +An alternative approach to this issue would be to define that the float to string conversion used for comparisons always uses automatically determined precision (''precision=-1'').
- +
-<code php> +
-$float = 1.75; +
- +
-var_dump($float < "1.6abc"); +
-// Behaves like +
-var_dump("1.75" < "1.6abc"); // false +
- +
-setlocale(LC_NUMERIC, 'de_DE.UTF-8'); +
-var_dump($float < "1.6abc"); +
-// Behaves like +
-var_dump("1,75" < "1.6abc"); // true +
-</code> +
- +
-It should be mentioned that the setlocale() dependence [[https://externals.io/message/103638|may go away]] in PHP 8, but this hasn't been formally proposed yet. +
- +
-An alternative approach to this issue would be to define that the float to string conversion used for comparisons does not respective the locale and always uses automatically determined precision (''precision=-1'').+
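
To make the alternative concrete, the following sketch shows how the ''precision'' setting changes the string a float is converted to, and what a conversion pinned to ''precision=-1'' would produce. ''floatToStringAutoPrecision()'' is a hypothetical userland helper used only for illustration; the RFC does not define such a function.

<code php>
<?php
// Hypothetical helper, not part of PHP: converts a float to a string using
// automatically determined precision, regardless of the configured value.
function floatToStringAutoPrecision(float $value): string
{
    $previous = ini_set('precision', '-1');
    try {
        return (string) $value;
    } finally {
        // Restore the caller's setting.
        if ($previous !== false) {
            ini_set('precision', $previous);
        }
    }
}

ini_set('precision', '14');
var_dump((string) (0.1 + 0.2));                  // string(3) "0.3"
var_dump(floatToStringAutoPrecision(0.1 + 0.2)); // string(19) "0.30000000000000004"
</code>

With such a conversion, the string used in the fallback comparison would no longer depend on the configured ''precision''.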
  
 ==== Special values ==== ==== Special values ====
Line 190: Line 172:
 This change to the semantics of non-strict comparisons is backwards incompatible. Worse, it constitutes a silent change in core language semantics. Code that worked one way in PHP 7.4 will work differently in PHP 8.0. Use of static analysis to detect cases that may be affected is likely to yield many false positives. This change to the semantics of non-strict comparisons is backwards incompatible. Worse, it constitutes a silent change in core language semantics. Code that worked one way in PHP 7.4 will work differently in PHP 8.0. Use of static analysis to detect cases that may be affected is likely to yield many false positives.
  
-One possible way to mitigate the impact is to introduce an ini setting in PHP 7.4, which will perform the comparison using both the old and the new method and emit a deprecation warning if the results differ. This would allow identifying affected code based on production logs.+Testing with [[https://github.com/php/php-src/pull/3917|a warning on comparison result change]] suggests that the practical impact of this change is much lower than one might intuitively expect, but this likely heavily depends on the type of tested codebase.
  
 ===== Vote ===== ===== Vote =====
  
-TBD+Voting starts 2020-07-17 and ends 2020-07-31. A 2/3 majority is required. 
 + 
 +<doodle title="Change string to number comparison semantics as proposed?" auth="nikic" voteType="single" closed="true"> 
 +   * Yes 
 +   * No 
 +</doodle>
  
rfc/string_to_number_comparison.1551183197.txt.gz · Last modified: 2019/02/26 12:13 by nikic