rfc:saner-numeric-strings

Differences

This shows you the differences between two versions of the page.

rfc:saner-numeric-strings [2020/07/14 15:19]
theodorejb Improve wording
rfc:saner-numeric-strings [2020/08/01 23:37]
carusogabriel RFC was implemented
Line 2: Line 2:
   * Version: 1.4   * Version: 1.4
   * Date: 2020-06-28   * Date: 2020-06-28
 +  * Original Author: Andrea Faulds <ajf@ajf.me>
 +  * Original RFC: [[http://wiki.php.net/rfc/trailing_whitespace_numerics|PHP RFC: Permit trailing whitespace in numeric strings]]
   * Author: George Peter Banyard <girgias@php.net>   * Author: George Peter Banyard <girgias@php.net>
-  * Status: Under Discussion+  * Status: Implemented
   * First Published at: http://wiki.php.net/rfc/saner-numeric-strings   * First Published at: http://wiki.php.net/rfc/saner-numeric-strings
   * Implementation: https://github.com/php/php-src/pull/5762   * Implementation: https://github.com/php/php-src/pull/5762
Line 12: Line 14:
 A string can be categorised in three ways according to its numeric-ness, as [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#the-string-type|described by the language specification]]: A string can be categorised in three ways according to its numeric-ness, as [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#the-string-type|described by the language specification]]:
  
-  * A //numeric string// is a string containing only a [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#grammar-str-number|number]], optionally preceded by white-space characters. For example, <php>"123"</php> or <php>"  1.23e2"</php>+  * A //numeric string// is a string containing only a [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#grammar-str-number|number]], optionally preceded by whitespace characters. For example, <php>"123"</php> or <php>"  1.23e2"</php>
-  * A //leading-numeric string// is a string that begins with a numeric string but is followed by non-number characters  (including white-space characters). For example, <php>"123abc"</php> or <php>"123 "</php>.+  * A //leading-numeric string// is a string that begins with a numeric string but is followed by non-number characters  (including whitespace characters). For example, <php>"123abc"</php> or <php>"123 "</php>.
   * A //non-numeric string// is a string which is neither a numeric string nor a leading-numeric string.   * A //non-numeric string// is a string which is neither a numeric string nor a leading-numeric string.
  
 A fourth way PHP might deal with numeric strings is when using an //integer// string for an array index. A fourth way PHP might deal with numeric strings is when using an //integer// string for an array index.
 An integer string is stricter than a numeric string as it has the following additional constraints: An integer string is stricter than a numeric string as it has the following additional constraints:
-  * It doesn't accept leading white-spaces+  * It doesn't accept leading whitespace
   * It doesn't accept leading zeros (''0'')   * It doesn't accept leading zeros (''0'')
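
The three categories above can be sketched as a small classifier. This is only an illustration: ''classify_numericness()'' is a hypothetical helper, the regular expressions approximate the language specification's number grammar, and the whitespace set (space, tab, newline, carriage return, vertical tab, form feed) is an assumption. It models the rules as they stood before this RFC, so trailing whitespace still makes a string leading-numeric.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP or the RFC): classify a string using
 * the three language-specification categories, i.e. the rules in force
 * before this RFC, where trailing whitespace is NOT allowed.
 */
function classify_numericness(string $s): string
{
    // Whitespace set assumed here: space, \t, \n, \r, \v (0x0B), \f.
    $ws  = '[ \t\n\r\x0B\f]*';
    // Optional sign, digits with an optional fraction (or a bare fraction),
    // and an optional exponent.
    $num = '[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?';

    // The whole string is whitespace followed by a number.
    if (preg_match('/\A' . $ws . $num . '\z/', $s) === 1) {
        return 'numeric';
    }
    // Only a prefix of the string is whitespace followed by a number.
    if (preg_match('/\A' . $ws . $num . '/', $s) === 1) {
        return 'leading-numeric';
    }
    return 'non-numeric';
}

// Print the category of each example string used in the list above.
foreach (['123', '  1.23e2', '123abc', '123 ', 'string'] as $s) {
    printf("%-12s %s\n", var_export($s, true), classify_numericness($s));
}
```

Anchoring with ''\A'' and ''\z'' rather than ''^'' and ''$'' avoids PCRE's special handling of a final newline, which matters here because a trailing newline is exactly the kind of trailing whitespace being distinguished.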
  
Line 27: Line 29:
     "03" => "Integer index with leading 0/octal",     "03" => "Integer index with leading 0/octal",
     "2str" => "leading numeric string",     "2str" => "leading numeric string",
-    " 1" => "leading white-space",+    " 1" => "leading whitespace",
     "5.5" => "Float",     "5.5" => "Float",
 ]; ];
Line 43: Line 45:
   string(22) "leading numeric string"   string(22) "leading numeric string"
   [" 1"]=>   [" 1"]=>
-  string(19) "leading white-space"+  string(19) "leading whitespace"
   ["5.5"]=>   ["5.5"]=>
   string(5) "Float"   string(5) "Float"
Line 101: Line 103:
 var_dump(123 + "string"); // int(123) with E_WARNING "A non-numeric value encountered" var_dump(123 + "string"); // int(123) with E_WARNING "A non-numeric value encountered"
 </PHP> </PHP>
-  * Increment/Decrement operators, i.e. <php>++</php> and <php>--</php>, e.g.<PHP>+  * Increment/decrement operators, i.e. <php>++</php> and <php>--</php>, e.g.<PHP>
 $a = "5"; $a = "5";
 var_dump(++$a); // int(6) var_dump(++$a); // int(6)
Line 119: Line 121:
   * Bitwise operations, e.g.<PHP>   * Bitwise operations, e.g.<PHP>
 var_dump(123 & "123");    // int(123) var_dump(123 & "123");    // int(123)
 +var_dump(123 & "  123");  // int(123)
 var_dump(123 & "123  ");  // int(123) with E_NOTICE "A non well formed numeric value encountered" var_dump(123 & "123  ");  // int(123) with E_NOTICE "A non well formed numeric value encountered"
 var_dump(123 & "123abc"); // int(123) with E_NOTICE "A non well formed numeric value encountered" var_dump(123 & "123abc"); // int(123) with E_NOTICE "A non well formed numeric value encountered"
Line 133: Line 136:
  
 ===== Proposal ===== ===== Proposal =====
-Unify the various numeric string modes into a single concept: Numeric characters only with both leading and trailing white-spaces allowed. Any other type of string is non-numeric and will throw <php>TypeError</php>s when used in a numeric context.+Unify the various numeric string modes into a single concept: Numeric characters only with both leading and trailing whitespace allowed. Any other type of string is non-numeric and will throw <php>TypeError</php>s when used in a numeric context.
  
-This means, all strings which currently emit the <php>E_NOTICE</php> “A non well formed numeric value encountered” will de reclassified into the <php>E_WARNING</php> “A non-numeric value encountered” //except// if the leading-numeric string contained only trailing white-spaces. And the various cases which currently emit an <php>E_WARNING</php> will be promoted to <php>TypeError</php>s.+This means, all strings which currently emit the <php>E_NOTICE</php> “A non well formed numeric value encountered” will be reclassified into the <php>E_WARNING</php> “A non-numeric value encountered” //except// if the leading-numeric string contained only trailing whitespace. And the various cases which currently emit an <php>E_WARNING</php> will be promoted to <php>TypeError</php>s.
  
 One exception to this is type declarations: they only accept proper numeric strings, so some cases that currently emit an <php>E_NOTICE</php> will result in a <php>TypeError</php>. See below for an example. One exception to this is type declarations: they only accept proper numeric strings, so some cases that currently emit an <php>E_NOTICE</php> will result in a <php>TypeError</php>. See below for an example.
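
The reclassification in the Proposal can be sketched as a function that predicts the outcome for a string operand of an arithmetic or bitwise operator. ''proposed_arithmetic_outcome()'' is a hypothetical helper, and the whitespace set is an assumption. The sketch deliberately ignores the type-declaration exception mentioned above, and it does not depend on which PHP version runs it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP or the RFC): predict how a string
 * operand of an arithmetic/bitwise operator is treated under the proposal.
 * Returns 'ok', 'E_WARNING' or 'TypeError'.
 */
function proposed_arithmetic_outcome(string $s): string
{
    $ws = " \t\n\r\v\f"; // whitespace set assumed for this sketch

    // Numeric: a number with optional leading AND trailing whitespace.
    // is_numeric() already accepts leading whitespace, so stripping the
    // trailing whitespace first gives the same answer on older PHP versions.
    if (is_numeric(rtrim($s, $ws))) {
        return 'ok';
    }

    // Leading-numeric: optional whitespace and sign, then a digit or ".digit".
    if (preg_match('/\A[ \t\n\r\x0B\f]*[+-]?\.?\d/', $s) === 1) {
        return 'E_WARNING'; // "A non-numeric value encountered"
    }

    // Everything else, including the empty string, is non-numeric.
    return 'TypeError';
}

// Show the predicted outcome for a few representative operands.
foreach (['123', '  123', '123  ', '123abc', 'string', ''] as $s) {
    printf("%-10s %s\n", var_export($s, true), proposed_arithmetic_outcome($s));
}
```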
Line 143: Line 146:
   * Leading numeric strings will emit the “Illegal string offset” warning instead of the “A non well formed numeric value encountered” notice, and continue to evaluate to their respective values.   * Leading numeric strings will emit the “Illegal string offset” warning instead of the “A non well formed numeric value encountered” notice, and continue to evaluate to their respective values.
   * Non-numeric strings which emitted the “Illegal string offset” warning will throw an “Illegal offset type” TypeError.   * Non-numeric strings which emitted the “Illegal string offset” warning will throw an “Illegal offset type” TypeError.
-  * There is a secondary implementation vote to decide the following: should numeric strings which correspond to well-formed floats remain a warning (by emitting the same “String offset cast occurred” warning that occurs when a float is used for a string offset), or should the current “Illegal string offset” warning simply be promoted to a <php>TypeError</php>? Our position is that this case should be a TypeError, as it simplifies the implementation and is consistent with the handling of other strings (see this [[https://github.com/php/php-src/pull/5762/commits/788a6963c1343d53dadc23fb2983224be9ba4c04|commit]]).+  * There is a secondary implementation vote to decide the following: should numeric strings which correspond to well-formed floats remain a warning (by emitting the same “String offset cast occurred” warning that occurs when a float is used for a string offset), or should the current “Illegal string offset” warning simply be promoted to a <php>TypeError</php>? Our position is that this case should be a TypeError, as it simplifies the implementation and is consistent with the handling of other strings (see this [[https://github.com/php/php-src/pull/5762/commits/897c37727b1ee393f04f57a88fc48d69c3cf0d1d|commit]]).
  
  
Line 152: Line 155:
 foo("123abc"); // TypeError foo("123abc"); // TypeError
 </PHP> </PHP>
-  * <php>\is_numeric</php> will return <php>true</php> for numeric strings with trailing white-spaces<PHP> +  * <php>\is_numeric</php> will return <php>true</php> for numeric strings with trailing whitespace<PHP> 
-var_dump(is_numeric("123   "));  // bool(true)+var_dump(is_numeric("123   ")); // bool(true)
 </PHP> </PHP>
   * String offsets<PHP>   * String offsets<PHP>
Line 166: Line 169:
 var_dump(123 + "string"); // TypeError var_dump(123 + "string"); // TypeError
 </PHP> </PHP>
-  * The <php>++</php> and <php>--</php> operators would convert numeric strings with trailing white-space to integers or floats, as appropriate, rather than applying the alphanumeric increment rules<PHP>+  * The <php>++</php> and <php>--</php> operators would convert numeric strings with trailing whitespace to integers or floats, as appropriate, rather than applying the alphanumeric increment rules<PHP>
 $d = "5 "; $d = "5 ";
 var_dump(++$d); // int(6) var_dump(++$d); // int(6)
Line 187: Line 190:
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
 There are three backward incompatible changes: There are three backward incompatible changes:
-  * Code relying on numerical strings with trailing white-spaces to be considered non-well-formed.+  * Code relying on numerical strings with trailing whitespace to be considered non-well-formed.
   * Code with liberal use of leading-numeric strings might need to use explicit type casts.   * Code with liberal use of leading-numeric strings might need to use explicit type casts.
-  * Code relying on the fact that <php>''</php> (an empty string) evaluates to <php>0</php> for arithmetic/bitwise operations+  * Code relying on the fact that <php>''</php> (an empty string) evaluates to <php>0</php> for arithmetic/bitwise operations.
  
 The first reason is a precise requirement and therefore should be checked explicitly. A small poly-fill to check for the previous <php>is_numeric()</php> behaviour: The first reason is a precise requirement and therefore should be checked explicitly. A small poly-fill to check for the previous <php>is_numeric()</php> behaviour:
Line 196: Line 199:
 Breaking the second reason makes it possible to catch various bugs ahead of time, and the previous behaviour can be obtained by adding explicit casts, e.g.: Breaking the second reason makes it possible to catch various bugs ahead of time, and the previous behaviour can be obtained by adding explicit casts, e.g.:
 <PHP> <PHP>
-var_dump((int) "2px"); // int(2) +var_dump((int) "2px");     // int(2) 
-var_dump((float) "2px"); // float(2) +var_dump((float) "2px");   // float(2) 
-var_dump((int) "2.5px"); // int(2)+var_dump((int) "2.5px");   // int(2)
 var_dump((float) "2.5px"); // float(2.5) var_dump((float) "2.5px"); // float(2.5)
 </PHP> </PHP>
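
For the first incompatibility (code that relies on trailing whitespace making a string non-well-formed), the requirement can be checked explicitly. The following is a minimal sketch of one way to do that. The function name and approach are assumptions and not necessarily the RFC's own polyfill, as is the whitespace set passed to ''rtrim()''.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical polyfill (name and approach are assumptions, not the RFC's):
 * is_numeric() as it behaved before trailing whitespace was permitted.
 *
 * @param mixed $value
 */
function is_numeric_without_trailing_whitespace($value): bool
{
    if (!is_numeric($value)) {
        return false;
    }
    // Ints and floats are unaffected; only strings need the extra check.
    return !is_string($value) || $value === rtrim($value, " \t\n\r\v\f");
}

var_dump(is_numeric_without_trailing_whitespace("123"));    // bool(true)
var_dump(is_numeric_without_trailing_whitespace(" 123"));   // bool(true)
var_dump(is_numeric_without_trailing_whitespace("123   ")); // bool(false)
```

Because the extra comparison only ever rejects strings, the helper gives the same result whether or not the running ''is_numeric()'' already accepts trailing whitespace.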
Line 219: Line 222:
 ===== Future Scope ===== ===== Future Scope =====
   * Nikita Popov's [[rfc:string_to_number_comparison|PHP RFC: Saner string to number comparisons]]   * Nikita Popov's [[rfc:string_to_number_comparison|PHP RFC: Saner string to number comparisons]]
-  * Adding an E_NOTICE for numerical strings with leading/trailing white-spaces +  * Adding an E_NOTICE for numerical strings with leading/trailing whitespace 
-  * Adding a flag to <php>\is_numeric</php> to accept or reject numerical strings with leading/trailing white-spaces+  * Adding a flag to <php>\is_numeric</php> to accept or reject numeric strings with leading/trailing whitespace
   * Align string offset behaviour with array offsets   * Align string offset behaviour with array offsets
   * Promote remaining warnings to Type Errors in PHP 9   * Promote remaining warnings to Type Errors in PHP 9
   * Warn on illegal offsets when used within <php>isset()</php> or <php>empty()</php>   * Warn on illegal offsets when used within <php>isset()</php> or <php>empty()</php>
  
-===== Proposed Voting Choices ===== +===== Vote ===== 
-Per the Voting RFC, there would be a single Yes/No vote requiring a 2/3 majority for the main proposal. A secondary Yes/No vote requiring a 50%+1 majority will decide whether float strings used as string offsets should continue to produce a warning (with different wording) instead of consistently becoming a TypeError.+Per the Voting RFC, there is a single Yes/No vote requiring a 2/3 majority for the main proposal. A secondary Yes/No vote requiring a 50%+1 majority will decide whether float strings used as string offsets should continue to produce a warning (with different wording) instead of consistently becoming a TypeError. 
 + 
 +Primary vote: 
 +<doodle title="Accept Saner numeric string RFC proposal" auth="girgias" voteType="single" closed="true"> 
 +   * Yes 
 +   * No 
 +</doodle> 
 + 
 +Secondary vote: 
 +<doodle title="Should valid float strings for string offsets remain a warning" auth="girgias" voteType="single" closed="true"> 
 +   * Yes 
 +   * No 
 +</doodle>
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
rfc/saner-numeric-strings.txt · Last modified: 2020/11/25 12:46 by girgias