rfc:saner-numeric-strings

Differences

This shows you the differences between two versions of the page.

rfc:saner-numeric-strings [2020/07/15 16:13] – Add missing bitwise example theodorejb
rfc:saner-numeric-strings [2020/07/16 13:18] – Enable voting girgias
Line 12: Line 12:
 A string can be categorised in three ways according to its numeric-ness, as [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#the-string-type|described by the language specification]]:
 
-  * A //numeric string// is a string containing only a [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#grammar-str-number|number]], optionally preceded by white-space characters. For example, <php>"123"</php> or <php>"  1.23e2"</php>
-  * A //leading-numeric string// is a string that begins with a numeric string but is followed by non-number characters  (including white-space characters). For example, <php>"123abc"</php> or <php>"123 "</php>.
+  * A //numeric string// is a string containing only a [[https://github.com/php/php-langspec/blob/be010b4435e7b0801737bb66b5bbdd8f9fb51dde/spec/05-types.md#grammar-str-number|number]], optionally preceded by whitespace characters. For example, <php>"123"</php> or <php>"  1.23e2"</php>
+  * A //leading-numeric string// is a string that begins with a numeric string but is followed by non-number characters  (including whitespace characters). For example, <php>"123abc"</php> or <php>"123 "</php>.
   * A //non-numeric string// is a string which is neither a numeric string nor a leading-numeric string.
  
 A fourth way PHP might deal with numeric strings is when using an //integer// string for an array index.
 An integer string is stricter than a numeric string as it has the following additional constraints:
-  * It doesn't accept leading white-spaces
+  * It doesn't accept leading whitespace
   * It doesn't accept leading zeros (''0'')
  
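As an illustration of the three categories defined in this hunk, here is a minimal sketch of a classifier. It follows the pre-proposal definitions quoted above (leading whitespace allowed, trailing whitespace not). The function name ''classifyNumericString'' and the regular expression are assumptions made for this example; the pattern only approximates the language specification's number grammar and is not PHP's actual implementation.

```php
<?php
// Hypothetical helper, not part of PHP or the RFC.
// Classifies a string using the pre-proposal definitions quoted above.
function classifyNumericString(string $value): string
{
    // Optional leading whitespace, optional sign, digits with an optional
    // fraction (or a bare fraction), and an optional exponent.
    $pattern = '/^[ \t\n\r\x0B\f]*[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?/';

    if (preg_match($pattern, $value, $match) !== 1) {
        return 'non-numeric';
    }

    // If the number covers the whole string it is numeric;
    // if only a prefix matched, the string is leading-numeric.
    return strlen($match[0]) === strlen($value) ? 'numeric' : 'leading-numeric';
}

foreach (["123", "  1.23e2", "123abc", "123 ", "string"] as $sample) {
    printf("%-12s => %s\n", var_export($sample, true), classifyNumericString($sample));
}
```

With these samples the sketch reports ''"123"'' and ''"  1.23e2"'' as numeric, ''"123abc"'' and ''"123 "'' as leading-numeric, and ''"string"'' as non-numeric, matching the examples given in the list.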
Line 27: Line 27:
     "03" => "Integer index with leading 0/octal",
     "2str" => "leading numeric string",
-    " 1" => "leading white-space",
+    " 1" => "leading whitespace",
     "5.5" => "Float",
 ];
Line 43: Line 43:
   string(22) "leading numeric string"
   [" 1"]=>
-  string(19) "leading white-space"
+  string(18) "leading whitespace"
   ["5.5"]=>
   string(5) "Float"
Line 101: Line 101:
 var_dump(123 + "string"); // int(123) with E_WARNING "A non-numeric value encountered"
 </PHP>
-  * Increment/Decrement operators, i.e. <php>++</php> and <php>--</php>, e.g.<PHP>
+  * Increment/decrement operators, i.e. <php>++</php> and <php>--</php>, e.g.<PHP>
 $a = "5";
 var_dump(++$a); // int(6)
Line 134: Line 134:
  
 ===== Proposal =====
-Unify the various numeric string modes into a single concept: Numeric characters only with both leading and trailing white-spaces allowed. Any other type of string is non-numeric and will throw <php>TypeError</php>s when used in a numeric context.
+Unify the various numeric string modes into a single concept: Numeric characters only with both leading and trailing whitespace allowed. Any other type of string is non-numeric and will throw <php>TypeError</php>s when used in a numeric context.
  
-This means, all strings which currently emit the <php>E_NOTICE</php> “A non well formed numeric value encountered” will be reclassified into the <php>E_WARNING</php> “A non-numeric value encountered” //except// if the leading-numeric string contained only trailing white-spaces. And the various cases which currently emit an <php>E_WARNING</php> will be promoted to <php>TypeError</php>s.
+This means, all strings which currently emit the <php>E_NOTICE</php> “A non well formed numeric value encountered” will be reclassified into the <php>E_WARNING</php> “A non-numeric value encountered” //except// if the leading-numeric string contained only trailing whitespace. And the various cases which currently emit an <php>E_WARNING</php> will be promoted to <php>TypeError</php>s.
  
 One exception to this is type declarations, as they only accept proper numeric strings, thus some <php>E_NOTICE</php> will result in a <php>TypeError</php>. See below for an example.
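The reclassification described in this hunk can be observed with a small sketch, assuming a PHP version that implements the proposal (PHP 8.0 or later). The helper name ''describeAddition'' is hypothetical; it adds 1 to a string and reports the result together with any diagnostic that was raised.

```php
<?php
// Hypothetical helper, not part of PHP or the RFC.
// Assumes PHP 8.0+, where the proposed behaviour applies.
function describeAddition(string $value): string
{
    $diagnostic = 'no diagnostic';

    // Record warnings instead of letting PHP print them.
    set_error_handler(function (int $severity, string $message) use (&$diagnostic): bool {
        $diagnostic = $severity === E_WARNING ? 'E_WARNING' : 'other diagnostic';
        return true; // suppress PHP's default handling
    });

    try {
        $outcome = var_export($value + 1, true);
    } catch (TypeError $error) {
        // Non-numeric strings throw in arithmetic under the proposal.
        $outcome = 'TypeError';
    } finally {
        restore_error_handler();
    }

    return "$outcome ($diagnostic)";
}

foreach (["123 ", "123abc", "abc"] as $sample) {
    echo var_export($sample, true), ' + 1: ', describeAddition($sample), "\n";
}
```

Under that assumption, ''"123 "'' is accepted silently, ''"123abc"'' still yields 124 but with an <php>E_WARNING</php>, and ''"abc"'' throws a <php>TypeError</php>.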
Line 153: Line 153:
 foo("123abc"); // TypeError
 </PHP>
-  * <php>\is_numeric</php> will return <php>true</php> for numeric strings with trailing white-spaces<PHP>
-var_dump(is_numeric("123   "));  // bool(true)
+  * <php>\is_numeric</php> will return <php>true</php> for numeric strings with trailing whitespace<PHP>
+var_dump(is_numeric("123   ")); // bool(true)
 </PHP>
   * String offsets<PHP>
Line 167: Line 167:
 var_dump(123 + "string"); // TypeError
 </PHP>
-  * The <php>++</php> and <php>--</php> operators would convert numeric strings with trailing white-space to integers or floats, as appropriate, rather than applying the alphanumeric increment rules<PHP>
+  * The <php>++</php> and <php>--</php> operators would convert numeric strings with trailing whitespace to integers or floats, as appropriate, rather than applying the alphanumeric increment rules<PHP>
 $d = "5 ";
 var_dump(++$d); // int(6)
Line 188: Line 188:
 ===== Backward Incompatible Changes =====
 There are three backward incompatible changes:
-  * Code relying on numerical strings with trailing white-spaces to be considered non-well-formed.
+  * Code relying on numerical strings with trailing whitespace to be considered non-well-formed.
   * Code with liberal use of leading-numeric strings might need to use explicit type casts.
-  * Code relying on the fact that <php>''</php> (an empty string) evaluates to <php>0</php> for arithmetic/bitwise operations
+  * Code relying on the fact that <php>''</php> (an empty string) evaluates to <php>0</php> for arithmetic/bitwise operations.
  
 The first reason is a precise requirement and therefore should be checked explicitly. A small poly-fill to check for the previous <php>is_numeric()</php> behaviour:
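The RFC's own poly-fill falls outside the lines shown in this comparison. The following is only a sketch of one way such a check could be written, not the RFC's code: the function name ''is_numeric_without_trailing_whitespace'' is hypothetical, and the whitespace set passed to ''rtrim()'' is an assumption about which characters count as trailing whitespace.

```php
<?php
// Hypothetical poly-fill, not taken from the RFC.
// Mimics the previous behaviour: numeric strings with trailing
// whitespace are rejected, while leading whitespace is still accepted.
function is_numeric_without_trailing_whitespace($value): bool
{
    if (!is_string($value)) {
        // Integers and floats are unaffected by the proposal.
        return is_numeric($value);
    }

    // Assumed whitespace set: space, \t, \n, \r, \v, \f.
    return is_numeric($value) && rtrim($value, " \t\n\r\v\f") === $value;
}

var_dump(is_numeric_without_trailing_whitespace("123"));
var_dump(is_numeric_without_trailing_whitespace(" 123"));
var_dump(is_numeric_without_trailing_whitespace("123 "));
```

Because the extra ''rtrim()'' comparison only ever rejects strings, the sketch gives the same answers whether or not the running PHP version already accepts trailing whitespace in <php>is_numeric()</php>.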
Line 197: Line 197:
 Breaking the second reason will allow to catch various bugs ahead of time, and the previous behaviour can be obtained by adding explicit casts, e.g.:
 <PHP>
-var_dump((int) "2px"); // int(2)
-var_dump((float) "2px"); // float(2)
-var_dump((int) "2.5px"); // int(2)
+var_dump((int) "2px");     // int(2)
+var_dump((float) "2px");   // float(2)
+var_dump((int) "2.5px");   // int(2)
 var_dump((float) "2.5px"); // float(2.5)
 </PHP>
Line 220: Line 220:
 ===== Future Scope =====
   * Nikita Popov's [[rfc:string_to_number_comparison|PHP RFC: Saner string to number comparisons]]
-  * Adding an E_NOTICE for numerical strings with leading/trailing white-spaces
-  * Adding a flag to <php>\is_numeric</php> to accept or reject numerical strings with leading/trailing white-spaces
+  * Adding an E_NOTICE for numerical strings with leading/trailing whitespace
+  * Adding a flag to <php>\is_numeric</php> to accept or reject numeric strings with leading/trailing whitespace
   * Align string offset behaviour with array offsets
   * Promote remaining warnings to Type Errors in PHP 9
Line 228: Line 228:
 ===== Proposed Voting Choices =====
 Per the Voting RFC, there would be a single Yes/No vote requiring a 2/3 majority for the main proposal. A secondary Yes/No vote requiring a 50%+1 majority will decide whether float strings used as string offsets should continue to produce a warning (with different wording) instead of consistently becoming a TypeError.
+
+Primary vote:
+<doodle title="Accept Saner numeric string RFC proposal" auth="girgias" voteType="single" closed="false">
+   * Yes
+   * No
+</doodle>
+
+Secondary vote:
+<doodle title="Should valid float strings for string offsets remain a warning" auth="girgias" voteType="single" closed="false">
+   * Yes
+   * No
+</doodle>
 
 ===== Patches and Tests =====
rfc/saner-numeric-strings.txt · Last modified: 2020/11/25 12:46 by girgias