rfc:null_coercion_consistency

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Last revisionBoth sides next revision
rfc:null_coercion_consistency [2022/05/05 11:21] – Add note about arithmetics craigfrancisrfc:null_coercion_consistency [2022/10/16 07:37] – Note that Rector can now "help" with this problem. craigfrancis
Line 45: Line 45:
   - [[https://www.php.net/manual/en/language.types.boolean.php|To Boolean]]: "When converting to bool, the following values are considered false [...] the special type NULL"   - [[https://www.php.net/manual/en/language.types.boolean.php|To Boolean]]: "When converting to bool, the following values are considered false [...] the special type NULL"
  
-For example:+==== Current State ====
  
 <code php> <code php>
Line 106: Line 106:
 </code> </code>
  
-Arrays, Resources, and Objects (without //__toString()//) cannot be coerced (for fairly obvious reasons).+Arrays, Resources, and Objects (without toString) cannot be coerced (for fairly obvious reasons).
  
 String/Int/Float/Bool can be coerced. String/Int/Float/Bool can be coerced.
Line 115: Line 115:
   - PHP 8.1 updated internal function parameters to work in the same way.   - PHP 8.1 updated internal function parameters to work in the same way.
  
-==== Examples ====+==== Scalar Types ====
  
-A simple search page:+[[https://news-web.php.net/php.internals/117523|George Peter Banyard]] notes that "Userland scalar types [...] did not include coercion from NULL for //very// good reasons". The only reason mentioned in [[https://wiki.php.net/rfc/scalar_type_hints_v5|Scalar Type Declarations]] is "to be consistent with our existing type declarations" (no further details given).
  
-<code php> +The RFC also says "it should be possible for existing userland libraries to add scalar type declarations without breaking compatibility", but this is not the case, because of NULL. This has made adoption of type declarations harder, as it does not work like the following:
-$name = $request->get('name');+
  
-if (trim($name) === ''{ // Contains non-whitespace charactersso not "", " ", NULL, etc +<code php> 
-  $where_sql[] 'name LIKE ?'+function my_function($s, $i, $f, $b) { 
-  $where_val[] = $name;+  $s = strval($s)
 +  $i intval($i); 
 +  $floatval($f)
 +  $boolval($b); 
 +  var_dump($s, $i, $f, $b);
 } }
  
-echo ' +function my_function(string $s, int $i, float $f, bool $b) { 
-  <form action="./" method="get"> +  var_dump($s, $i, $f, $b);
-    <label> +
-      Search +
-      <input type="search" name="name" value="' . htmlspecialchars($name) . '"> +
-    </label> +
-    <input type="submit" value="Go"> +
-  </form>'; +
- +
-if ($name !== NULL) { +
-  $register_url = '/admin/accounts/add/?name=' . urlencode($name); +
-  echo ' +
-    <p><a href="' . htmlspecialchars($register_url. '">Add Account</a></p>';+
 } }
 +
 +my_function(NULL, NULL, NULL, NULL);
 </code> </code>
  
-Note how line 1 cannot simply be updated to force //$name// to be non-nullableas it would break the "Add Account" link. But does passing NULL to //htmlspecialchars()// and //trim()// justify a Fatal Type Error?+Some developers view NULL as a missing/invalid valueand passing NULL to a function like //htmlspecialchars()// could indicate a problem (can a be useful check for static analysis, or in the context of //strict_types=1//).
  
-I'd argue this very strict level of type checking is best done by Static Analysis, which can check if a variable can be nullable, and it can decide if this is a problem, in the same way that a string (e.g. '15') being provided to integer parameter could be seen as a problem.+==== Examples ====
  
 Common sources of NULL: Common sources of NULL:
Line 161: Line 155:
 $search = $request->getGet('q'); // CodeIgniter $search = $request->getGet('q'); // CodeIgniter
  
-$value = array_pop($empty_array); 
 $value = mysqli_fetch_row($result); $value = mysqli_fetch_row($result);
 $value = json_decode($json); // Invalid JSON, or nesting limit. $value = json_decode($json); // Invalid JSON, or nesting limit.
 +$value = array_pop($empty_array);
 </code> </code>
  
Line 169: Line 163:
  
 <code php> <code php>
 +$rounded_value = round($value);
 +
 $search_trimmed = trim($search); $search_trimmed = trim($search);
  
Line 199: Line 195:
 mail('nobody@example.com', 'subject', 'message', NULL, '-fwebmaster@example.com'); mail('nobody@example.com', 'subject', 'message', NULL, '-fwebmaster@example.com');
 </code> </code>
 +
 +HTML Templating engines like [[https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119|Laravel Blade]] suppress this deprecation with null-coalescing ([[https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118|patch]]); or [[https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195|Symphony Twig]] which preserves NULL, but it's often passed to //echo// (which accepts it, despite the [[https://www.php.net/echo|echo documentation]] saying it accepts non-nullable strings).
 +
 +I'd argue a very strict level of type checking (that prevents all forms of coercion) is best done by Static Analysis, which can check if a variable can be nullable, and it can decide if this is a problem, in the same way that a string (e.g. '15') being provided to integer parameter could be seen as a problem.
  
 There are approximately [[https://github.com/craigfrancis/php-allow-null-rfc/blob/main/functions-change.md|335 parameters affected by this deprecation]]. There are approximately [[https://github.com/craigfrancis/php-allow-null-rfc/blob/main/functions-change.md|335 parameters affected by this deprecation]].
Line 210: Line 210:
 It is possible to use very strict Static Analysis, to follow every variable from source to sink (to check if a variable could be NULL), but most developers are not in a position to do this (i.e. not using static analysis, or not at a high enough level, or they are using a baseline to ignore). It is possible to use very strict Static Analysis, to follow every variable from source to sink (to check if a variable could be NULL), but most developers are not in a position to do this (i.e. not using static analysis, or not at a high enough level, or they are using a baseline to ignore).
  
-In the last JetBrains developer survey (with 67% regularly using Laravel), **only 33% used Static Analysis** ([[https://www.jetbrains.com/lp/devecosystem-2021/php/#PHP_do-you-use-static-analysis|source]]); where it's fair to many still would still not be identify these possible NULL values (too low level, and/or using a baseline).+In the last JetBrains developer survey (with 67% regularly using Laravel), **only 33% used Static Analysis** ([[https://www.jetbrains.com/lp/devecosystem-2021/php/#PHP_do-you-use-static-analysis|source]]); where it's fair to say many of these developers would //still// not identify these possible NULL values (too low level, and/or using a baseline).
  
 As an example, take this simple script: As an example, take this simple script:
Line 222: Line 222:
 </code> </code>
  
-Even that is considered fine today by the relevant tools: +This is considered fine by these tools:
- +
-<code cli> +
-composer require --dev rector/rector +
-./vendor/bin/rector init +
-./vendor/bin/rector process ./src/ +
-[OK] Rector is done! +
-</code>+
  
 <code cli> <code cli>
Line 294: Line 287:
 </code> </code>
 Note: Psalm can detect this at [[https://psalm.dev/docs/running_psalm/error_levels/|levels 1, 2, and 3]] (don't use a baseline). Note: Psalm can detect this at [[https://psalm.dev/docs/running_psalm/error_levels/|levels 1, 2, and 3]] (don't use a baseline).
 +
 +==== One Solution ====
 +
 +Since [[https://github.com/rectorphp/rector-src/pull/2543|21st June 2022]], Rector can modify 362 function arguments via //NullToStrictStringFuncCallArgRector//:
 +
 +<code cli>
 +mkdir -p rector/src;
 +
 +cd rector/;
 +
 +composer require --dev rector/rector;
 +
 +echo '<?= htmlspecialchars($var) ?>' > src/index.php;
 +
 +echo '<?php
 +
 +use Rector\Php81\Rector\FuncCall\NullToStrictStringFuncCallArgRector;
 +use Rector\Config\RectorConfig;
 +
 +return static function (RectorConfig $rectorConfig): void {
 +    $rectorConfig->paths([__DIR__ . "/src"]);
 +    $rectorConfig->rule(NullToStrictStringFuncCallArgRector::class);
 +};
 +' > rector.php;
 +
 +./vendor/bin/rector process;
 +</code>
 +
 +This will litter the code with the use of //(string)// type casting, e.g.
 +
 +<code diff>
 +-<?= htmlspecialchars($var) ?>
 ++<?= htmlspecialchars((string) $var) ?>
 +</code>
 +
 +For a typical project (which won't be using strict_types), expect thousands of changes to be made; and note how this does not improve code quality.
  
 ==== Temporary Solutions ==== ==== Temporary Solutions ====
Line 329: Line 358:
 </code> </code>
  
-As noted above - PHPCompatibility, CodeSniffer, Rector, etc are unable to find or update these cases.+As noted above - Rector can add //(string)// type casting automaticallybut I have no idea how this improves code quality.
  
 ===== Proposal ===== ===== Proposal =====
Line 341: Line 370:
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
  
-None+While the intention of this RFC is to avoid a BC break; for user defined functions to be updated to also coerce NULL (instead of throwing a Type Error), it's possible some code may rely on that behaviour, for example: 
 + 
 +<code php> 
 +function my_function(string $my_string) { 
 +  var_dump($my_string); 
 +
 + 
 +try { 
 +  my_function('A');   // string(1) "A" 
 +  my_function(1);     // string(1) "1" 
 +  my_function(1.2);   // string(3) "1.2" 
 +  my_function(true);  // string(1) "1" 
 +  my_function(false); // string(0) "" 
 +  my_function(NULL);  // Throw Type Error 
 +} catch (TypeError $e) { 
 +  // Do something important? 
 +
 +</code>
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 364: Line 410:
  
 "it's a bit late" - We only have a deprecation at the moment (which can and is being ignored), it will be "too late" when PHP 9.0 uses Fatal Errors. "it's a bit late" - We only have a deprecation at the moment (which can and is being ignored), it will be "too late" when PHP 9.0 uses Fatal Errors.
- 
-"Userland scalar types [...] did not include coercion from NULL for //very// good reasons." - The only reason mentioned in [[https://wiki.php.net/rfc/scalar_type_hints_v5|Scalar Type Declarations]] is "to be consistent with our existing type declarations" (no further details given). Talking to developers, the only reason mentioned is noted above, where NULL can be viewed as a missing/invalid value, and passing NULL to a function like //htmlspecialchars()// could indicate a problem (which can a be useful check for static analysis, or in the context of //strict_types=1//). 
  
 The function //mt_rand()// can be called with no arguments, or with min and max integer arguments. A developer may call //mt_rand(NULL, NULL)// and expect it to work the same as no arguments (returning a random number between 0 and //mt_getrandmax()//), but the NULL's would be coerced to 0, so it would always return 0. That said, I cannot find any public examples of this happening ([[https://grep.app/search?q=mt_rand%28NULL&filter%5Blang%5D%5B0%5D=PHP|1]], [[https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL&regexp=true&filter[lang][0]=PHP|2]], [[https://www.google.com/search?q=%22mt_rand+NULL+NULL%22|3]]). The function //mt_rand()// can be called with no arguments, or with min and max integer arguments. A developer may call //mt_rand(NULL, NULL)// and expect it to work the same as no arguments (returning a random number between 0 and //mt_getrandmax()//), but the NULL's would be coerced to 0, so it would always return 0. That said, I cannot find any public examples of this happening ([[https://grep.app/search?q=mt_rand%28NULL&filter%5Blang%5D%5B0%5D=PHP|1]], [[https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL&regexp=true&filter[lang][0]=PHP|2]], [[https://www.google.com/search?q=%22mt_rand+NULL+NULL%22|3]]).
Line 377: Line 421:
   - //$method// in [[https://php.net/method_exists|method_exists()]]   - //$method// in [[https://php.net/method_exists|method_exists()]]
   - //$json// in [[https://php.net/json_decode|json_decode()]]   - //$json// in [[https://php.net/json_decode|json_decode()]]
 +
 +It might be appropriate for coercion and explicit casting/converting to work in the same way, even if they were to become stricter in the values they accept; e.g. //intval("")// and //((int) "")// currently return int(0), whereas //(5 + "")// results in a TypeError.
  
 ===== Voting ===== ===== Voting =====
Line 394: Line 440:
 ===== Notes ===== ===== Notes =====
  
-The **15%** of scripts that do not use //strict_types=1// was calculated using [[https://grep.app/|grep.app]], to "search across a half million git repos", were each result is a script (not a count of matches,  [[https://grep.app/search?q=defuse/php-encryption&filter[lang][0]=PHP|example]]). We can see [[https://grep.app/search?q=strict_types&filter[lang][0]=PHP|272,871]] scripts using //strict_types=1//, out of [[https://grep.app/search?q=php&filter[lang][0]=PHP|1,842,666]]. And keep in mind that [[https://grep.app/search?q=class%20wpdb%20%7B&filter[lang][0]=PHP|WordPress only really appears once]], it is [[https://make.wordpress.org/core/2022/01/10/wordpress-5-9-and-php-8-0-8-1/#php-8-1-deprecation-passing-null-to-non-nullable-php-native-functions-parameters|affected by this deprecation]], and is installed/used by many.+The **15%** of scripts that use //strict_types=1// was calculated using [[https://grep.app/|grep.app]], to "search across a half million git repos", were each result is a script (not a count of matches,  [[https://grep.app/search?q=defuse/php-encryption&filter[lang][0]=PHP|example]]). We can see [[https://grep.app/search?q=strict_types&filter[lang][0]=PHP|272,871]] scripts using //strict_types=1//, out of [[https://grep.app/search?q=php&filter[lang][0]=PHP|1,842,666]]. And keep in mind that [[https://grep.app/search?q=class%20wpdb%20%7B&filter[lang][0]=PHP|WordPress only really appears once]], it is [[https://make.wordpress.org/core/2022/01/10/wordpress-5-9-and-php-8-0-8-1/#php-8-1-deprecation-passing-null-to-non-nullable-php-native-functions-parameters|affected by this deprecation]], and is installed/used by many.
  
 In the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations]] RFC for PHP 7.0, scalar types were defined as "int, float, string and bool" - but, despite NULL also being a simple value (i.e. not an array/object/resource), it was not included in this definition. For backwards compatibility reasons this definition is unlikely to change. In the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations]] RFC for PHP 7.0, scalar types were defined as "int, float, string and bool" - but, despite NULL also being a simple value (i.e. not an array/object/resource), it was not included in this definition. For backwards compatibility reasons this definition is unlikely to change.
rfc/null_coercion_consistency.txt · Last modified: 2023/10/18 11:57 by craigfrancis