rfc:null_coercion_consistency

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Last revisionBoth sides next revision
rfc:null_coercion_consistency [2022/04/15 07:34] craigfrancisrfc:null_coercion_consistency [2022/10/16 07:37] – Note that Rector can now "help" with this problem. craigfrancis
Line 14: Line 14:
 ===== Introduction ===== ===== Introduction =====
  
-PHP 8.1 introduced [[https://wiki.php.net/rfc/deprecate_null_to_scalar_internal_arg|Deprecate passing null to non-nullable arguments of internal functions]]. While the consistency introduced by the RFC is welcome (user-defined vs internal functions), for those not using //strict_types=1//, the partial breaking of NULL coercion creates an upgrade problem.+PHP 8.1 introduced [[https://wiki.php.net/rfc/deprecate_null_to_scalar_internal_arg|Deprecate passing null to non-nullable arguments of internal functions]]. While the consistency is welcome (user-defined vs internal functions), for those **not** using strict Static Analysis or //strict_types=1//, the breaking of NULL coercion creates an upgrade problem.
  
-This RFC does **not** change anything for scripts using //strict_types=1//, as type checks are useful in that context. For example, developers who view NULL as a missing/invalid value (not as a value in itself), consider passing NULL to a function like //htmlspecialchars()// as something that indicates a problem for them.+This RFC does **not** change anything for //strict_types=1// (or strict static analysis), as strict type checks can be useful. For example, developers can view NULL as a missing/invalid value (not as a value in itself), and passing NULL to a function like //htmlspecialchars()// could indicate a problem.
  
-Roughly **85%** scripts do not use //strict_types=1// (calculation below).+Roughly **15%** of scripts use //strict_types=1// (calculation below).
  
 Roughly **33%** of developers use static analysis (realistically it's less than this, details below). Roughly **33%** of developers use static analysis (realistically it's less than this, details below).
  
-There was a [[https://externals.io/message/112327|short discussion]] about the original RFC, but with the exception of Craig Duncan, there was no consideration for the problems this creates with existing code (or the inconsistency of NULL coercion compared to string/int/float/bool coercion).+There was a [[https://externals.io/message/112327|short discussion]] about the original RFC, but with the exception of Craig Duncan, there was no consideration for the problems this creates with existing code (or the inconsistency of NULL coercion compared to string/int/float/bool coercion; or other contexts like string concatenation, == comparisons, arithmetics, sprintf, etc).
  
-The general direction of [[https://github.com/Girgias/unify-typing-modes-rfc|Unify PHP's typing modes]] by Girgias is correctbecause automatic coercions like //substr($string, "offset")// and //htmlspecialchars(array())// are clearly problematic; but the following is common, and has been fine:+The intention is to also keep [[https://github.com/Girgias/unify-typing-modes-rfc|Unify PHP's typing modes]] by George Peter Banyard in mindwith coercions like //substr($string, "offset")// and //htmlspecialchars(array())// as being clearly problematic; whereas the following is common, and has been fine:
  
 <code php> <code php>
-$search = filter_input(INPUT_GET, 'q'); // Or similar examples below.+$search = filter_input(INPUT_GET, 'q'); // Or similar (examples below)
  
 echo 'Results for ' . htmlspecialchars($search); echo 'Results for ' . htmlspecialchars($search);
Line 45: Line 45:
   - [[https://www.php.net/manual/en/language.types.boolean.php|To Boolean]]: "When converting to bool, the following values are considered false [...] the special type NULL"   - [[https://www.php.net/manual/en/language.types.boolean.php|To Boolean]]: "When converting to bool, the following values are considered false [...] the special type NULL"
  
-For example:+==== Current State ====
  
 <code php> <code php>
 +// echo(string ...$expressions): void
 +// print(string $expression): int
 print('A'); print('A');
 print(1); print(1);
Line 53: Line 55:
 print(false); print(false);
 print(NULL); // Fine, coerced to empty string. print(NULL); // Fine, coerced to empty string.
 +
 +var_dump(3 + '5' + NULL); // Fine, int(8)
 +var_dump(NULL / 6); // Fine, int(0)
  
 $o = []; $o = [];
Line 78: Line 83:
 </code> </code>
  
-With user-defined functions, while this does not cause a backwards compatibility issue (details below), it still highlights the coercion inconsistency, and that some type checking is happening in an environment that does not expect type checking:+With user-defined functions, while there hasn't been a backwards compatibility issue, it still highlights the coercion inconsistency, in an environment that does not expect type checking, despite NULL being a value that "can be coerced to the type requested by the hint without data loss and without creation of likely unintended data":
  
 <code php> <code php>
Line 101: Line 106:
 </code> </code>
  
-Arrays, Resources, and Objects (without //__toString()//) cannot be coerced (for fairly obvious reasons).+Arrays, Resources, and Objects (without toString) cannot be coerced (for fairly obvious reasons).
  
 String/Int/Float/Bool can be coerced. String/Int/Float/Bool can be coerced.
  
-NULL can usually be coerced, but...+NULL can usually be coerced (e.g. string concatenation, == comparisons, arithmetics, sprintf, print, echo, array keys), but...
  
-  - PHP 7.0 introduced the ability for user-defined functions to specify parameter types via the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations RFC]], where the focus was on //strict_types=1//. But the implementation also caused Type Errors when not using //strict_types=1//, which seems more of an over-sight, with coercion working for string/int/float/bool but not null (despite the documentation), and introducing a type check (for an environment that does not expect this). +  - PHP 7.0 introduced the ability for user-defined functions to specify parameter types via the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations RFC]], where the implementation triggered Type Errors for those using //strict_types=1//, and otherwise used coercion for string/int/float/boolbut not NULL. 
-  - PHP 8.continued this inconsistency with internal functions.+  - PHP 8.1 updated internal function parameters to work in the same way. 
 + 
 +==== Scalar Types ==== 
 + 
 +[[https://news-web.php.net/php.internals/117523|George Peter Banyard]] notes that "Userland scalar types [...] did not include coercion from NULL for //very// good reasons". The only reason mentioned in [[https://wiki.php.net/rfc/scalar_type_hints_v5|Scalar Type Declarations]] is "to be consistent with our existing type declarations" (no further details given). 
 + 
 +The RFC also says "it should be possible for existing userland libraries to add scalar type declarations without breaking compatibility", but this is not the case, because of NULL. This has made adoption of type declarations harder, as it does not work like the following: 
 + 
 +<code php> 
 +function my_function($s, $i, $f, $b{ 
 +  $s = strval($s); 
 +  $i = intval($i); 
 +  $f = floatval($f); 
 +  $b = boolval($b); 
 +  var_dump($s, $i, $f, $b); 
 +
 + 
 +function my_function(string $s, int $i, float $f, bool $b) { 
 +  var_dump($s, $i, $f, $b); 
 +
 + 
 +my_function(NULL, NULL, NULL, NULL); 
 +</code> 
 + 
 +Some developers view NULL as a missing/invalid value, and passing NULL to a function like //htmlspecialchars()// could indicate a problem (can a be useful check for static analysis, or in the context of //strict_types=1//).
  
 ==== Examples ==== ==== Examples ====
Line 126: Line 155:
 $search = $request->getGet('q'); // CodeIgniter $search = $request->getGet('q'); // CodeIgniter
  
-$value = array_pop($empty_array); 
 $value = mysqli_fetch_row($result); $value = mysqli_fetch_row($result);
 $value = json_decode($json); // Invalid JSON, or nesting limit. $value = json_decode($json); // Invalid JSON, or nesting limit.
 +$value = array_pop($empty_array);
 </code> </code>
  
-Examples, often working with user input, where NULL has previously been fine:+Examples functions, often working with user input, where NULL has been fine:
  
 <code php> <code php>
 +$rounded_value = round($value);
 +
 $search_trimmed = trim($search); $search_trimmed = trim($search);
  
Line 164: Line 195:
 mail('nobody@example.com', 'subject', 'message', NULL, '-fwebmaster@example.com'); mail('nobody@example.com', 'subject', 'message', NULL, '-fwebmaster@example.com');
 </code> </code>
 +
 +HTML Templating engines like [[https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119|Laravel Blade]] suppress this deprecation with null-coalescing ([[https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118|patch]]); or [[https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195|Symphony Twig]] which preserves NULL, but it's often passed to //echo// (which accepts it, despite the [[https://www.php.net/echo|echo documentation]] saying it accepts non-nullable strings).
 +
 +I'd argue a very strict level of type checking (that prevents all forms of coercion) is best done by Static Analysis, which can check if a variable can be nullable, and it can decide if this is a problem, in the same way that a string (e.g. '15') being provided to integer parameter could be seen as a problem.
  
 There are approximately [[https://github.com/craigfrancis/php-allow-null-rfc/blob/main/functions-change.md|335 parameters affected by this deprecation]]. There are approximately [[https://github.com/craigfrancis/php-allow-null-rfc/blob/main/functions-change.md|335 parameters affected by this deprecation]].
Line 175: Line 210:
 It is possible to use very strict Static Analysis, to follow every variable from source to sink (to check if a variable could be NULL), but most developers are not in a position to do this (i.e. not using static analysis, or not at a high enough level, or they are using a baseline to ignore). It is possible to use very strict Static Analysis, to follow every variable from source to sink (to check if a variable could be NULL), but most developers are not in a position to do this (i.e. not using static analysis, or not at a high enough level, or they are using a baseline to ignore).
  
-In the last JetBrains developer survey, where 67% regularly used Laravel, **only 33% used Static Analysis** ([[https://www.jetbrains.com/lp/devecosystem-2021/php/#PHP_do-you-use-static-analysis|source]]); where it's fair to say many still would not identify these possible NULL values.+In the last JetBrains developer survey (with 67% regularly using Laravel), **only 33% used Static Analysis** ([[https://www.jetbrains.com/lp/devecosystem-2021/php/#PHP_do-you-use-static-analysis|source]]); where it's fair to say many of these developers would //still// not identify these possible NULL values (too low level, and/or using a baseline).
  
 As an example, take this simple script: As an example, take this simple script:
Line 187: Line 222:
 </code> </code>
  
-It'considered fine today by the relevant tools: +This is considered fine by these tools:
- +
-<code cli> +
-composer require --dev vimeo/psalm +
-./vendor/bin/psalm --init ./src/ 4 +
-./vendor/bin/psalm +
-No errors found! +
-</code> +
-Note: Psalm can detect this problem at [[https://psalm.dev/docs/running_psalm/error_levels/|levels 1, 2, and 3]] (don't use a baseline). +
- +
-<code cli> +
-composer require --dev phpstan/phpstan +
-./vendor/bin/phpstan analyse -l 9 ./src/ +
-[OK] No errors +
-</code> +
- +
-<code cli> +
-composer require --dev phpstan/phpstan-strict-rules +
-composer require --dev phpstan/extension-installer +
-./vendor/bin/phpstan analyse -l 9 ./src/ +
-[OK] No errors +
-</code> +
-Note: There are [[https://phpstan.org/config-reference#stricter-analysis|Stricter Analysis]] options for PHPStan, but they don't seem to help with this problem. +
- +
-<code cli> +
-composer require --dev rector/rector +
-./vendor/bin/rector init +
-./vendor/bin/rector process ./src/ +
-[OK] Rector is done! +
-</code>+
  
 <code cli> <code cli>
Line 259: Line 265:
  
 Note: Juliette (@jrfnl) has confirmed that getting PHPCompatibility to solve this problem will be "pretty darn hard to do" because it's "not reliably sniffable" ([[https://twitter.com/jrf_nl/status/1497937320766496772|source]]). Note: Juliette (@jrfnl) has confirmed that getting PHPCompatibility to solve this problem will be "pretty darn hard to do" because it's "not reliably sniffable" ([[https://twitter.com/jrf_nl/status/1497937320766496772|source]]).
 +
 +<code cli>
 +composer require --dev phpstan/phpstan
 +./vendor/bin/phpstan analyse -l 9 ./src/
 +[OK] No errors
 +</code>
 +
 +<code cli>
 +composer require --dev phpstan/phpstan-strict-rules
 +composer require --dev phpstan/extension-installer
 +./vendor/bin/phpstan analyse -l 9 ./src/
 +[OK] No errors
 +</code>
 +Note: There are [[https://phpstan.org/config-reference#stricter-analysis|Stricter Analysis]] options for PHPStan, but they don't seem to help with this problem.
 +
 +<code cli>
 +composer require --dev vimeo/psalm
 +./vendor/bin/psalm --init ./src/ 4
 +./vendor/bin/psalm
 +No errors found!
 +</code>
 +Note: Psalm can detect this at [[https://psalm.dev/docs/running_psalm/error_levels/|levels 1, 2, and 3]] (don't use a baseline).
 +
 +==== One Solution ====
 +
 +Since [[https://github.com/rectorphp/rector-src/pull/2543|21st June 2022]], Rector can modify 362 function arguments via //NullToStrictStringFuncCallArgRector//:
 +
 +<code cli>
 +mkdir -p rector/src;
 +
 +cd rector/;
 +
 +composer require --dev rector/rector;
 +
 +echo '<?= htmlspecialchars($var) ?>' > src/index.php;
 +
 +echo '<?php
 +
 +use Rector\Php81\Rector\FuncCall\NullToStrictStringFuncCallArgRector;
 +use Rector\Config\RectorConfig;
 +
 +return static function (RectorConfig $rectorConfig): void {
 +    $rectorConfig->paths([__DIR__ . "/src"]);
 +    $rectorConfig->rule(NullToStrictStringFuncCallArgRector::class);
 +};
 +' > rector.php;
 +
 +./vendor/bin/rector process;
 +</code>
 +
 +This will litter the code with the use of //(string)// type casting, e.g.
 +
 +<code diff>
 +-<?= htmlspecialchars($var) ?>
 ++<?= htmlspecialchars((string) $var) ?>
 +</code>
 +
 +For a typical project (which won't be using strict_types), expect thousands of changes to be made; and note how this does not improve code quality.
  
 ==== Temporary Solutions ==== ==== Temporary Solutions ====
Line 294: Line 358:
 </code> </code>
  
-As noted above - PHPCompatibility, CodeSniffer, Rector, etc are unable to find or update these cases.+As noted above - Rector can add //(string)// type casting automaticallybut I have no idea how this improves code quality.
  
 ===== Proposal ===== ===== Proposal =====
  
-No change for those using //strict_types=1//.+Revert the NULL deprecation for parameters (when **not** using //strict_types=1//), so it continues to work (as NULL coercion does in other contexts), to avoid the upgrade problems (i.e. Fatal Errors in PHP 9.0).
  
-Must keep the spirit of the original RFC, and keep user-defined and internal functions consistent.+And, in the spirit of the original RFC to keep user-defined and internal functions consistent, also change user-defined functions so NULL is coerced for non-nullable parameters (when **not** using //strict_types=1//).
  
-Revert the deprecation of NULL coercion when **not** using //strict_types=1//, to work as documented, be consistent with scalar types, and (most importantly) avoid the upgrade problems (i.e. Fatal Errors for PHP 9.0). +This means the type "//?int//" will allow NULL or an integer to be provided to the function; whereas the non-nullable type "//int//" would coerce NULL to 0, in the same way the string "0" would be.
- +
-For consistency, change user-defined functions to allow NULL to be coerced when **not** using //strict_types=1//.+
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
  
-None+While the intention of this RFC is to avoid a BC break; for user defined functions to be updated to also coerce NULL (instead of throwing a Type Error), it's possible some code may rely on that behaviour, for example: 
 + 
 +<code php> 
 +function my_function(string $my_string) { 
 +  var_dump($my_string); 
 +
 + 
 +try { 
 +  my_function('A');   // string(1) "A" 
 +  my_function(1);     // string(1) "1" 
 +  my_function(1.2);   // string(3) "1.2" 
 +  my_function(true);  // string(1) "1" 
 +  my_function(false); // string(0) "" 
 +  my_function(NULL);  // Throw Type Error 
 +} catch (TypeError $e) { 
 +  // Do something important? 
 +
 +</code>
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 329: Line 408:
  
 ===== Open Issues ===== ===== Open Issues =====
- 
-"terrible idea" - I'm still waiting to hear details. 
  
 "it's a bit late" - We only have a deprecation at the moment (which can and is being ignored), it will be "too late" when PHP 9.0 uses Fatal Errors. "it's a bit late" - We only have a deprecation at the moment (which can and is being ignored), it will be "too late" when PHP 9.0 uses Fatal Errors.
  
-"Userland scalar types [...] did not include coercion from NULL for //very// good reasons."The only reason mentioned in [[https://wiki.php.net/rfc/scalar_type_hints_v5|Scalar Type Declarations]] is "to be consistent with our existing type declarations" (no further details given)Talking to developersthe only reason mentioned is noted above, where NULL can be viewed as a missing/invalid value, and passing NULL to a function like //htmlspecialchars()// could indicate a problem (while that might be useful in the context of //strict_types=1//, it hasn't been the case for everyone else).+The function //mt_rand()// can be called with no arguments, or with min and max integer argumentsA developer may call //mt_rand(NULL, NULL)// and expect it to work the same as no arguments (returning random number between 0 and //mt_getrandmax()//), but the NULL's would be coerced to 0, so it would always return 0. That said, I cannot find any public examples of this happening ([[https://grep.app/search?q=mt_rand%28NULL&filter%5Blang%5D%5B0%5D=PHP|1]], [[https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL&regexp=true&filter[lang][0]=PHP|2]][[https://www.google.com/search?q=%22mt_rand+NULL+NULL%22|3]]).
  
 ===== Future Scope ===== ===== Future Scope =====
Line 344: Line 421:
   - //$method// in [[https://php.net/method_exists|method_exists()]]   - //$method// in [[https://php.net/method_exists|method_exists()]]
   - //$json// in [[https://php.net/json_decode|json_decode()]]   - //$json// in [[https://php.net/json_decode|json_decode()]]
 +
 +It might be appropriate for coercion and explicit casting/converting to work in the same way, even if they were to become stricter in the values they accept; e.g. //intval("")// and //((int) "")// currently return int(0), whereas //(5 + "")// results in a TypeError.
  
 ===== Voting ===== ===== Voting =====
Line 361: Line 440:
 ===== Notes ===== ===== Notes =====
  
-The **85%** of scripts that do not use //strict_types=1// was calculated using [[https://grep.app/|grep.app]], to "search across a half million git repos", were each result is a script (not a count of matches,  [[https://grep.app/search?q=defuse/php-encryption&filter[lang][0]=PHP|example]]). We can see [[https://grep.app/search?q=strict_types&filter[lang][0]=PHP|269,701]] scripts using //strict_types=1//, out of [[https://grep.app/search?q=php&filter[lang][0]=PHP|1,830,411]]. And keep in mind that [[https://grep.app/search?q=class%20wpdb%20%7B&filter[lang][0]=PHP|WordPress only really appears once]], it is [[https://make.wordpress.org/core/2022/01/10/wordpress-5-9-and-php-8-0-8-1/#php-8-1-deprecation-passing-null-to-non-nullable-php-native-functions-parameters|affected by this deprecation]], and is installed/used by many.+The **15%** of scripts that use //strict_types=1// was calculated using [[https://grep.app/|grep.app]], to "search across a half million git repos", were each result is a script (not a count of matches,  [[https://grep.app/search?q=defuse/php-encryption&filter[lang][0]=PHP|example]]). We can see [[https://grep.app/search?q=strict_types&filter[lang][0]=PHP|272,871]] scripts using //strict_types=1//, out of [[https://grep.app/search?q=php&filter[lang][0]=PHP|1,842,666]]. And keep in mind that [[https://grep.app/search?q=class%20wpdb%20%7B&filter[lang][0]=PHP|WordPress only really appears once]], it is [[https://make.wordpress.org/core/2022/01/10/wordpress-5-9-and-php-8-0-8-1/#php-8-1-deprecation-passing-null-to-non-nullable-php-native-functions-parameters|affected by this deprecation]], and is installed/used by many.
  
 In the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations]] RFC for PHP 7.0, scalar types were defined as "int, float, string and bool" - but, despite NULL also being a simple value (i.e. not an array/object/resource), it was not included in this definition. For backwards compatibility reasons this definition is unlikely to change. In the [[https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks|Scalar Type Declarations]] RFC for PHP 7.0, scalar types were defined as "int, float, string and bool" - but, despite NULL also being a simple value (i.e. not an array/object/resource), it was not included in this definition. For backwards compatibility reasons this definition is unlikely to change.
rfc/null_coercion_consistency.txt · Last modified: 2023/10/18 11:57 by craigfrancis