rfc:coercive_sth

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
rfc:coercive_sth [2015/03/10 12:01]
zeev
rfc:coercive_sth [2020/08/01 23:52]
carusogabriel Status is "Declined"
Line 1: Line 1:
 ====== PHP RFC: Coercive Types for Function Arguments ====== ====== PHP RFC: Coercive Types for Function Arguments ======
-  * Version: 0.50+  * Version: 0.60
   * Date: 2015-02-27   * Date: 2015-02-27
   * Authors: Zeev Suraski <​zeev@php.net>,​ Francois Laupretre <​francois@php.net>,​ Dmitry Stogov <​dmitry@php.net>​   * Authors: Zeev Suraski <​zeev@php.net>,​ Francois Laupretre <​francois@php.net>,​ Dmitry Stogov <​dmitry@php.net>​
-  * Status: ​Under discussion+  * Status: ​Declined
   * First Published at: http://​wiki.php.net/​rfc/​coercive_sth   * First Published at: http://​wiki.php.net/​rfc/​coercive_sth
- 
-===== Preamble ===== 
- 
-This version is published as '​near-final',​ and with the exception of potential minor modifications,​ is what will be proposed for a vote. 
  
 ===== Background & Summary ===== ===== Background & Summary =====
Line 85: Line 81:
   - Unlike user-land scalar type hints, internal functions will accept nulls as valid scalars. ​ Based on preliminary testing, this is an extremely common use case, most often used in conjunction with uninitialized values. Disallowing it - language-wide for all internal functions - may be too big of a shift. ​ Therefore, internal functions receiving a NULL (non-)value for a scalar will accept it, and convert it to 0 or an empty string in the same way PHP 5 does.  Note that this discrepancy may be resolved in the future - by introducing new options for arguments that would explicitly reject NULL values, in the same manner user-land STH do.  However, since this requires substantial auditing of internal functions - especially ones that have default values but don't explicitly declare themselves as accepting NULLs - it's outside the scope of this RFC and will be revisited for 7.1.  Note that if the [[Nullable Types RFC|https://​wiki.php.net/​rfc/​nullable_types]] is accepted it will further reduce this discrepancy,​ by allowing user-land functions and internal functions the same level of granularity in terms of accepting or rejecting NULL values for function arguments.   - Unlike user-land scalar type hints, internal functions will accept nulls as valid scalars. ​ Based on preliminary testing, this is an extremely common use case, most often used in conjunction with uninitialized values. Disallowing it - language-wide for all internal functions - may be too big of a shift. ​ Therefore, internal functions receiving a NULL (non-)value for a scalar will accept it, and convert it to 0 or an empty string in the same way PHP 5 does.  Note that this discrepancy may be resolved in the future - by introducing new options for arguments that would explicitly reject NULL values, in the same manner user-land STH do.  However, since this requires substantial auditing of internal functions - especially ones that have default values but don't explicitly declare themselves as accepting NULLs - it's outside the scope of this RFC and will be revisited for 7.1.  Note that if the [[Nullable Types RFC|https://​wiki.php.net/​rfc/​nullable_types]] is accepted it will further reduce this discrepancy,​ by allowing user-land functions and internal functions the same level of granularity in terms of accepting or rejecting NULL values for function arguments.
   - As we don't clearly define a date for switching E_DEPRECATE to fatal errors, the RFC states that such decision cannot, in any cases, be made before a delay of 5 years after the first public release of a PHP distribution containing the STH features described here. This statement is voted upon as the rest of the RFC. So, it cannot be violated without a new vote on this specific subject. This statement is provided as a guarantee to the developers that they will have ample time to fix their code.   - As we don't clearly define a date for switching E_DEPRECATE to fatal errors, the RFC states that such decision cannot, in any cases, be made before a delay of 5 years after the first public release of a PHP distribution containing the STH features described here. This statement is voted upon as the rest of the RFC. So, it cannot be violated without a new vote on this specific subject. This statement is provided as a guarantee to the developers that they will have ample time to fix their code.
 +
 +=== Impact on Real World Applications ===
 +
 +The patch has been tested with numerous real world apps and frameworks, to attempt to gauge the impact the changes to the internal functions rules would have:
 +
 +  * Drupal 7 homepage: ​ One new E_DEPRECATED warning, which seems to catch faulty-looking code
 +  * Drupal 7 admin interface (across the all pages): ​ One  new E_DEPRECATED warning, which again seems to catch a real bug – stripslsahes() operating on a boolean.
 +  * Magento 1.9 homepage (w/ Magento'​s huge sample DB):  One new E_DEPRECATED warning, again, seems to be catching a real bug of ‘false’ being fed as argument 1 of in json_decode() – which is expecting a string full of json there.
 +  * WordPress 3.4 homepage: ​ One new E_DEPRECATED warning, again, seems to be catching a real bug of ‘false’ being fed as argument 1 of substr().
 +  * Zend Framework 2 skeleton app:  Zero new E_DEPRECATED warnings.
 +  * Symfony ACME app:  Zero new E_DEPRECATED warnings (across the app).
 +  * PHPUnit: ​ Several E_DEPRECATED issues that were fixed in a matter of hours.
 +
 +The negative impact on real world apps appears to be very, very limited - which is consistent with the premise that the Coercive STH RFC aims to allow the conversions which are common and most likely sensible, and block the ones which are likely faulty - which means we shouldn'​t see too many of those in real world apps.
 +
 +In addition, the patch was tested with numerous unit-test suites; ​ PHP's test suite shows a lot of new errors, however, the majority of them stem from tests purposely designed to check '​insensible'​ conversions (e.g. readgzfile($filename,​ -10.5)), and not code blocks that we're ever likely to bump into in the real world.
 +
 +The Symfony and Zend Framework test suites were also run and showed new deprecation errors; ​ Based on very preliminary analysis, it seems that most of them either fall into the same bucket as the PHP unit tests above (purposely designed to test insensible conversions),​ or seem to point out issues that may translate into real world bugs, and that can be fixed in a relatively small number of changes.
 +
 +All in all, the signal to noise ratio of turning the new coercive rules for the entirety of PHP seems to be very good.
  
 === Difference from PHP 5 === === Difference from PHP 5 ===
Line 113: Line 129:
   "​8.2"​ -> int           # "​8.2"​ cannot be converted to an integer without data loss   "​8.2"​ -> int           # "​8.2"​ cannot be converted to an integer without data loss
   4.3 -> bool            # No more conversion from float to bool   4.3 -> bool            # No more conversion from float to bool
-  "​foo"​ -> bool          # No more conversion from string to bool 
   "7 dogs" -> int        # Non-blank trailing characters no longer supported   "7 dogs" -> int        # Non-blank trailing characters no longer supported
   "3.14 pizzas"​ -> float # Non-blank trailing characters no longer supported ​   "3.14 pizzas"​ -> float # Non-blank trailing characters no longer supported ​
Line 133: Line 148:
 Numerous community members have invested substantial effort into creating another comprehensive RFC, that proposes to introduce STH into PHP [[https://​wiki.php.net/​rfc/​scalar_type_hints_v5|Scalar Type Hints RFC v0.5 ("Dual Mode RFC"​)]]. ​ However, we believe the proposal in this RFC is better, for several different reasons: Numerous community members have invested substantial effort into creating another comprehensive RFC, that proposes to introduce STH into PHP [[https://​wiki.php.net/​rfc/​scalar_type_hints_v5|Scalar Type Hints RFC v0.5 ("Dual Mode RFC"​)]]. ​ However, we believe the proposal in this RFC is better, for several different reasons:
  
-  - **Single Mode.** ​ Even though the Dual Mode RFC presents a novel idea about how to allow developers to choose which mode they'd like to use, and use different modes in different parts of the app, it still introduces the burden of two different modes. ​ Two different rule-sets that need to be learned, may increase the language'​s complexity. ​ Further, the two sets can cause the same functions to behave differently depending on where they'​re being called, and potentially a new class of bugs stemming from developers not realizing which mode they'​re in in a particular file.  This RFC is unaffected by these issues, as it presents a single, composite rule set.+  - **Single Mode.**  Thanks to the fact this RFC proposes a single mode and enables it across the board for any code that's run, you get the benefit of stricter-yet-sensible rules overnight. ​ Thanks to the good signal-to-noise ratio, moving to PHP 7 is likely to help people find real world bugs, without having to proactively enable stricter modes and without having to introduce a lot of changes to their code. 
 +  - **Smaller cognitive burden.**  Even though the Dual Mode RFC presents a novel idea about how to allow developers to choose which mode they'd like to use, and use different modes in different parts of the app, it still introduces the burden of two different modes. ​ Two different rule-sets that need to be learned increase the language'​s complexity. ​ Further, the two sets can cause the same functions to behave differently depending on where they'​re being called, and potentially a new class of bugs stemming from developers not realizing which mode they'​re in in a particular file.  This RFC is unaffected by these issues, as it presents a single, composite rule set.
   - **Too strict may lead to too lax.** In the Dual Mode RFC, when in Strict mode, in many cases, functions would reject values that, semantically,​ are acceptable. ​ For example, a "​32"​ (string) value coming back from an integer column in a database table, would not be accepted as valid input for a function expecting an integer. ​ Since semantically the developer is interested in this argument-passing succeeding, they would have the choice of either removing the integer STH altogether, or, more likely, explicitly casting the value into an integer. ​ This would have the opposite of the desired outcome of strict STHs - as explicit casts ($foo = (int) $foo;) always succeed, and would happily convert "100 dogs", "​Apples"​ and even arrays and booleans into an integer. ​ Further, since already today, internal functions employ coercion rules that are more restrictive than PHP's explicit casting, pushing people towards explicit casting will actually make things **worse** in case developers opt for explicit casting as they pass values in an internal function call.    - **Too strict may lead to too lax.** In the Dual Mode RFC, when in Strict mode, in many cases, functions would reject values that, semantically,​ are acceptable. ​ For example, a "​32"​ (string) value coming back from an integer column in a database table, would not be accepted as valid input for a function expecting an integer. ​ Since semantically the developer is interested in this argument-passing succeeding, they would have the choice of either removing the integer STH altogether, or, more likely, explicitly casting the value into an integer. ​ This would have the opposite of the desired outcome of strict STHs - as explicit casts ($foo = (int) $foo;) always succeed, and would happily convert "100 dogs", "​Apples"​ and even arrays and booleans into an integer. ​ Further, since already today, internal functions employ coercion rules that are more restrictive than PHP's explicit casting, pushing people towards explicit casting will actually make things **worse** in case developers opt for explicit casting as they pass values in an internal function call. 
   - **Smooth integration with Data Sources**. ​ PHP uses strings extensively across the language, and in most cases, data sources always feed data into PHP as strings. ​ PHP applications rely extensively on internal type juggling to convert that string-based data according to the needed context. ​ Strict zval.type based STH effectively eliminates this behavior, moving the burden of worrying about type conversion to the user.  The solution proposed in this RFC allows code that relies on type coercion to Just Work when the values are sensible, but fail (and appropriately warn the developer) otherwise.   - **Smooth integration with Data Sources**. ​ PHP uses strings extensively across the language, and in most cases, data sources always feed data into PHP as strings. ​ PHP applications rely extensively on internal type juggling to convert that string-based data according to the needed context. ​ Strict zval.type based STH effectively eliminates this behavior, moving the burden of worrying about type conversion to the user.  The solution proposed in this RFC allows code that relies on type coercion to Just Work when the values are sensible, but fail (and appropriately warn the developer) otherwise.
 +  - **Forward compatibility for internal function calls**. ​ Codebases which will be tested & improved to work on PHP 7, can run without a problem on PHP 5 and benefit from the improved code quality.
  
  
Line 160: Line 176:
   * **Static Analysis** - Analyzing code without running it, attempting to derive conclusions about security, performance,​ etc.   * **Static Analysis** - Analyzing code without running it, attempting to derive conclusions about security, performance,​ etc.
  
-===== Proposed Voting Choices ​===== +===== Vote ===== 
-The voting choices ​would be yes/no.+The voting choices ​are yes (in favor for accepting this RFC for PHP 7) or no (against it).
 The RFC proposes a very substantial change to PHP's coercion rules, which may evolve to affect implicit typing in the future. The RFC proposes a very substantial change to PHP's coercion rules, which may evolve to affect implicit typing in the future.
-It absolutely requires a 2/3 majority, with the hope of reaching as close as possible to consensus.+It absolutely requires a 2/3 majority, with the hope of reaching as close as possible to consensus. ​ The vote starts on March 11th, and will end two weeks later, on March 25th. 
 + 
 +<doodle title="​coercive_sth"​ auth="​zeev"​ voteType="​single"​ closed="​true">​ 
 +   * Yes 
 +   * No 
 +</​doodle>​
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
 https://​github.com/​php/​php-src/​pull/​1125/​files https://​github.com/​php/​php-src/​pull/​1125/​files
rfc/coercive_sth.txt · Last modified: 2020/08/01 23:52 by carusogabriel