Scalar Type Hints have been a top requested feature for PHP for a very long time. There have been numerous attempts to introduce them, none of which has made it into the language so far.
While there seems to be consensus regarding the viability and usefulness of adding Scalar Type Hints (STH), there has been a long-standing debate about the correct way to implement them. The two key schools of thought around STH that emerged over the years are Strict STH and Dynamic STH.
It's important to note that in terms of the code *inside* the callee, there's absolutely no difference between the two schools of thought. In both cases, the callee can rely with absolute confidence that if it type hinted a certain argument as an int, this argument will always be an int when function execution begins. The difference is localized to the behavior surrounding the invocation of the callee by the caller, with Strict STH rejecting a lot more potential inputs than Dynamic STH.
Proponents of Strict STH cite numerous advantages, primarily around code safety/security. In their view, the conversion rules proposed by Dynamic STH can easily allow garbage input to be silently converted into arguments that the callee will accept but that may, in many cases, hide difficult-to-find bugs or otherwise result in unexpected behavior.
Proponents of Dynamic STH bring up consistency with the rest of the language, including some fundamental type-juggling aspects that have been key tenets of PHP since its inception. Strict STH, in their view, is inconsistent with these tenets.
This RFC proposes a composite solution, which attempts to address the main goals of both camps, dubbed Coercive STH. Coercive STH is less restrictive than simple zval.type checks, but a lot more restrictive than the conversion rules presently employed by internal functions. It attempts to strike a balance between rejecting erroneous input, and allowing valid-but-wrongly-typed input, and outlines a gradual roadmap for transitioning internal functions to this new rule-set.
Finally, the RFC outlines a potential future evolution: applying the new rule-set to additional parts of PHP, most notably implicit type conversions (outside the scope of this specific RFC).
A new set of coercion rules will apply to both user-land type hints and internal type hints. The guiding principles behind these new rules are:
Here are the rules we get when applying these changes to the rules currently used in PHP 5:
Hint \ Value type | boolean | int | float | string |
---|---|---|---|---|
bool | Accept | Accept* | Reject | Accept* |
int | Reject | Accept | Only if no DL† | Numeric integer string only‡ |
float | Reject | Only if no DL† | Accept | Numeric string only‡ |
string | Reject | Accept | Accept | Accept |
* Coercion from int or string into bool is done using the same rules that apply in the rest of PHP: 0, “0” and “” coerce to false; any other value coerces to true.
† "DL" stands for data loss. Float to int coercion will be accepted only if there are no significant digits after the decimal point. E.g. 7.0 will be coerced to 7, but 7.3 will be rejected. Int to float coercion will be accepted only if the integer value can be represented without loss of accuracy using a floating point number. Extremely large integers (with absolute value larger than 2^52) will be rejected.
‡ Numeric strings may be converted to int or float types only if there is no loss of data or accuracy. Leading zeroes, as well as leading and trailing whitespace, are accepted. Other non-numeric trailing data will be rejected. For int hints, numeric strings with significant digits after the decimal point will be rejected. For float hints, integer values that cannot be represented without loss of accuracy (exceeding 2^52 in absolute value) will be rejected as well.
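To make the `int` row and its footnotes concrete, here is a minimal user-land sketch of those rules. The function name `coercive_int()` is hypothetical and is not part of the RFC or of PHP. The sketch makes a few assumptions of its own: exponent notation such as "1e2" is not handled, floats outside the integer range are rejected, and booleans are rejected in line with the "false -> int" example given later in this RFC.

```php
<?php
// Hypothetical helper, not part of the RFC or of PHP itself.
// Emulates what an `int` hint would do with a scalar value under the
// coercive rules: return the coerced int, or throw where the RFC rejects.
function coercive_int($value)
{
    if (is_int($value)) {
        return $value; // already the hinted type
    }
    if (is_float($value)) {
        // Accept only floats with no fractional part, e.g. 7.0 but not 7.3.
        // The range check is an added assumption: it keeps INF and
        // out-of-range floats away from the (int) cast.
        if (floor($value) === $value && $value >= PHP_INT_MIN && $value < PHP_INT_MAX) {
            return (int) $value;
        }
    } elseif (is_string($value)) {
        // Optional sign, digits, optional all-zero fraction. Leading and
        // trailing whitespace is tolerated; any other trailing data is not.
        if (preg_match('/^\s*([+-]?\d+)(?:\.0*)?\s*$/', $value, $m)) {
            $n = $m[1] + 0; // stays an int unless the value overflows
            if (is_int($n)) {
                return $n;
            }
        }
    }
    // Everything else is rejected. Booleans land here too, following the
    // "false -> int" example given later in this RFC.
    throw new InvalidArgumentException('rejected for int hint: ' . var_export($value, true));
}
```

With this sketch, `coercive_int(7.0)` and `coercive_int(' 007 ')` both return 7, while 7.3, "8.2" and "7 dogs" raise an exception.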
Generally speaking, coercion from non-scalar values into scalar type hints is not supported and will be rejected, with a few exceptions.
Arrays and resources will always be rejected as valid inputs for scalar type hinted arguments. Objects will always be rejected as valid inputs for scalar type hinted arguments, with one exception: an object with a __toString() method will be accepted for a string type hint. Nulls will be rejected as valid inputs for scalar type hinted arguments when using user-land type hints, but are presently accepted for internal functions. See the 'Changes to Internal Functions' section for more information.
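The non-scalar rules can be summarized in a few lines of user-land code. This is only a sketch: `non_scalar_accepted()` and the `Label` class are hypothetical names introduced for illustration, and the helper assumes it is only ever called with a non-scalar value and a user-land hint.

```php
<?php
// Hypothetical helper, not part of the RFC or of PHP itself.
// Decides whether a non-scalar $value would be accepted for a user-land
// scalar hint ('int', 'float', 'string' or 'bool') under the rules above.
function non_scalar_accepted($value, $hint)
{
    if (is_object($value)) {
        // The single exception: objects with __toString() for a string hint.
        return $hint === 'string' && method_exists($value, '__toString');
    }
    // Arrays, resources and (for user-land hints) null are always rejected.
    return false;
}

// Hypothetical class used only to exercise the helper.
class Label
{
    public function __toString()
    {
        return 'label';
    }
}
```

For example, `non_scalar_accepted(new Label(), 'string')` is true, while the same object with an `'int'` hint, an array, or null all yield false.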
This RFC proposes to introduce four new type hints into PHP: int, float, string and bool. These new hints will adhere to the new coercion rules detailed above. Values that cannot be accepted per the coercion rules above will result in E_RECOVERABLE_ERROR being triggered. Note that if the Exceptions in the Engine RFC is accepted, this will throw an exception instead, making recovery simpler and more straightforward.
These type hints can be used for function arguments, as well as for return values, as described in the Return Type Declarations RFC. In both cases, they are handled exactly the same way.
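As an illustration of the syntax, the following sketch declares scalar hints on both arguments and the return value. The function `repeat_label()` is hypothetical. The calls shown were chosen because they behave the same under the rules proposed here and in released PHP 7 and later (without strict mode); only the error reporting for the rejected call differs, as noted in the comments.

```php
<?php
// Hypothetical function, for illustration only.
function repeat_label(string $label, int $times): string
{
    return str_repeat($label, $times);
}

$a = repeat_label('ab', 3);   // 'ababab': both arguments already match
$b = repeat_label('ab', '3'); // 'ababab': "3" is a numeric integer string

try {
    repeat_label('ab', []);   // arrays are never valid for a scalar hint
    $rejected = false;
} catch (TypeError $e) {
    // Released PHP 7+ reports this as a TypeError. This RFC describes
    // E_RECOVERABLE_ERROR, or an exception if the Exceptions in the
    // Engine RFC is accepted.
    $rejected = true;
}
```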
No type declaration for resources is added, as this would prevent existing extensions from moving from resources to objects, which some have already done (e.g. GMP).
This RFC proposes to bring the rule-set described in the last section to internal functions as well, through updates to the zend_parse_parameters() function.
However, unlike the introduction of STH, which is a new, previously unused feature that will (for the most part) not affect existing code, changes to what internal functions are willing to accept could have substantial compatibility implications.
To mitigate the risk of compatibility breakage being introduced between PHP 5.6 and 7.0, two mitigation steps are proposed:
The patch has been tested with numerous real world apps and frameworks, to attempt to gauge the impact the changes to the internal functions rules would have:
The negative impact on real world apps appears to be very limited. This is consistent with the premise of the Coercive STH RFC: it aims to allow the conversions that are common and most likely sensible and to block the ones that are likely faulty, so we shouldn't see many of the latter in real world apps.
In addition, the patch was tested with numerous unit-test suites; PHP's test suite shows a lot of new errors, however, the majority of them stem from tests purposely designed to check 'insensible' conversions (e.g. readgzfile($filename, -10.5)), and not code blocks that we're ever likely to bump into in the real world.
The Symfony and Zend Framework test suites were also run and showed new deprecation errors. Based on very preliminary analysis, it seems that most of them either fall into the same bucket as the PHP unit tests above (purposely designed to test insensible conversions), or point out issues that may translate into real world bugs and that can be fixed with a relatively small number of changes.
All in all, the signal-to-noise ratio of turning on the new coercive rules across the entirety of PHP seems to be very good.
The following table details the changes this patch makes to the values accepted by internal functions ("Unchanged" means there were no changes):
Hint \ Value type | boolean | int | float | string |
---|---|---|---|---|
bool | Unchanged | Unchanged | Reject | Unchanged |
int | Reject | Unchanged | Restrict* | Restrict† |
float | Reject | Unchanged | Unchanged | Restrict† |
string | Reject | Unchanged | Unchanged | Unchanged |
* Only accept inputs that contain no significant digits after the decimal point.
† Numeric strings followed by non-blank trailing characters (e.g., “7 dogs”, “3.14 pizzas”) are no longer accepted.
Note that in all cases, when conversion occurs - its rules are identical to those in PHP 5. PHP 7 will accept fewer types of inputs as valid, but will apply the same conversion rules to the ones that are accepted.
Here are examples of conversions which, while still providing the same results as in PHP 5, now also raise an E_DEPRECATED error:

    false -> int            # No more conversion from bool
    true -> string          # No more conversion from bool
    7.5 -> int              # 7.5 cannot be converted to an integer without data loss
    "8.2" -> int            # "8.2" cannot be converted to an integer without data loss
    4.3 -> bool             # No more conversion from float to bool
    "7 dogs" -> int         # Non-blank trailing characters no longer supported
    "3.14 pizzas" -> float  # Non-blank trailing characters no longer supported
While outside the scope of this RFC, the introduction of the new coercive-yet-more-restrictive rule-set may be considered for additional areas in PHP, most notably implicit casting. For example, today, the result of “Apples” + “Oranges” is 0, because the + operator implicitly casts anything into a number. It could be imagined that in the future, the + operator will accept only values that would fit into an int or float STH, and warn users about others (realistically, most probably through E_STRICT). Users would still be able to use permissive explicit casting ($foo = (int) “Apples”; would still assign 0 into $foo), but the risk sometimes associated with implicit casting will be eliminated.
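The permissive explicit casts mentioned above can be checked directly. This short sketch reuses the strings from this RFC's own examples; the variable names are arbitrary. Explicit casts of this kind keep the numeric prefix of the string, if there is one, and do not raise errors.

```php
<?php
// Explicit casts are permissive and do not raise errors.
$foo = (int) "Apples";        // 0: no leading number at all
$bar = (int) "7 dogs";        // 7: the numeric prefix is kept
$baz = (float) "3.14 pizzas"; // 3.14
```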
It should be noted that nothing in this RFC conflicts with the ability to add Strict STH in the future. The ability to add a 2nd mode via declare() or some other mechanism will always be there. We do believe that demand for it will greatly diminish with the introduction of these scalar type hints, but in case it doesn't - there'll be no technical blocks preventing us from adding it in the future, even in the 7.x lifetime.
Numerous community members have invested substantial effort into creating another comprehensive RFC that proposes to introduce STH into PHP: Scalar Type Hints RFC v0.5 ("Dual Mode RFC"). However, we believe the proposal in this RFC is better, for several reasons:
In addition, there appear to be numerous misconceptions about the benefits of strict type hinting that, to the best of our (deep) understanding of the associated technologies, aren't really there:
Given the change to the values accepted by a wide range of internal functions, this RFC is likely to result in a substantial number of newly introduced E_DEPRECATED warnings in internal function invocations, although those can be easily suppressed. When E_DEPRECATED is replaced with E_RECOVERABLE_ERROR in a future PHP version, users will be forced to update their code and 'clean it up' before they can upgrade. Also, the newly introduced type hints (int, float, string and bool) will no longer be permitted as class/interface/trait names (including with use and class_alias).
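One common way to suppress deprecation warnings at runtime is to mask E_DEPRECATED out of the reporting level with the standard `error_reporting()` function. This is a minimal sketch, not a recommendation from this RFC; the variable names are arbitrary, and the same effect can be had through the `error_reporting` ini setting.

```php
<?php
// Report everything except deprecation notices. The previous level is
// returned so it can be restored later.
$previous = error_reporting(E_ALL & ~E_DEPRECATED);
$deprecationsHidden = (error_reporting() & E_DEPRECATED) === 0; // true

// ... call legacy code here ...

error_reporting($previous); // restore the earlier reporting level
```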
Proposed PHP version: 7.0
The voting choices are yes (in favor of accepting this RFC for PHP 7) or no (against it). The RFC proposes a very substantial change to PHP's coercion rules, which may evolve to affect implicit typing in the future. It absolutely requires a 2/3 majority, with the hope of reaching as close as possible to consensus. The vote starts on March 11th, and will end two weeks later, on March 25th.