PHP RFC: Scalar Type Hints
- Version: 0.1.2
- Date: 2014-12-14 (initial draft; put Under Discussion 2014-12-31)
- Author: Andrea Faulds, ajf@ajf.me
- Status: Under Discussion
- First Published at: http://wiki.php.net/rfc/scalar_type_hints
Introduction
This RFC proposes the addition of four type hints for scalar types: int, float, string and bool. These follow the same casting rules as used for internal functions (i.e. those defined by PHP extensions and written in native code).
Example
Let's say we have a PHP class that represents an ElePHPant. We put scalar type hints on our constructor arguments:
    class ElePHPant {
        public $name, $age, $cuteness, $evil;

        public function __construct(string $name, int $age, float $cuteness, bool $evil) {
            $this->name = $name;
            $this->age = $age;
            $this->cuteness = $cuteness;
            $this->evil = $evil;
        }
    }
We can then create a new instance like this, and it's valid since the parameter types exactly match:
    $sara = new ElePHPant("Sara", 7, 0.99, FALSE);
    var_dump($sara);
    /*
    Output:
    object(ElePHPant)#1 (4) {
      ["name"]=>
      string(4) "Sara"
      ["age"]=>
      int(7)
      ["cuteness"]=>
      float(0.99)
      ["evil"]=>
      bool(false)
    }
    */
We could also pass values that are convertible and they'll be converted, just like with extension functions:
    $nelly = new ElePHPant(new Stringable("Nelly"), "7 years", "0.9", "1");
    var_dump($nelly);
    /*
    object(ElePHPant)#2 (4) {
      ["name"]=>
      string(5) "Nelly"
      ["age"]=>
      int(7)
      ["cuteness"]=>
      float(0.9)
      ["evil"]=>
      bool(true)
    }
    PHP Notice: A non well formed numeric value encountered in Command line code on line 1
    */

    $evan = new ElePHPant(1234, "9", 0.3, 0);
    var_dump($evan);
    /*
    object(ElePHPant)#3 (4) {
      ["name"]=>
      string(4) "1234"
      ["age"]=>
      int(9)
      ["cuteness"]=>
      float(0.3)
      ["evil"]=>
      bool(false)
    }
    */
(Stringable definition)
Not all values are convertible, however, so the following would error:
    $foo = new ElePHPant([], new StdClass, fopen("data:text/plain,foobar", "r"), NULL);
    // Catchable fatal error: Argument 1 passed to ElePHPant::__construct() must be of the type string, array given
Background
PHP has had parameter type hints for class names since PHP 5.0, arrays since PHP 5.1 and callables since PHP 5.4. Unfortunately, PHP's scalar types haven't been hintable. This has meant that the signatures of functions which take scalar arguments lack type information, requiring workarounds such as docblocks to document the parameter types, and requiring programmers to validate or convert arguments manually.
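As a rough illustration of the workaround described above, a function without scalar hints has to document its parameter types in a docblock and check them by hand. The function name and the exact checks below are hypothetical; this is a sketch of the pattern, not code taken from any particular library.

```php
<?php
/**
 * Hypothetical example of the docblock-plus-manual-validation workaround.
 *
 * @param string $name
 * @param int    $age
 * @return string
 */
function describeElephant($name, $age)
{
    // The types exist only in the docblock, so they are enforced by hand.
    if (!is_string($name)) {
        throw new InvalidArgumentException('$name must be a string');
    }
    // Accept real integers and strings made up solely of digits.
    if (!is_int($age) && !(is_string($age) && ctype_digit($age))) {
        throw new InvalidArgumentException('$age must be an integer');
    }
    $age = (int) $age;

    return $name . ' is ' . $age . ' years old';
}

echo describeElephant('Sara', '7'), "\n"; // Sara is 7 years old
```

Every function that takes scalar arguments needs some variant of this boilerplate, and each author tends to pick slightly different acceptance rules.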
Previous attempts at adding scalar type hints, such as the Scalar Type Hints with Casts RFC, have failed. In particular, that specific proposal was inconsistent with the type conversion rules used in other parts of the language. However, this RFC follows exactly the same conversion rules as (and shares the implementation used by) functions defined by native code extensions, with the exception of the handling of NULL (see the Details section). Thus, it avoids the problem of inconsistency.
To quote Rasmus:
PHP is and should remain:
1) a pragmatic web-focused language
2) a loosely typed language
3) a language which caters to the skill-levels and platforms of a wide range of users
Input coming from the web, such as query string parameters or POST bodies, is likely to be in string form. By performing conversion from strings automatically, just as with existing extension functions, this RFC is in keeping with PHP being a web-focused language. By allowing conversion instead of requiring strict type matches, this RFC is in keeping with PHP being a loosely-typed language. Finally, by not forcing users to worry about type conversions, it keeps the language accessible to beginners, keeping PHP a language catering to all skill-levels. Therefore, I feel that this RFC keeps all three of these principles true.
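To make the point about web input concrete, the sketch below uses parse_str() to stand in for an incoming query string (the parameter names are made up for illustration). Every value arrives as a string, so any integer or float has to be obtained by conversion.

```php
<?php
// Simulate a request such as ?name=Sara&age=7&cuteness=0.99
parse_str('name=Sara&age=7&cuteness=0.99', $query);

var_dump($query['age']);      // string(1) "7"
var_dump($query['cuteness']); // string(4) "0.99"

// Without scalar hints, the conversion has to be written out by the caller.
$age = (int) $query['age'];
var_dump($age);               // int(7)
```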
Strict scalar type hints, which would only accept exactly matching types, have been previously proposed. However, they have several issues.
Firstly, they would be inconsistent with the behaviour of internal and extension functions, which use weak parameter types like this RFC. In the PHP manual we already use the syntax proposed in this RFC to describe the behaviour of extension functions. Yet, if strict hints were added, the syntax used in the manual would do something completely different for userland functions. This would doubtless be very confusing for new users.
Secondly, they might lead to brittle code. In PHP, types are routinely juggled: for example, the mathematical operators return different types depending on input values. This “weak typing” is not a problem with PHP and is one of the features that makes PHP such a pleasant language to use. However, it does not work well with strict type hints, which will throw errors when the wrong type is passed. A function call like foobar($a + 2), where foobar had a strict integer type hint, might succeed on 64-bit systems but throw an error and crash on 32-bit systems, because somewhere earlier during function execution an arithmetic operation exceeded the limits of an integer and PHP converted $a to a float (which provides an extra 22 bits of precision). This is less likely to cause issues with weak type hints like those proposed by this RFC, provided the value can be converted to an integer.
Thirdly, strict hints prevent the addition of type hints to existing libraries, making scalar type hints useless to maintainers of existing code unless they wish to introduce massive backwards-compatibility breaks. Much existing code takes advantage of PHP's weak typing (even if unintentionally) and would break if the libraries used added strict type hints. On the other hand, weak type hints like those proposed in this RFC could be safely added to existing projects with minimal issues. These hints would not be useless, either: they could still catch errors early, and would make function execution more predictable because the argument would always be the correct type inside the function body.
Finally, strict hints encourage dangerous behaviour, such as using explicit casts ((int), (float) etc.). In order to make existing (and also new) code work with strict hints, casts would need to be added to convert values to the correct type, such that they fulfil the type hints. The problem here is that the explicit cast operators never throw errors (with the minor exception of object->scalar casts). Almost any PHP value of any type will be accepted and “converted”, whether the result is meaningful or not. In some cases this may even defeat the purpose of strict type hints, because type errors cannot be caught when they are explicitly silenced. Implicit casts, on the other hand, are far less dangerous in PHP. Implicit casts can fail and throw an error if the conversion is not meaningful. For example, an empty string would not be accepted for an integer parameter, and a numeric string with trailing characters would produce a notice, whereas an explicit cast would raise no error in either case and would simply turn the empty string into 0. In this respect, weak type hints can offer better safety than strict hints.
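Two of the behaviours discussed above, integer overflow turning into a float and explicit casts that never complain, can be observed in plain PHP without any type hints. The snippet below is only a small demonstration of existing language behaviour, not part of the proposed patch.

```php
<?php
// Integer arithmetic that overflows quietly yields a float.
$a = PHP_INT_MAX;
var_dump(is_int($a));       // bool(true)
var_dump(is_float($a + 2)); // bool(true): the sum no longer fits in an integer

// Explicit casts accept almost anything and raise no error.
var_dump((int) '');         // int(0)
var_dump((int) 'abc');      // int(0)
var_dump((int) '7 years');  // int(7), with no notice
var_dump((int) array());    // int(0)
```

A strict integer hint would reject $a + 2 here, and a caller who "fixes" that with an (int) cast would hide exactly the kind of mistake the hint was meant to reveal.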
No type hint for resources is added, as this would prevent moving from resources to objects for existing extensions, which some have already done (e.g. GMP).
For the integer typehint, both the int and integer syntaxes are allowed, and for the boolean typehint, both bool and boolean are allowed. This has been done because PHP uses both throughout the manual and error messages, so there is no clear choice of syntax that wouldn't cause problems. While in an ideal world we would not need to support these aliases, the likelihood of people being caught out by integer or boolean not working is very high, so I feel we ought to support both the short and long forms of these type names.
Details
No new reserved words are added. The names int, integer, float, string, bool and boolean are recognised and allowed as type hints, and prohibited from use as class/interface/trait names. When they are used, the validation and conversion functions used by the Fast Parameter Parsing API are called internally. Thus, they exactly match the behaviour of zend_parse_parameters. The only exception to this is the handling of NULL: in order to be consistent with our existing type hints for classes, callables and arrays, NULL is not accepted by default, unless the parameter is explicitly given a default value of NULL. This would work well with the draft Declaring Nullable Types RFC.
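The existing NULL rule for non-scalar hints, which the paragraph above says scalar hints would mirror, can be inspected with reflection. The two function names are hypothetical, and the snippet shows only today's array hint, not the proposed scalar hints. Note as an assumption about the runtime that recent PHP versions may report a deprecation for the implicit "default of NULL" form.

```php
<?php
// Existing (non-scalar) hint behaviour that the RFC follows for NULL.
function withoutDefault(array $items) {}
function withNullDefault(array $items = null) {}

$strict   = new ReflectionParameter('withoutDefault', 0);
$nullable = new ReflectionParameter('withNullDefault', 0);

var_dump($strict->allowsNull());   // bool(false): NULL is rejected
var_dump($nullable->allowsNull()); // bool(true): the NULL default opts in
```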
Casting and Validation Rules
While this RFC merely follows PHP's existing rules for scalar parameters, used by extension functions, these rules may not be familiar to all readers of this RFC. For that reason, here is a summary of which types are accepted. Note that NULL, arrays and resources are never accepted for scalar type hints, and so are not included in the table. These rules are the same as those used by extension functions, except for the handling of NULL (see above).
Type hint (rows) vs. type of passed value (columns) | integer | float | string | boolean | object |
---|---|---|---|---|---|
integer | yes | yes* | yes† | yes | no |
float | yes | yes | yes† | yes | no |
string | yes | yes | yes | yes | yes‡ |
boolean | yes | yes | yes | yes | no |
*Only non-NaN floats between PHP_INT_MIN and PHP_INT_MAX accepted. (New in PHP 7, see the ZPP Failure on Overflow RFC)
†Non-numeric strings not accepted. Numeric strings with trailing characters are accepted, but produce a notice.
‡Only if it has a __toString method.
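The integer row of the table can be approximated in userland as follows. This is a simplified sketch under stated assumptions, not the engine's implementation: the helper name sketchIntHint is hypothetical, the leading-number regular expression is a rough stand-in for PHP's numeric-string parsing, and range checks on numeric strings are ignored. It also relies on PHP_INT_MIN, which is only available from PHP 7.

```php
<?php
/**
 * Hypothetical helper: classifies a value against the "integer" row above.
 *
 * @param mixed $value
 * @return string 'yes', 'notice' (accepted with a notice) or 'no'
 */
function sketchIntHint($value)
{
    if (is_int($value) || is_bool($value)) {
        return 'yes';
    }
    if (is_float($value)) {
        // NaN fails both comparisons, so it is rejected here as well.
        return ($value >= PHP_INT_MIN && $value < PHP_INT_MAX) ? 'yes' : 'no';
    }
    if (is_string($value)) {
        if (is_numeric($value)) {
            return 'yes';
        }
        // A leading number followed by other characters: accepted with a notice.
        return preg_match('/^\s*[+-]?(\d|\.\d)/', $value) ? 'notice' : 'no';
    }

    return 'no'; // NULL, arrays, resources and objects
}

var_dump(sketchIntHint(7));         // string(3) "yes"
var_dump(sketchIntHint('7 years')); // string(6) "notice"
var_dump(sketchIntHint('abc'));     // string(2) "no"
var_dump(sketchIntHint(NAN));       // string(2) "no"
```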
Backward Incompatible Changes
int, integer, float, string, bool and boolean are no longer permitted as class/interface/trait names.
Because these hints are quite permissive in the values they accept and behave similarly to PHP's type juggling for operators, it should be possible for existing userland libraries to add scalar type hints without breaking compatibility.
Proposed PHP Version(s)
This is proposed for the next major PHP version, currently PHP 7.
RFC Impact
To Existing Extensions
ext/reflection will need to be updated in order to support type hint reflection for parameters. This hasn't yet been done.
Unaffected PHP Functionality
This doesn't affect the behaviour of cast operators.
Open Issues
There are two open issues related to naming. These might be voted on if consensus isn't reached.
- Currently, this RFC and patch allow the aliases integer and boolean in addition to int and bool. Should we only allow int and bool? It is probably not a good idea to add too many new reserved class names. On the other hand, we use integer and boolean in many places in the manual, and programmers would be forgiven for expecting integer and boolean to work. We could opt to reserve them but prevent their use, telling people to use int and bool instead. That wouldn't reduce the number of prohibited class names, but it would prevent confusion and ensure consistency.
- Should the scalar type hint names be prohibited from use as class names? The patch currently prohibits this (class int {} is an error), to avoid the situation where you can declare a class with the name of a scalar type hint yet not type hint against it (as the name would be interpreted as a scalar hint). Personally, I think it'd be best to avoid confusion and prevent classes from having the same names as scalar types. However, if this causes significant backwards-compatibility problems, we might have to allow it. I would note that at least some of the existing classes with such names are used as a stand-in for scalar type hints.
  - The patch doesn't currently do this, but it would make sense to also prevent scalar type hint names being used with the use statement.
Future Scope
If return types were added, such as with the Return Type Hinting RFC, scalar type hints should be supported. A possible matter of debate would be whether or not to allow conversions in that case, given that some of the reasons cited for parameter type conversion may not be applicable. For consistency, also casting the return value is probably the way to go.
Because scalar type hints guarantee that a passed argument will be of a certain type within a function body (at least initially), this could be used in the Zend Engine for optimisations. For example, if a function takes two float-hinted arguments and does arithmetic with them, there is no need for the arithmetic operators to check the types of their operands. As I understand it, HHVM already does such optimisations, and might benefit from this RFC.
Proposed Voting Choices
As this is a language change, this RFC requires a 2/3 majority to pass. It will be a Yes/No vote.
Patches and Tests
There is a working, but incomplete php-src pull request that has tests here: https://github.com/php/php-src/pull/972
There is an incomplete language specification pull request (lacking tests) here: https://github.com/php/php-langspec/pull/109
Note that the patch for php-src allows these hints for internal/extension functions as well. This might be useful in future to allow bypassing zend_parse_parameters for new functions. It produces E_RECOVERABLE_ERROR on failure, so it couldn't be used for existing functions.
Implementation
After the project is implemented, this section should contain
- the version(s) it was merged to
- a link to the git commit(s)
- a link to the PHP manual entry for the feature
References
Changelog
- v0.1.2 - Noted some downsides of strict hints vs weak hints
- v0.1.1 - Added table summarising casting and validation rules
- v0.1 - Initial drafts