
Request for Comments: Strict and weak parameter type checking

This RFC proposes both weak and strict type checking for function/method parameters and explains why providing only strict type checking would be a mistake.

Introduction

Several people have asked to expand array/object type hinting to cover other data types. Most of them ask for the same strict type checking (without any type juggling) as for arrays and objects, including triggering an E_RECOVERABLE_ERROR for failed checks. However, this puts the burden of explicit type casting on the user of the function/method. This RFC tries to address this issue.

Why is strict type checking problematic?

Proponents of providing only strict type checking say that, for the most part, variables are defined with the proper type unless they come from an outside source, which usually requires validation anyway and is therefore a perfect opportunity to type cast. For example, to define a variable that contains a boolean, a developer will probably write “$is_foo = true” and not “$is_foo = 0”. While this may be true, it does mean that developers who use strict type checking in their APIs now require their users to understand data types, which beginning developers currently do not necessarily need to.

Furthermore, developers quite often need to parse content out of strings and pass it to other methods. With strict type checking, one is forced to type cast explicitly. While it's certainly doable, it's also additional work while writing the code (“$foo_int = (int)substr($bar, 3, 10)”). Then again, some might argue that this makes the code clearer.

It also means that users of such strictly typed APIs will tend to simply cast and, out of laziness (PHP is used for rapid development after all), might forgo validating first whether the content is really what they expected. Without type checking, the burden of validation lies with the developer providing the API. Since an API is usually expected to be used fairly often, it seems illogical to move this burden to the API users. Moreover, a new kind of bug will be introduced through overuse of casts in place of the hand-coded parameter validation that is currently necessary. This could lead to even higher bug rates.
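The cast-instead-of-validate risk described above can be illustrated with a small sketch. The helper name cast_vs_validate() is hypothetical and exists only for this example; it compares what a bare (int) cast yields with what the built-in filter_var() validation yields for the same string.

```php
<?php
// Hypothetical helper for illustration: returns what a bare cast yields
// next to what explicit validation yields for the same input string.
function cast_vs_validate(string $input): array
{
    $cast = (int) $input;                                  // a cast never fails
    $validated = filter_var($input, FILTER_VALIDATE_INT);  // int, or false if invalid
    return [$cast, $validated];
}

foreach (['12', '12abc', 'foo', ''] as $input) {
    [$cast, $validated] = cast_vs_validate($input);
    printf(
        "%-8s cast=%d validated=%s\n",
        var_export($input, true),
        $cast,
        var_export($validated, true)
    );
}
```

The cast turns '12abc' into 12 and both 'foo' and '' into 0 without any warning, while the validation rejects all three. A caller who only casts to satisfy a strict signature therefore passes the type check with data that was never valid.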

As for outside sources needing validation: this is not always the case. Most people trust that data returned from a database is in the expected format, even though most RDBMSs always return it as strings. The same applies to configuration files, which, if defined in something other than PHP code, will most likely only yield strings, but whose values will usually not be validated. Even with form data, the auto-converting weak type checking detailed here can, in many cases, nullify the need for additional output sanitization.

Introducing weak type checking

In Ilia's recent strict type checking proposal, he included a “numeric” and a “scalar” data type, which tried to reduce the issues with strict type checking noted above. The “numeric” type would behave similarly to the “is_numeric()” function in that it would not check the type, but would also accept a string containing only an integer or a float number (see the documentation for the exact definition). In the same way, “scalar” would simply check that the parameter is not an array, object or resource.
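As a rough approximation of these two hints in userland, the existing is_numeric() and is_scalar() functions can be wrapped as below. The names passes_numeric() and passes_scalar() are hypothetical, and the assumption that the hints map exactly onto these built-ins is this sketch's, not the proposal's.

```php
<?php
// Hypothetical stand-ins for the "numeric" and "scalar" hints, assuming they
// behave like the existing is_numeric() and is_scalar() functions.
function passes_numeric($value): bool
{
    // Accepts ints, floats and numeric strings; the type itself is not checked.
    return is_numeric($value);
}

function passes_scalar($value): bool
{
    // Accepts int, float, string and bool; rejects arrays, objects and resources.
    // Note that is_scalar() also rejects NULL.
    return is_scalar($value);
}

var_dump(passes_numeric('12.34')); // bool(true)
var_dump(passes_numeric('12abc')); // bool(false)
var_dump(passes_scalar('foo'));    // bool(true)
var_dump(passes_scalar([]));       // bool(false)
```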

However, these two types do not cover all specific data types. Moreover, “numeric” is not a known data type and is significantly longer to type than “int”, so it seems likely that many developers will use “int” even where “numeric” would suffice. Therefore a new concept was introduced: a syntax to define whether a check should be strict or weak.

A weak check would examine the content of the variable in a way that is stricter than standard type juggling. If the weak check passes, the value would be cast to the declared type. If the weak check fails, it would trigger an E_RECOVERABLE_ERROR, just as in the strict case.
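The check-then-cast flow can be sketched in userland for a single type. The function weak_bool() is hypothetical; its list of accepted values is taken from the “bool” column of the examples table in this section. Since userland code cannot raise E_RECOVERABLE_ERROR through trigger_error(), an exception stands in for the failure case.

```php
<?php
// Hypothetical userland stand-in for a weak "bool" parameter check.
function weak_bool($value): bool
{
    // Accepted inputs, taken from the "bool" column of the examples table.
    $accepted = [true, false, 0, 1, '0', '1'];

    // Strict comparison: 12, 'true' or 1.0 are not in the list and are rejected.
    if (!in_array($value, $accepted, true)) {
        // The RFC would raise E_RECOVERABLE_ERROR here; an exception stands in for it.
        throw new InvalidArgumentException('Value does not pass the weak bool check');
    }

    // The check passed, so the value is cast to the declared type.
    return (bool) $value;
}

var_dump(weak_bool('0')); // bool(false)
var_dump(weak_bool(1));   // bool(true)
```

Unlike plain type juggling, which would turn 12 or 'foo' into true, this check rejects them outright.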

Here is a short list of examples to illustrate the weak type checking. Note that just like the current array/object hints, a NULL is only allowed if the parameter defaults to NULL.

value                   | string | float | int  | numeric | scalar | bool | array
true (boolean)          | fail   | fail  | fail | fail    | pass   | pass | fail
false (boolean)         | fail   | fail  | fail | fail    | pass   | pass | fail
0 (integer)             | fail   | pass  | pass | pass    | pass   | pass | fail
1 (integer)             | fail   | pass  | pass | pass    | pass   | pass | fail
12 (integer)            | fail   | pass  | pass | pass    | pass   | fail | fail
12 (double)             | fail   | pass  | fail | pass    | pass   | fail | fail
12.34 (double)          | fail   | pass  | fail | pass    | pass   | fail | fail
'true' (string)         | pass   | fail  | fail | fail    | pass   | fail | fail
'false' (string)        | pass   | fail  | fail | fail    | pass   | fail | fail
'0' (string)            | pass   | fail  | fail | pass    | pass   | pass | fail
'1' (string)            | pass   | fail  | fail | pass    | pass   | pass | fail
'12' (string)           | pass   | fail  | fail | pass    | pass   | fail | fail
'12abc' (string)        | pass   | fail  | fail | fail    | pass   | fail | fail
'12.0' (string)         | pass   | fail  | fail | pass    | pass   | fail | fail
'12.34' (string)        | pass   | fail  | fail | pass    | pass   | fail | fail
'foo' (string)          | pass   | fail  | fail | fail    | pass   | fail | fail
array () (array)        | fail   | fail  | fail | fail    | fail   | fail | pass
array (0 => 12) (array) | fail   | fail  | fail | fail    | fail   | fail | pass
NULL (NULL)             | fail   | fail  | fail | fail    | fail   | fail | fail
'' (string)             | pass   | fail  | fail | fail    | pass   | fail | fail
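The pass/fail results listed above can be reproduced with existing PHP functions. The sketch below is an approximation inferred from the listed rows only: the function name weak_check_passes() is hypothetical, and its behaviour for values not listed (for example the double 1.0 against “bool”) is an assumption.

```php
<?php
// Hypothetical predicate: would $value pass a weak check for $type?
// The rules are inferred from the listed examples, not from an implementation.
function weak_check_passes(string $type, $value): bool
{
    switch ($type) {
        case 'string':
            return is_string($value);
        case 'float':
            return is_int($value) || is_float($value);
        case 'int':
            return is_int($value);
        case 'numeric':
            return is_numeric($value);
        case 'scalar':
            return is_scalar($value);
        case 'bool':
            // Strict comparison, so 12, 12.0 and 'true' are rejected.
            return in_array($value, [true, false, 0, 1, '0', '1'], true);
        case 'array':
            return is_array($value);
        default:
            throw new InvalidArgumentException("Unknown type: $type");
    }
}

// Print one row of results for the string '12.0'.
foreach (['string', 'float', 'int', 'numeric', 'scalar', 'bool', 'array'] as $type) {
    printf("%-8s %s\n", $type, weak_check_passes($type, '12.0') ? 'pass' : 'fail');
}
```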

Furthermore, weak type checking could also be useful once we have generic type casting support via some magic type cast method along the lines of __toString(). In this case, weak type checking would also allow an object to pass if it provides the relevant casting method, though it would then of course automatically cast the object to the given type.
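Since __toString() is the only such magic cast method that exists today, the idea can only be sketched for strings. The PhoneNumber class and the weak_string() function below are hypothetical, and the exception again stands in for E_RECOVERABLE_ERROR.

```php
<?php
// Hypothetical value object that knows how to cast itself to a string.
final class PhoneNumber
{
    private $digits;

    public function __construct(string $digits)
    {
        $this->digits = $digits;
    }

    public function __toString(): string
    {
        return $this->digits;
    }
}

// Hypothetical weak "string" check: strings pass unchanged, objects pass if
// they provide __toString() and are then cast; everything else fails.
function weak_string($value): string
{
    if (is_string($value)) {
        return $value;
    }
    if (is_object($value) && method_exists($value, '__toString')) {
        return (string) $value;
    }
    throw new InvalidArgumentException('Value does not pass the weak string check');
}

var_dump(weak_string(new PhoneNumber('5551234'))); // string(7) "5551234"
```

An object without __toString(), such as a plain stdClass instance, would be rejected in the same way as an integer or an array.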

Proposed API

// "+" denotes strict and "-" denotes weak type checking
function add_user(+string $name, +string $phone_number, -int $age) { .. }

// "!" denotes strict type checking and "?" denotes weak type checking
function add_user(string $name, !string $phone_number, ?int $age) { .. }

// "~" denotes weak type checking
function add_user(string $name, string $phone_number, ~int $age) { .. }

// "()" denotes weak type checking
function add_user(string $name, string $phone_number, (int) $age) { .. }

// Keep in mind that the "modifier" can be placed either at the start or at the end
function add_user(string! $name, string! $phone_number, int? $age) { .. }

// Furthermore one of the two modifiers could be the default
// optionally + is the default
function add_user(+string $name, string $phone_number, -int $age) { .. }
// optionally - is the default
function add_user(+string $name, +string $phone_number, int $age) { .. }

Changelog

rfc/typecheckingstrictandweak.1274543144.txt.gz · Last modified: 2017/09/22 13:28 (external edit)