Request for Comments: How to write RFCs

This RFC provides a proposal for both weak and strict type checking for function/method parameters.

Introduction

PHP is a dynamically typed language that does automatic type juggling wherever possible. With PHP 5, a new feature called "type hinting" was introduced for arrays and objects. However, the name “hint” is a misnomer, since a failed check triggers an E_RECOVERABLE_ERROR. While a fatal error can be prevented with an error handler, this is not easy to do cleanly, since it needs to happen in a global error handler, far away in terms of code from the original context. It also comes with quite an overhead.
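
To illustrate how far such handling sits from the failing call, here is a minimal sketch of a global handler that turns an E_RECOVERABLE_ERROR into an exception. The function name is hypothetical and only serves the example. Note that on PHP 7 and later a failed parameter type check throws a TypeError instead, so the handler below is best read as a sketch for the PHP 5 behaviour described above.

```php
<?php
// Hypothetical handler name, chosen for this illustration only.
function recoverable_error_to_exception($errno, $errstr, $errfile, $errline)
{
    if ($errno === E_RECOVERABLE_ERROR) {
        // Convert the recoverable error into an exception the caller can catch.
        throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
    }
    // Returning false hands every other error back to PHP's default handling.
    return false;
}

// Registered once and globally, far away from the functions that declare the hints.
set_error_handler('recoverable_error_to_exception');
```

Every type hint failure in the application ends up in this single function, which has no knowledge of the context in which the call was made.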

Several people have asked to expand this feature to cover other data types, mostly asking for the same strict type checking (without any type juggling) as for arrays and objects, also triggering an E_RECOVERABLE_ERROR for failed checks. However, this means that the burden of explicit type casting falls on the user of the function/method. This RFC tries to address this issue.

Why is strict type checking problematic?

Proponents of purely strict type checking say that, for the most part, variables are defined with the proper type unless they come from an outside source, which usually requires validation anyway and is therefore a perfect opportunity to type cast.

That is, to define a variable that contains a boolean, a developer will probably write “$is_foo = true” and not “$is_foo = 0”. While this may be true, it does mean that developers using such strict type checking in their APIs now require their users to understand data types, which beginning developers currently do not necessarily need to.

Furthermore, developers quite often need to parse content out of strings and pass it to other methods. With strict type checking one is now forced to explicitly type cast. While it's certainly doable, it's also additional work that needs to be done while writing the code (“$foo_int = (int)substr($bar, 3, 10)”). Then again, some might argue that this makes the code clearer.

It also means that users of such strictly typed APIs will tend to simply cast and, due to laziness, might forgo validating first whether the content is really what they expected. Without type checking the burden would be with the developer providing the API. Since an API is usually expected to be used fairly often, it seems illogical to move this burden to the API users.
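
The risk of casting without validating can be shown with a short sketch. The input string and variable names are made up for the example; filter_var() with FILTER_VALIDATE_INT is just one possible way to validate first.

```php
<?php
// Made-up input: the caller expected 'id:' followed by digits.
$bar = 'id:abc';
$raw = substr($bar, 3, 10);   // 'abc'

// A bare cast silently turns the unexpected content into 0.
$foo_int = (int) $raw;

// Validating first makes the problem visible: FILTER_VALIDATE_INT
// yields false when the string does not hold an integer.
$validated = filter_var($raw, FILTER_VALIDATE_INT);
if ($validated === false) {
    $message = 'unexpected content: ' . $raw;
}
```

With a strictly typed API the cast alone is enough to satisfy the signature, so nothing forces the caller to write the second half.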

At the same time, strict type checking does have the advantage that subtle bugs will be noticed more quickly and that function/method signatures will become yet more self-documenting and therefore more expressive. Doing these type checks based on the signature also means less code and better performance compared to hand-coding the validation.

As for outside sources needing validation: this is not always the case, as most people trust that the data returned from a database is in the expected format, even though for most RDBMS it will always be returned as a string. The same applies to configuration files, which, if defined in something other than PHP code, will most likely only return strings, but whose values will usually not be validated.

Introducing weak type checking

In Ilia's recent strict type checking proposal, he included a “numeric” and a “scalar” data type, which tried to reduce the above noted issues with strict type checking. The “numeric” type would behave similarly to the “is_numeric()” function in that it would not check the type, but would also accept a float or a string containing only a number (see the documentation for the exact definition). In the same way, “scalar” would simply check that the parameter is not an array, object or resource.
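
As a rough illustration, the two pseudo-types could be approximated in userland as follows. Both helper names are hypothetical, and the “scalar” helper follows the wording above literally, so it may differ from the actual proposal in edge cases (for example, it lets null pass, which is_scalar() would not).

```php
<?php
// Hypothetical helpers approximating the two proposed pseudo-types.
function passes_numeric_check($value)
{
    // Content-based: ints, floats and numeric strings all pass.
    return is_numeric($value);
}

function passes_scalar_check($value)
{
    // As described above: anything that is not an array, object or resource.
    return !is_array($value) && !is_object($value) && !is_resource($value);
}
```

Neither check changes the value, so a function receiving a “numeric” parameter may still be handed a string and has to cast it itself.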

However, this does not cover all specific data types. Moreover, “numeric” is not a known data type and is also significantly longer to type than “int”. As a result, it seems likely that “int” will be used by many developers even where “numeric” would suffice. Therefore a new concept was introduced: a syntax to define whether the check should be strict or weak.

A weak check would examine the content of the variable in a way that is stricter than the standard type juggling. If the weak check passes, the value would be cast to the declared type. If the weak check fails, it would trigger an E_RECOVERABLE_ERROR just as in the strict case.

Here is a short list of examples to illustrate the weak type checking:

// pass a weak integer type check
$foo = 12;
$foo = 12.00;
$foo = '12';
$foo = "-12.00";
 
// pass a weak float type check
$foo = 12;
$foo = 12.00;
$foo = 12.34;
$foo = '12';
$foo = "-12.00";
$foo = "12.34";
 
// pass a weak bool type check
$foo = true;
$foo = false;
$foo = 0;
$foo = "0";
$foo = 1;
$foo = "1";
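
The rules implied by these examples can be sketched in userland. This is only an approximation under stated assumptions: the helper names are hypothetical, an exception stands in for the E_RECOVERABLE_ERROR, and integers too large to fit the platform's int range are not handled.

```php
<?php
// Hypothetical userland helpers; the RFC proposes doing this in the engine.
function weak_int($value)
{
    if (is_int($value)) {
        return $value;
    }
    // Floats and numeric strings pass only if they have no fractional part.
    if ((is_float($value) || is_string($value)) && is_numeric($value)) {
        $float = (float) $value;
        if (is_finite($float) && floor($float) === $float) {
            return (int) $float;
        }
    }
    throw new InvalidArgumentException('value fails the weak int check');
}

function weak_float($value)
{
    // Any int, float or numeric string has a float representation.
    if (is_int($value) || is_float($value)
        || (is_string($value) && is_numeric($value))) {
        return (float) $value;
    }
    throw new InvalidArgumentException('value fails the weak float check');
}

function weak_bool($value)
{
    // Only the six values listed above are accepted (strict comparison).
    if (in_array($value, array(true, false, 0, 1, '0', '1'), true)) {
        return (bool) $value;
    }
    throw new InvalidArgumentException('value fails the weak bool check');
}
```

For instance, weak_int("-12.00") returns the integer -12, while weak_int("12.34") is rejected, whereas standard type juggling would silently truncate it to 12.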

Furthermore, weak type checking could also be useful once we have generic type casting support via some magic type cast method along the lines of __toString(). In this case the weak type checking would also allow an object to pass if it provides the relevant casting method, though it would then of course automatically cast the object to the given type.
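
Since __toString() already exists, the string case can serve as a sketch of this idea. The class and the helper below are hypothetical; how non-string scalars should be treated is not specified above, so this sketch simply rejects them.

```php
<?php
// Hypothetical class and helper, for illustration only.
class PhoneNumber
{
    private $digits;

    public function __construct($digits)
    {
        $this->digits = $digits;
    }

    public function __toString()
    {
        return $this->digits;
    }
}

function weak_string($value)
{
    if (is_string($value)) {
        return $value;
    }
    // An object passes if it provides the casting method, and is then cast.
    if (is_object($value) && method_exists($value, '__toString')) {
        return (string) $value;
    }
    throw new InvalidArgumentException('value fails the weak string check');
}
```

Here weak_string(new PhoneNumber('555-0100')) yields the plain string, so the receiving function never sees the object.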

Proposed API

// "+" denotes strict and "-" denotes weak type checking
function add_user(+string name, +string phone_number, -int age) { .. }
 
// "!" denotes strict type checking and "?" denotes weak type checking
function add_user(string name, !string phone_number, ?int age) { .. }
 
// "~" denotes weak type checking
function add_user(string name, string phone_number, ~int age) { .. }
 
// "()" denotes weak type checking
function add_user(string name, string phone_number, (int) age) { .. }
 
// Keep in mind that the "modifier" can either be placed at the start or end
function add_user(string! name, string! phone_number, int? age) { .. }
 
// Furthermore one of the two modifiers could be the default
// optionally + is the default
function add_user(+string name, string phone_number, -int age) { .. }
// optionally - is the default
function add_user(+string name, +string phone_number, int age) { .. }
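
To make the intended behaviour of such a signature concrete, here is a hypothetical userland emulation of the "+string name, +string phone_number, -int age" variant. The function body, the returned array and the use of exceptions instead of E_RECOVERABLE_ERROR are all assumptions made for the example, and edge cases such as out-of-range numbers are ignored.

```php
<?php
// Hypothetical emulation of:
//   function add_user(+string name, +string phone_number, -int age) { .. }
function add_user($name, $phone_number, $age)
{
    // "+" (strict): the type must match exactly, no juggling.
    if (!is_string($name) || !is_string($phone_number)) {
        throw new InvalidArgumentException('name and phone_number must be strings');
    }
    // "-" (weak): integer-like content passes and is then cast to int.
    if (!is_numeric($age) || floor((float) $age) != (float) $age) {
        throw new InvalidArgumentException('age must be integer-like');
    }
    $age = (int) (float) $age;

    // Invented return value, only so the effect of the cast can be observed.
    return array('name' => $name, 'phone_number' => $phone_number, 'age' => $age);
}
```

A call such as add_user('Ann', '555-0100', '30') succeeds with age cast to the integer 30, while add_user('Ann', 5550100, 30) fails the strict check on the phone number.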

Changelog

rfc/typechecking.1246652775.txt.gz · Last modified: 2017/09/22 13:28 (external edit)