This RFC is waiting for the decisions that will be made about scalar type hinting, because the design and syntax chosen there heavily impact the contents of this RFC. The proposal is subject to change according to the scalar type hinting implementation.
This RFC is part of the “Design by Contract Introduction” RFC.
There is an alternative implementation proposal by “Definition”.
The original idea of introducing DbC in PHP comes from Yasuo Ohgaki yohgaki@ohgaki.net.
Then, I offered to write an RFC where I propose to include DbC constraints in doc comments. This is the present document.
While we agree on the concept, Yasuo prefers a D-like syntax, which he is proposing in another RFC. In my opinion, adopting the D syntax would be fine if we were designing the language from scratch, but it is not the best way to include the concept in PHP (more details below).
For more than 10 years (since PHP 5 was released), the PHP core community has seen a lot of discussions about strict vs loose typing, type hinting and related features. Through these discussions, developers are actually searching for a way to reduce coding errors by detecting them as early as possible. Making types stricter is one approach but, unfortunately, it does not fit well with PHP as a loosely typed language.
This RFC proposes an alternative approach, already present in several languages, named 'Design by Contract' (reduced to 'DbC' in the rest of the document).
Here is the definition of a contract, according to the D language documentation :
The idea of a contract is simple - it's just an expression that must evaluate to true. If it does not, the contract is broken, and by definition, the program has a bug in it. Contracts form part of the specification for a program, moving it from the documentation to the code itself. And as every programmer knows, documentation tends to be incomplete, out of date, wrong, or non-existent. Moving the contracts into the code makes them verifiable against the program.
For more info on the DbC theory, use the links in the 'reference' section below.
An important point in DbC theory is that contracts are checked during the development/debugging phase only. A global switch allows DbC checks to be turned off when the software goes to production.
So, what we need to retain :
First, an example of a function defining input and output constraints ('$>' means 'return value'). This example is adapted from the D language.
    //===========================================================================
    /**
     * Compute area of a triangle
     *
     * This function computes the area of a triangle using Heron's formula.
     *
     * @param number $a Length of 1st side
     * @requires ($a >= 0)
     * @param number $b Length of 2nd side
     * @requires ($b >= 0)
     * @param number $c Length of 3rd side
     * @requires ($c >= 0)
     * @requires ($a <= ($b+$c))
     * @requires ($b <= ($a+$c))
     * @requires ($c <= ($a+$b))
     *
     * @return number The triangle area
     * @ensures ($> >= 0)
     */
    function triangleArea($a, $b, $c)
    {
        $halfPerimeter = ($a + $b + $c) / 2;

        return sqrt($halfPerimeter
            * ($halfPerimeter - $a)
            * ($halfPerimeter - $b)
            * ($halfPerimeter - $c));
    }
Then :
    $area=triangleArea(4,2,3);
        -> OK
    $area=triangleArea('foo',2,3);
        -> PHP Fatal error: triangleArea: DbC input type mismatch - $a should match 'number' (string(3) "foo") in xxx on line nn
    $area=triangleArea(10,2,3);
        -> PHP Fatal error: triangleArea: DbC pre-condition violation ($a <= ($b+$c)) in xxx on line nn
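To make the expected behavior more concrete, here is a rough userland emulation of the checks that the directives above would trigger. This is only a sketch: the function name `triangleAreaChecked` is hypothetical, and exceptions are used in place of fatal errors so that the failures can be caught. Nothing here reflects how the extension itself would be implemented.

```php
<?php
// Hypothetical userland emulation of the DbC checks on triangleArea().
// The RFC raises fatal errors; exceptions are an assumption made for this sketch.
function triangleAreaChecked($a, $b, $c)
{
    // 1. Argument types ('number') are checked first.
    foreach (['a' => $a, 'b' => $b, 'c' => $c] as $name => $value) {
        if (!is_numeric($value)) {
            throw new InvalidArgumentException(
                "DbC input type mismatch - \$$name should match 'number'"
            );
        }
    }

    // 2. Explicit @requires assertions, in docblock order.
    $requires = [
        '($a >= 0)'       => $a >= 0,
        '($b >= 0)'       => $b >= 0,
        '($c >= 0)'       => $c >= 0,
        '($a <= ($b+$c))' => $a <= ($b + $c),
        '($b <= ($a+$c))' => $b <= ($a + $c),
        '($c <= ($a+$b))' => $c <= ($a + $b),
    ];
    foreach ($requires as $text => $holds) {
        if (!$holds) {
            throw new DomainException("DbC pre-condition violation $text");
        }
    }

    // Function body (Heron's formula).
    $halfPerimeter = ($a + $b + $c) / 2;
    $result = sqrt($halfPerimeter
        * ($halfPerimeter - $a)
        * ($halfPerimeter - $b)
        * ($halfPerimeter - $c));

    // 3. @ensures ($> >= 0)
    if (!($result >= 0)) {
        throw new DomainException('DbC post-condition violation ($> >= 0)');
    }

    return $result;
}
```

With DbC switched off, the function would reduce to its body alone, which is the point of keeping the constraints outside the code.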
Another example with a PHP clone of str_replace() :
    //===========================================================================
    /**
     * Replace all occurrences of the search string with the replacement string
     *
     * This function returns a string or an array with all occurrences of search
     * in subject replaced with the given replace value.
     *
     * @param string|array(string) $search The value being searched for (aka needle)
     * @param string|array(string) $replace The replacement value that replaces found search values
     * @param string|array(string) $subject The string or array being searched and replaced on
     * @param.out int $count The number of replacements performed
     * @ensures ($count >= 0)
     * @return string|array(string) A string or an array with the replaced values
     *
     * Ensure that returned value is the same type as input subject :
     * @ensures (is_array($>)===is_array($subject))
     */
    function str_replace($search, $replace, $subject, &$count=null)
    {
    ...
Note that we didn't provide any constraint on $count input, as this parameter is used for output only.
Finally, we rewrite the first example as a class :
    <?php
    /**
     * @invariant ($this->a >= 0) && ($this->a <= ($this->b+$this->c))
     * @invariant ($this->b >= 0) && ($this->b <= ($this->a+$this->c))
     * @invariant ($this->c >= 0) && ($this->c <= ($this->b+$this->a))
     */
    class triangle
    {
        /*-- Properties */

        /** @var number Side lengths */
        private $a,$b,$c;

        //---------
        /**
         * @param number $a Length of 1st side
         * @param number $b Length of 2nd side
         * @param number $c Length of 3rd side
         *
         * No need to repeat constraints on values as they are checked by class invariants.
         */
        public function __construct($a,$b,$c)
        {
            $this->a=$a;
            $this->b=$b;
            $this->c=$c;
        }

        //---------
        /**
         * Compute area of a triangle
         *
         * This function computes the area of a triangle using Heron's formula.
         *
         * @return number The triangle area
         * @ensures ($> >= 0)
         */
        public function area()
        {
            $halfPerimeter = ($this->a + $this->b + $this->c) / 2;

            return sqrt($halfPerimeter
                * ($halfPerimeter - $this->a)
                * ($halfPerimeter - $this->b)
                * ($halfPerimeter - $this->c));
        }
    }
and check DbC constraints :
    $t=new triangle(4,2,3);
        -> OK
    $t=new triangle('foo',2,3);
        -> PHP Fatal error: triangle::__construct: DbC input type mismatch - $a should match 'number' (string(3) "foo") in xxx on line nn
    $t=new triangle(10,2,3);
        -> PHP Fatal error: triangle: DbC invariant violation (($this->a >= 0) && ($this->a <= ($this->b+$this->c))) in xxx on line nn
DbC defines three constraint types :
In this document, we propose a mechanism to implement these constraints in the PHP world.
We propose to include the DbC directives in phpdoc blocks. Here are the main reasons that make it, in my opinion, a better choice than any other syntax proposed so far :
Note: Some people on the mailing list are religiously opposed to including information in phpdoc blocks, despite the fact that thousands of people already use them for this purpose. The reason is that the parser cannot handle that. I agree, but that's not a task for the parser, that's a task for an external tool. We just need the hooks.
As DbC, by nature, can be turned on and off, DbC checks must not modify anything in the environment.
While enforcing this is partially possible in theory, this implementation will leave it to the developer's responsibility, as most languages do.
DbC types are an extension and formalization of the pre-existing phpdoc argument/return types.
DbC types are not present in original DbC syntax (like Eiffel or D implementation), which are based on conditions only. This is a PHP-specific addition to enhance simplicity and readability. DbC types can be seen as built-in conditions.
Here are the main benefits of defining a set of DbC types :
DbC types are used to check :
DbC types don't contain whitespace.
Here is a pseudo-grammar of DbC types :
    dbc-type = compound-type
    compound-type = type, { "|", type }
    type = "integer"
        | "integer!"
        | "number"
        | "float!"
        | "string"
        | "string!"
        | array-type
        | "callable"
        | object-type
        | resource-type
        | "null"
        | "scalar"
        | "mixed"
        | "boolean"
        | "boolean!"
    array-type = "array"
        | "array(", compound-type, ")"
    object-type = "object"
        | "object(", class-name, ")"
    resource-type = "resource"
        | "resource(", resource-name, ")"
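Because a compound type can nest inside 'array(...)', splitting on '|' must ignore the separators found inside parentheses. The following is a minimal sketch of that first parsing step. The function name `splitCompoundType` is hypothetical, and the sketch assumes the input is well-formed (balanced parentheses, no whitespace).

```php
<?php
// Split a compound DbC type on the '|' characters that are not nested
// inside parentheses. Hypothetical helper, not part of the proposal.
function splitCompoundType(string $type): array
{
    $parts = [];
    $current = '';
    $depth = 0; // current parenthesis nesting level

    foreach (str_split($type) as $char) {
        if ($char === '(') {
            $depth++;
        } elseif ($char === ')') {
            $depth--;
        }

        if ($char === '|' && $depth === 0) {
            // Top-level separator: close the current alternative.
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current;

    return $parts;
}

print_r(splitCompoundType('string|array(string)'));
print_r(splitCompoundType('array(array(string|integer))'));
```

The first call yields two alternatives, while the second yields a single one, since its only '|' is nested two levels deep.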
DbC types follow specific rules to match PHP zvals. These rules are less permissive than PHP API type juggling and previously-proposed scalar 'weak' typing, but more than previously-proposed strict typing. Actually, these types try to be a more intuitive compromise between both.
Strict typing is sometimes required. That's why DbC types also include a set of strict types.
Note that the benefit of DbC, here, is that we can match depending on zval values, as we don't care about performance.
DbC type / zval type | IS_NULL | IS_LONG | IS_DOUBLE | IS_BOOL(1) | IS_ARRAY | IS_OBJECT | IS_STRING | IS_RESOURCE |
---|---|---|---|---|---|---|---|---|
integer | No | Yes | (2) | No | No | No | (3) | No |
integer! | No | Yes | No | No | No | No | No | No |
number | No | Yes | Yes | No | No | No | (4) | No |
float! | No | No | Yes | No | No | No | No | No |
string | No | Yes | Yes | No | No | (6) | Yes | No |
string! | No | No | No | No | No | (6) | Yes | No |
array | No | No | No | No | Yes | No | No | No |
callable | No | No | No | No | (5) | (5) | (5) | No |
object | No | No | No | No | No | Yes | No | No |
resource | No | No | No | No | No | No | No | Yes |
scalar | No | Yes | Yes | Yes | No | No | Yes | No |
null | Yes | No | No | No | No | No | No | No |
mixed | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
boolean | No | (7) | (7) | Yes | No | No | No | No |
boolean! | No | No | No | Yes | No | No | No | No |
An integer value, positive or negative.
Note: This type is NOT equivalent to is_int($arg), as is_int() only accepts the IS_LONG zval type.
Synonyms: 'int'
A zval-type-based integer value, positive or negative.
Note: This type is equivalent to is_int($arg).
Synonyms: 'int!'
Any value that returns true through is_numeric().
Equivalent to 'is_numeric($arg)'.
Synonyms: 'numeric', 'float'
A zval-type-based float value.
Note: This type is equivalent to is_float($arg).
An entity that can be represented by a string. Numeric values are accepted as strings, as well as objects whose class defines a __toString() method.
== string! ==
Accepts IS_STRING zvals and objects whose class defines a __toString() method.
A PHP array.
Complements: Can be followed by a 'compound-type', enclosed in parentheses. This defines the acceptable types of the array values. This definition can be nested.
Examples:
    * @param array $arr ...
    * @param string|array(string) $... # Matches a string or an array of strings
    * @param array(array(string|integer)) $... # A 2-dimension array containing strings and int only
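Matching a nested array type amounts to applying the element check recursively. Here is a small sketch of that idea, using closures as element matchers. The name `matchesArrayOf` is hypothetical, the element checks use the strict 'string!' and 'integer!' rules to keep the example unambiguous, and treating an empty array as a match is an assumption, since the text above does not cover that case.

```php
<?php
// Returns true when $value is an array whose every element satisfies
// $elementMatches. Hypothetical helper, not part of the proposal.
function matchesArrayOf($value, callable $elementMatches): bool
{
    if (!is_array($value)) {
        return false;
    }
    foreach ($value as $element) {
        if (!$elementMatches($element)) {
            return false;
        }
    }

    return true; // Assumption: an empty array matches any 'array(...)' type.
}

// Emulates 'array(array(string!|integer!))'.
$isStringOrInt = function ($v) {
    return is_string($v) || is_int($v);
};
$isRow = function ($v) use ($isStringOrInt) {
    return matchesArrayOf($v, $isStringOrInt);
};

var_dump(matchesArrayOf([['a', 1], ['b', 2]], $isRow)); // bool(true)
var_dump(matchesArrayOf([['a', 1.5]], $isRow));         // bool(false)
```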
A string, object or array returning true through 'is_callable($arg,true)'.
Please consult the is_callable() documentation for more details.
An instance object.
Synonyms: 'obj'
Complements: Can be followed by a class name, enclosed in parentheses. Match will occur if the object is of this class or has this class as one of its parents (equivalent to is_a()).
Examples:
    * @param object $arg
    * @param object(Exception) $e
    * @param object(MongoClient)|null $conn
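Since the class match is described as equivalent to is_a(), the check for an 'object' or 'object(ClassName)' type could be sketched as follows. The function name `matchesObjectType` is hypothetical.

```php
<?php
// Emulates 'object' (no class name) and 'object(ClassName)'.
// Hypothetical helper, not part of the proposal.
function matchesObjectType($value, ?string $className = null): bool
{
    if (!is_object($value)) {
        return false;
    }

    // is_a() accepts the class itself and any of its parents.
    return $className === null || is_a($value, $className);
}

var_dump(matchesObjectType(new RuntimeException('x'), 'Exception')); // bool(true)
var_dump(matchesObjectType(new stdClass(), 'Exception'));            // bool(false)
```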
A PHP resource.
Synonyms: 'rsrc'
Complements: Can be optionally followed by a resource type. A resource type is a string provided when defining a resource via zend_register_list_destructors_ex(). As we don't support whitespace in argument types, any whitespace present in the original resource type must be replaced with an underscore character ('_').
The easiest way to display the string corresponding to a resource type is to display an existing resource using var_dump().
Examples:
    * @param resource(OpenSSL_key) $...
    * @param resource(pgsql_link) $...
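Besides var_dump(), the resource type string can also be read with get_resource_type(). The sketch below derives the name to write inside 'resource(...)' from it. The helper name `dbcResourceTypeName` is hypothetical, and it assumes a space is the only whitespace found in resource type strings.

```php
<?php
// Turn a PHP resource type string into the form usable inside 'resource(...)'.
// Hypothetical helper, not part of the proposal.
function dbcResourceTypeName(string $phpResourceType): string
{
    return str_replace(' ', '_', $phpResourceType);
}

$handle = fopen('php://memory', 'r');
echo dbcResourceTypeName(get_resource_type($handle)), "\n"; // stream
echo dbcResourceTypeName('OpenSSL key'), "\n";               // OpenSSL_key
fclose($handle);
```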
Shortcut for 'numeric|boolean|string'.
Equivalent to 'is_scalar()'.
This corresponds exactly to the IS_NULL zval type.
Equivalent to 'is_null($arg)'.
Note that a number with a 0 value does not match 'null'.
Synonyms: 'void' (mostly used for return type)
Examples:
    * @param string|null $...
    * @param resource(pgsql_link) $...
    * @return null
Accepts any zval type & value (catch-all).
Synonyms: 'any'
A boolean value (true or false).
In PHP 7, IS_BOOL is replaced with IS_TRUE and IS_FALSE.
Equivalent to 'is_bool($arg)'.
Synonyms: 'bool'
Accepts IS_BOOL zvals only (IS_TRUE/IS_FALSE on PHP 7).
Synonyms: 'bool!'
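Several of the types above are described as equivalent to an existing PHP predicate, so their matching rule can be sketched as a simple dispatch. The function name `dbcMatchesSimpleType` is hypothetical. The types whose rules depend on zval values ('integer', 'string', 'boolean') and the parameterized types are deliberately left out, as this sketch does not try to reproduce them.

```php
<?php
// Map a simple DbC type name to the PHP predicate the text declares
// equivalent. Hypothetical helper covering only a subset of the types.
function dbcMatchesSimpleType(string $type, $value): bool
{
    // Synonyms listed in the type descriptions.
    $synonyms = [
        'int!'    => 'integer!',
        'bool!'   => 'boolean!',
        'numeric' => 'number',
        'float'   => 'number',
        'void'    => 'null',
        'any'     => 'mixed',
    ];
    $type = $synonyms[$type] ?? $type;

    switch ($type) {
        case 'integer!': return is_int($value);
        case 'float!':   return is_float($value);
        case 'boolean!': return is_bool($value);
        case 'number':   return is_numeric($value);
        case 'scalar':   return is_scalar($value);
        case 'null':     return is_null($value);
        case 'callable': return is_callable($value, true);
        case 'mixed':    return true;
        default:
            throw new InvalidArgumentException("Type not covered by this sketch: $type");
    }
}

var_dump(dbcMatchesSimpleType('integer!', '3')); // bool(false)
var_dump(dbcMatchesSimpleType('number', '3.5')); // bool(true)
var_dump(dbcMatchesSimpleType('null', 0));       // bool(false)
```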
These conditions are checked at the beginning of a function or method, after arguments have been received, but before starting executing the function body.
Pre-conditions are expressed in two forms : argument types, and explicit assertions. Argument types are used first and explicit assertions supplement argument types with additional conditions (like conditions between arguments).
Argument types are checked before explicit assertions, meaning that explicit assertions can assume correct types.
When an optional argument is not set by the caller, its input (and possibly output) types are not checked. This makes it possible to set a default value that does not match the argument's declared input type.
Example :
    /**
     * ...
     * @param int $flag ...
     * ...
     */
    function myFunc(..., $flag=null)
    {
    if (is_null($flag)) {
        // Here, we are sure that the parameter was not set by the caller, as
        // a null value sent by the caller would be refused by DbC input check.
        ...
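In userland, the same rule can be approximated with func_num_args(), which tells whether the caller actually passed the optional argument. This is only a sketch: the function name `myFuncSketch` is hypothetical, exceptions stand in for fatal errors, and is_int() is used as a simplification of the 'int' matching rules.

```php
<?php
// Emulates "an optional argument that is not set by the caller is not checked".
// Hypothetical function; is_int() is stricter than the DbC 'int' type.
function myFuncSketch($value, $flag = null)
{
    // The type check only runs when the caller passed a second argument.
    if (func_num_args() >= 2 && !is_int($flag)) {
        throw new InvalidArgumentException(
            "DbC input type mismatch - \$flag should match 'int'"
        );
    }

    // A null $flag can therefore only come from the default value.
    return is_null($flag) ? 'default' : 'explicit';
}

var_dump(myFuncSketch(1));    // string(7) "default"
var_dump(myFuncSketch(1, 5)); // string(8) "explicit"
```

Calling `myFuncSketch(1, null)` throws, because an explicit null is checked against the declared type.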
These conditions supplement argument types for more complex conditions. They are executed in the function scope before executing the function's body.
Syntax :
    /**
     * ...
     * @requires <php-condition>
     * ...
where <php-condition> is a PHP expression whose evaluation returns true or false.
These assertions can appear anywhere in the phpdoc block. They are executed in the same order as they appear in the doc block.
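As an illustration of how an external tool could collect these assertions, the sketch below reads a docblock through reflection and extracts the '@requires' conditions in their order of appearance. The names `extractRequires` and `dbcDemoFunction` are hypothetical, and the regular expression assumes one condition per line.

```php
<?php
// Collect the @requires conditions of a docblock, in the order they appear.
// Hypothetical helper; assumes each condition fits on a single line.
function extractRequires(string $docComment): array
{
    preg_match_all(
        '/^[ \t]*\*[ \t]*@requires[ \t]+(.*\S)[ \t]*$/m',
        $docComment,
        $matches
    );

    return $matches[1];
}

/**
 * @param number $a Some value
 * @requires ($a >= 0)
 * @requires ($a <= 100)
 */
function dbcDemoFunction($a)
{
    return $a;
}

// getDocComment() returns false when no docblock is available.
$doc = (new ReflectionFunction('dbcDemoFunction'))->getDocComment();
print_r(extractRequires($doc === false ? '' : $doc));
```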
The DbC theory, in accordance with the LSP, states that a subclass can override pre-conditions only if it loosens them.
The logic we implement is in the spirit of the way PHP handles class constructors/destructors :
Post-conditions are checked at function's exit. Like pre-conditions, they are executed in the function scope.
They are generally used to check the returned type and value, and arguments returned by ref.
When a function exits because an exception was thrown, the function's post-conditions are not checked, but class constraints are checked.
Syntax:
* @return <compound-type> [free-text]
The syntax of <compound-type> is the same as argument types.
Examples:
    * @return resource|null

    // For a factory:
    * @return object(MyClass)
This is the return type & value of the arguments passed by reference.
Syntax:
* @param.out <compound-type> $<arg-name> [free-text]
Note that an argument passed by reference can have a '@param' line to define its input type and/or a '@param.out' line to define its output type. In the str_replace() example above, we don't define an input type for $count because it is undefined.
Syntax:
* @ensures <condition>
As with input assertions, <condition> is a PHP condition that will be executed in the function scope. The only addition is that the '$>' string will be replaced with the function's return value before evaluation.
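The '$>' substitution can be pictured with the following sketch, which replaces the placeholder with a real variable and evaluates the condition. Everything here is an assumption made for illustration: the function name `checkEnsures`, the stand-in variable `$__dbcReturn`, the use of eval(), and the way the function scope is passed as an array. The textual replacement is naive and would also rewrite a '$>' found inside a string literal.

```php
<?php
// Evaluate an @ensures condition after replacing '$>' with the return value.
// Hypothetical helper; $scope emulates the variables of the function scope.
function checkEnsures(string $condition, $returnValue, array $scope = []): bool
{
    // '$__dbcReturn' is an arbitrary stand-in name for '$>'.
    $code = str_replace('$>', '$__dbcReturn', $condition);
    $__dbcReturn = $returnValue;

    // Import the emulated function scope without overwriting local names.
    extract($scope, EXTR_SKIP);

    return (bool) eval('return ' . $code . ';');
}

var_dump(checkEnsures('($> >= 0)', 2.9)); // bool(true)
var_dump(checkEnsures(
    '(is_array($>)===is_array($subject))',
    'abc',
    ['subject' => ['x']]
)); // bool(false)
```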
As with pre-conditions, output types are checked before output assertions.
The inheritance rules are the same as the ones for pre-conditions.
Unlike the Eiffel or D implementations, parent post-conditions will be checked only if the child requires it using a '@ensures @parent' directive.
These constraints are called 'invariants' in the DbC literature. The idea is that properties must always satisfy a set of 'invariant' conditions.
Class constraints take two forms : property types and class assertions.
Each property type is defined in its own docblock, just before the definition of its property. Class assertions are defined in the class docblock (the block just before the class definition).
Note that we don't define a specific constraint type for static properties. They will be checked using the same syntax as dynamic properties.
Syntax:
/** @var <compound-type> [free-text] */
where <compound-type> follows the same syntax as argument types.
These are defined in class docblocks.
Syntax:
* @invariant <condition>
<condition> must use '$this->' to access dynamic properties and 'self::' to access static properties.
Property types are checked before class assertions.
This set of constraints is checked :
Class constraints are executed before pre-conditions and/or after post-conditions.
These constraints are executed in the class scope ('$this' and 'self' can be used).
The same mechanism is used as with pre/post-conditions. Parent constraints are checked only if explicitly called using '@invariant @parent'.
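To picture how class assertions behave at runtime, here is a userland emulation of the triangle class where the invariants are written as an ordinary private method. This is a sketch under several assumptions: the names `TriangleSketch` and `checkInvariants` are hypothetical, exceptions replace fatal errors, and the invariants are checked when the constructor returns and around the public method.

```php
<?php
// Userland emulation of '@invariant' directives. Hypothetical class;
// the exact points where invariants run are an assumption of this sketch.
final class TriangleSketch
{
    private $a;
    private $b;
    private $c;

    public function __construct($a, $b, $c)
    {
        $this->a = $a;
        $this->b = $b;
        $this->c = $c;
        $this->checkInvariants(); // checked when the constructor returns
    }

    public function area()
    {
        $this->checkInvariants(); // before the method body
        $h = ($this->a + $this->b + $this->c) / 2;
        $result = sqrt($h * ($h - $this->a) * ($h - $this->b) * ($h - $this->c));
        $this->checkInvariants(); // after the method body

        return $result;
    }

    private function checkInvariants(): void
    {
        $invariants = [
            '($this->a >= 0) && ($this->a <= ($this->b+$this->c))'
                => ($this->a >= 0) && ($this->a <= ($this->b + $this->c)),
            '($this->b >= 0) && ($this->b <= ($this->a+$this->c))'
                => ($this->b >= 0) && ($this->b <= ($this->a + $this->c)),
            '($this->c >= 0) && ($this->c <= ($this->b+$this->a))'
                => ($this->c >= 0) && ($this->c <= ($this->b + $this->a)),
        ];
        foreach ($invariants as $text => $holds) {
            if (!$holds) {
                throw new LogicException("DbC invariant violation ($text)");
            }
        }
    }
}
```

Here `new TriangleSketch(4, 2, 3)` succeeds, while `new TriangleSketch(10, 2, 3)` throws on the first invariant.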
When a function or method is called from a DbC condition, its constraints are not checked.
When a DbC condition fails, an E_ERROR is raised, containing the file and line number of the failing condition.
None
As the plan is to implement this in a separate extension, it should be available for PHP 5 and PHP 7.
None
None
None
None
A boolean whose name is still undefined.
When DbC is turned off, there's no change in PHP behavior.
Required majority ? To be defined.
This should be implemented in a Zend extension, not in the core. This would be a perfect addition for XDebug.