PHP RFC: Implementing Design by Contract

Introduction

For more than 10 years (since PHP 5 was released), the PHP core community has seen a lot of discussions about strict vs loose typing, type hinting and related features.

To summarize years of flame wars: some developers argue that strict typing and type hinting make source code cleaner and easier to debug, while others insist that these features must remain optional and compatible with 'basic' loosely typed PHP syntax. The debate generally dies in endless discussions about the concepts of 'number', 'int', 'float'...

With this RFC, we propose an alternative approach, already present in several languages, named 'design by contract' (reduced to 'DbC' in the rest of the document).

We won't detail the concept of DbC here, as we provide links in the reference section below. Just note that DbC is a way to define constraints on function arguments, return values, and class properties. The key point is that DbC checks are performed during the development/validation phase only. In production phase, DbC checks are turned off.

The most important points are:

  • DbC constraints can be extremely detailed as performance is not a problem.
  • DbC checks must not handle checks that must always run, even in production. Validating user input, for instance, must remain out of DbC constraints.
  • The DbC and 'Test Driven Development' concepts are closely related, as DbC heavily relies on the quality of test coverage.

Examples

First, an example of a function defining input, inline and output constraints ('$>' means 'return value'). This example is adapted from the D language.

//===========================================================================
/**
* Compute area of a triangle
*
* This function computes the area of a triangle using Heron's formula.
*
* @param float $a Length of 1st side
* @requires ($a >= 0)
* @param float $b Length of 2nd side
* @requires ($b >= 0)
* @param float $c Length of 3rd side
* @requires ($c >= 0)
* @requires ($a <= ($b+$c))
* @requires ($b <= ($a+$c))
* @requires ($c <= ($a+$b))
*
* @return float The triangle area
* @ensures ($> >= 0)
*/
 
function triangleArea($a, $b, $c)
{
    $halfPerimeter = ($a + $b + $c) / 2;

    // @assert ($halfPerimeter >= 0)

    return sqrt($halfPerimeter
        * ($halfPerimeter - $a)
        * ($halfPerimeter - $b)
        * ($halfPerimeter - $c));
}
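
For illustration only, here is a hand-written sketch of roughly what the checks above amount to when DbC is enabled. The function name triangleAreaChecked() and the use of assert() are assumptions made for this example; this RFC does not specify how the engine would generate or run the checks.

```php
<?php
// Hypothetical hand-expanded version of triangleArea(); not generated by any tool.
function triangleAreaChecked($a, $b, $c)
{
    // @param float: argument types are checked first (see the 'float' type).
    assert(is_numeric($a) && is_numeric($b) && is_numeric($c));

    // @requires lines, in the order they appear in the doc block.
    assert($a >= 0);
    assert($b >= 0);
    assert($c >= 0);
    assert($a <= ($b + $c));
    assert($b <= ($a + $c));
    assert($c <= ($a + $b));

    $halfPerimeter = ($a + $b + $c) / 2;

    // Inline @assert.
    assert($halfPerimeter >= 0);

    $result = sqrt($halfPerimeter
        * ($halfPerimeter - $a)
        * ($halfPerimeter - $b)
        * ($halfPerimeter - $c));

    // @return float is checked first, then @ensures ($> >= 0).
    assert(is_numeric($result));
    assert($result >= 0);

    return $result;
}
```

With DbC turned off, none of these checks would run and only the original function body would remain.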

Another example with a clone of str_replace() :

//===========================================================================
/**
* Replace all occurrences of the search string with the replacement string
*
* This function returns a string or an array with all occurrences of search
* in subject replaced with the given replace value.
*
* @param string|array(string) $search The value being searched for (aka needle)
* @param string|array(string) $replace The replacement value that replaces found search values
* @param string|array(string) $subject The string or array being searched and replaced on
* @param.out int $count The number of replacements performed
* @ensures ($count >= 0)
* @return string|array(string) A string or an array with the replaced values
*
* Ensure that returned value is the same type as input subject :
* @ensures (is_array($>)===is_array($subject))
*/
 
function str_replace($search, $replace, $subject, &$count=null)
{
...

Note that we didn't provide any constraint on $count input, as this parameter is used for output only.

Proposal

DbC typically defines three constraint types :

  • pre-conditions: checked when entering a function/method. Generally check that passed arguments are valid.
  • post-conditions: checked when a function/method exits. Used to check the return type/value and the returned type/value of arguments passed by reference.
  • class invariants: Constraints on class properties. In PHP, two subtypes exist: constraints on static properties and constraints on dynamic (instance) properties.

In this document, we propose a mechanism to implement these constraints in the PHP world.

Syntax

We propose to include the DbC directives in PHP comments. The function/method/class-related constraints will be included in phpdoc blocks (extending the phpdoc syntax), while inline assertions will be included in plain comments.

The benefits are :

  • As directives are exclusively contained in PHP comments, the source code remains executable on every past and future PHP interpreter (no compatibility break).
  • A lot of PHP code is already documented using phpdoc. So, unchanged code will already benefit from DbC.
  • phpDocumentor can take advantage of the extensions DbC brings to the phpdoc syntax to generate more detailed documentation. No BC break here, as phpDocumentor ignores unknown keywords.
  • PHP IDEs already use phpdoc blocks. So, it will be easy for them to understand DbC constraints.

Side effects

As DbC, by nature, can be turned on and off, DbC checks must not modify anything in the environment.

While enforcing this is partially possible in theory, this implementation will leave it to the developer's responsibility, as most languages do.

Pre-conditions

These conditions are checked at the beginning of a function or method, after arguments have been received, but before the function body starts executing.

Pre-conditions are expressed in two forms : argument types, and explicit assertions. Argument types are used first and explicit assertions supplement argument types with additional conditions (like conditions between arguments).

Argument types are checked before explicit assertions, meaning that explicit assertions can assume correct types.

Argument types

Argument type syntax is an extension and formalization of pre-existing phpdoc argument types. phpdoc accepts almost any string as an argument type. DbC gives these types a precise meaning, reusing the types commonly used in phpdoc blocks.

Argument types are not present in original DbC syntax (like Eiffel or D implementation). This is a PHP-specific addition to enhance simplicity and readability. Argument types are just shortcuts as they could be replaced by explicit assertions.

Readability is the key point here: just compare a type like 'string|array(string|integer)' with the PHP code needed to perform the same check!
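
To make the comparison concrete, here is a sketch of what a hand-written check for 'string|array(string|integer)' could look like, following the type definitions given in this document. Both function names are hypothetical and exist only for this example.

```php
<?php
// Hypothetical helper: the 'string' type as defined in this document
// (strings, numbers, and objects whose class defines __toString()).
function dbcMatchesString($v)
{
    return is_string($v) || is_int($v) || is_float($v)
        || (is_object($v) && method_exists($v, '__toString'));
}

// Hand-written equivalent of the type 'string|array(string|integer)'.
function matchesStringOrArrayOfStringOrInteger($arg)
{
    if (dbcMatchesString($arg)) {
        return true;
    }
    if (!is_array($arg)) {
        return false;
    }
    foreach ($arg as $element) {
        // 'integer' adds nothing here: every value it accepts (integers,
        // whole floats, whole numeric strings) is already accepted by 'string'.
        if (!dbcMatchesString($element)) {
            return false;
        }
    }
    return true;
}
```

The one-line type expresses the same rule as these two functions.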

Argument types are used to check :

  • arguments sent to a function
  • arguments passed by reference, as returned by the function
  • the function's return value
  • the type of class properties

Syntax

Argument types cannot contain whitespace.

Here is a pseudo-grammar of argument types :

phpdoc-line = "*", "@param", compound-type, $<argument-name> [, free-text]

compound-type = type, { "|", type }

type = "integer"
	| "float"
	| "string"
	| array-type
	| "callable"
	| object-type
	| resource-type
	| "null"
	| "scalar"
	| "mixed"
	| "boolean"

array-type = "array"
	| "array(", compound-type, ")"

object-type = "object"
	| "object(", class-name, ")"

resource-type = "resource"
	| "resource(", resource-name, ")"

Each type is detailed below.

DbC types vs zval types

Before detailing DbC types, here is a table showing the matches between zval types and DbC types:

(rows: DbC types, columns: zval types)

DbC type  IS_NULL  IS_LONG  IS_DOUBLE  IS_BOOL(1)  IS_ARRAY  IS_OBJECT  IS_STRING  IS_RESOURCE
integer   No       Yes      (2)        No          No        No         (3)        No
float     No       Yes      Yes        No          No        No         (4)        No
string    No       Yes      Yes        No          No        (6)        Yes        No
array     No       No       No         No          Yes       No         No         No
callable  No       No       No         No          (5)       No         (5)        No
object    No       No       No         No          No        Yes        No         No
resource  No       No       No         No          No        No         No         Yes
scalar    No       Yes      Yes        Yes         No        No         Yes        No
null      Yes      No       No         No          No        No         No         No
mixed     Yes      Yes      Yes        Yes         Yes       Yes        Yes        Yes
boolean   No       No       No         Yes         No        No         No         No

(1) IS_TRUE/IS_FALSE in PHP 7
(2) only if the fractional part is zero
(3) only if is_numeric(string) returns true and the fractional part is zero
(4) only if is_numeric(string) returns true
(5) only if is_callable(arg,true) returns true
(6) only if the class defines a __toString() method

Note that this is much more restrictive than PHP's native type juggling.

integer

An integer value, positive or negative.

Note: This type is NOT equivalent to is_int($arg), as is_int() only accepts the IS_LONG zval type.

Synonyms: 'int'
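
As a sketch of how the 'integer' row of the table above differs from is_int(), the check could be written as follows. The function name dbcMatchesInteger() is hypothetical, and the way whole numbers are detected (comparing against floor()) is an assumption of this example, not something the RFC prescribes.

```php
<?php
// Hypothetical helper implementing the 'integer' row of the table above.
function dbcMatchesInteger($v)
{
    if (is_int($v)) {
        return true;                                  // IS_LONG
    }
    if (is_float($v)) {
        return is_finite($v) && floor($v) === $v;     // note (2)
    }
    if (is_string($v) && is_numeric($v)) {
        $f = (float) $v;
        return is_finite($f) && floor($f) === $f;     // note (3)
    }
    return false;                                     // every other zval type
}
```

For instance, 12.0 and '12' match 'integer' under this sketch although is_int() rejects both.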

float

Any value for which is_numeric() returns true.

Equivalent to 'is_numeric($arg)'.

Synonyms: 'numeric', 'number'

string

An entity that can be represented by a string. Numeric values are accepted as strings, as well as objects whose class defines a __toString() method.

array

A PHP array.

Complements: Can be followed by a 'compound-type', enclosed in parentheses. This defines the acceptable types of the array values. This definition can be nested.

Examples:

* @param array $arr ...
* @param string|array(string) $... # Matches a string or an array of strings
* @param array(array(string|integer)) $... # A two-dimensional array containing only strings and integers

callable

A string or array for which 'is_callable($arg,true)' returns true.

Please consult the is_callable() documentation for more details.

object

An instance object.

Synonyms: 'obj'

Complements: Can be followed by a class name, enclosed in parentheses. Match will occur if the object is of this class or has this class as one of its parents (equivalent to is_a()).

Examples:

* @param object $arg
* @param object(Exception) $e
* @param object(MongoClient)|null $conn

resource

A PHP resource.

Synonyms: 'rsrc'

Complements: Can be optionally followed by a resource type. A resource type is a string provided when defining a resource via zend_register_list_destructors_ex(). As we don't support whitespace in argument types, any whitespace present in the original resource type must be replaced with an underscore character ('_').

The easiest way to display the string corresponding to a resource type is to display an existing resource using var_dump().
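
The type string can also be read programmatically with get_resource_type(). The sketch below derives the name to write inside 'resource(...)' by applying the underscore rule described above; the helper name dbcResourceTypeName() is hypothetical.

```php
<?php
// Hypothetical helper: builds the name to write inside 'resource(...)'
// for an existing resource, replacing whitespace with underscores.
function dbcResourceTypeName($resource)
{
    return preg_replace('/\s/', '_', get_resource_type($resource));
}

$handle = fopen('php://memory', 'r');
echo dbcResourceTypeName($handle), "\n";   // prints "stream"
fclose($handle);
```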

Examples:

* @param resource(OpenSSL_key) $...
* @param resource(pgsql_link) $...

scalar

Shortcut for 'integer|float|boolean|string'.

Equivalent to 'is_scalar()'.

null

This corresponds exactly to the IS_NULL zval type.

Equivalent to 'is_null($arg)'.

Note that a number with a 0 value does not match 'null'.

Synonyms: 'void' (mostly used for return type)

Examples:

* @param string|null $...
* @param resource(pgsql_link) $...
* @return null

mixed

Accepts any zval type & value (catch-all).

Synonyms: 'any'

boolean

A boolean value (true or false).

In PHP 7, IS_BOOL is replaced with IS_TRUE and IS_FALSE.

Equivalent to 'is_bool($arg)'.

Synonyms: 'bool'

Optional arguments

When an optional argument is not set by the caller, its input (and possibly output) types are not checked. This makes it possible to set a default value that does not match the argument's declared input type.

Example :

/**
* ...
* @param int $flag ...
* ...
*/
 
function myFunc(..., $flag=null)
{
if (is_null($flag)) {
	// Here, we are sure that the parameter was not set by the caller, as
	// a null value sent by the caller would be refused by DbC input check.
	...

Input assertions

These conditions supplement argument types for more complex conditions. They are executed in the function scope before executing the function's body.

Syntax :

/**
* ...
* @requires <php-condition>
* ...

where <php-condition> is a PHP expression whose evaluation returns true or false.

These assertions can appear anywhere in the phpdoc block. They are executed in the same order as they appear in the doc block.
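
As a sketch of how such conditions could be collected in doc block order, the example below reads a function's doc block through the Reflection API and extracts its '@requires' lines. The names dbcSample() and dbcExtractRequires() are hypothetical, and the regular expression is a simplification that assumes one condition per line.

```php
<?php
/**
 * Hypothetical sample function used only to demonstrate extraction.
 *
 * @param float $a Length of a side
 * @requires ($a >= 0)
 * @param float $b Length of another side
 * @requires ($a <= $b)
 */
function dbcSample($a, $b)
{
    return $b - $a;
}

// Hypothetical helper: returns the @requires conditions of a function,
// in the order they appear in its doc block.
function dbcExtractRequires($functionName)
{
    $doc = (new ReflectionFunction($functionName))->getDocComment();
    if ($doc === false) {
        return [];   // no doc block, so nothing to check
    }
    preg_match_all('/^\s*\*\s*@requires\s+(.+?)\s*$/m', $doc, $matches);
    return $matches[1];
}

print_r(dbcExtractRequires('dbcSample'));
```

Evaluating the extracted conditions in the function scope is a separate step that this sketch does not cover.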

Inheritance

The DbC theory, in accordance with the Liskov substitution principle (LSP), states that a subclass can override pre-conditions only if it loosens them.

The logic we implement is in the spirit of the way we manage class constructors/destructors :

  • Function pre-conditions are checked. If the function does not define any pre-condition, no check is performed, even if a parent's method defines some.
  • A special pre-condition is introduced. The '@parent' pre-condition causes the engine to check the parent method's pre-conditions. It is not an error if the parent method does not exist or defines no pre-condition: in that case, there is simply nothing to check.
  • The special '@parent' pre-condition can appear anywhere in the list.
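
The rules above can be sketched with a small model in which each class is described by an array. Everything here is hypothetical (the array layout, the class names, and dbcResolvePreconditions()); in particular, the sketch assumes that a '@parent' found in the parent's own list is expanded recursively, which this RFC does not state explicitly.

```php
<?php
// Hypothetical model: each class maps to its parent (or null) and to the
// ordered list of pre-conditions declared on one method.
function dbcResolvePreconditions(array $classes, $class)
{
    $resolved = [];
    foreach ($classes[$class]['requires'] as $condition) {
        if ($condition !== '@parent') {
            $resolved[] = $condition;
            continue;
        }
        $parent = $classes[$class]['parent'];
        if ($parent !== null) {
            // Expand the parent's list in place; a missing parent or an
            // empty parent list simply contributes nothing.
            $resolved = array_merge(
                $resolved,
                dbcResolvePreconditions($classes, $parent)
            );
        }
    }
    return $resolved;
}

$classes = [
    'Base'  => ['parent' => null,   'requires' => ['($n >= 0)']],
    'Child' => ['parent' => 'Base', 'requires' => ['is_int($n)', '@parent']],
    'Loose' => ['parent' => 'Base', 'requires' => []],
];

print_r(dbcResolvePreconditions($classes, 'Child'));
```

In this model, 'Loose' declares no pre-condition, so nothing is checked for it even though 'Base' defines one.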

Post-conditions

Post-conditions are checked when the function exits. Like pre-conditions, they are executed in the function scope.

They are generally used to check the returned type and value, and arguments returned by reference.

When a function exits because an exception was thrown, the function's post-conditions are not checked, but class constraints are checked.

Returned type

Syntax:

* @return <compound-type> [free-text]

The syntax of <compound-type> is the same as argument types.

Examples:

* @return resource|null

// For a factory:

* @return object(MyClass)

Argument return type

This is the return type & value of the arguments passed by reference.

Syntax:

* @param.out <compound-type> $<arg-name> [free-text]

Note that an argument passed by reference can have a '@param' line to define its input type, a '@param.out' line to define its output type, none of them, or both. In the str_replace() example above, we don't define an input type for $count because it is undefined.

Output assertions

Syntax:

* @ensures <condition>

As with input assertions, <condition> is a PHP condition that will be executed in the function scope. The only addition is that the '$>' string will be replaced with the function's return value before evaluation.

As with pre-conditions, output types are checked before output assertions.
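
A minimal sketch of the '$>' substitution follows. It is a userland approximation built on eval(), not a description of the proposed engine; the helper name dbcCheckEnsures() and the temporary variable $__dbc_return are assumptions of this example. The naive string replacement would also rewrite a '$>' appearing inside a string literal in the condition.

```php
<?php
// Hypothetical helper: evaluates one @ensures condition.
// $__dbc_scope holds the function's local variables, keyed by name.
function dbcCheckEnsures($__dbc_condition, array $__dbc_scope, $__dbc_return)
{
    // Recreate the function's variables without overwriting our own.
    extract($__dbc_scope, EXTR_SKIP);

    // Replace '$>' with the variable holding the return value, then
    // evaluate the resulting PHP expression in this scope.
    $__dbc_code = 'return ('
        . str_replace('$>', '$__dbc_return', $__dbc_condition)
        . ');';

    return (bool) eval($__dbc_code);
}

var_dump(dbcCheckEnsures('($> >= 0)', [], 6.0));
var_dump(dbcCheckEnsures(
    '(is_array($>)===is_array($subject))',
    ['subject' => 'abc'],
    'abd'
));
```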

Inheritance

The inheritance rules are the same as the ones for pre-conditions.

Unlike the Eiffel or D implementations, parent post-conditions will be checked only if the child requires it using an '@ensures @parent' directive.

Class constraints

These constraints are called 'invariants' in the DbC literature. The idea is that properties must always satisfy a set of 'invariant' conditions.

Class constraints take two forms : property types and class assertions.

Each property type is defined in its own docblock, just before the definition of its property. Class assertions are defined in the class docblock (the block just before the class definition).

Note that we don't define a specific constraint type for static properties. They will be checked using the same syntax as dynamic properties.

Property types

Syntax:

/** @var <compound-type> [free-text] */

where <compound-type> follows the same syntax as argument types.

Class assertions

These are defined in class docblocks.

Syntax:

* @invariant <condition>

<condition> must use '$this->' to access dynamic properties and 'self::' to access static properties.

Execution

Property types are checked before class assertions.

This set of constraints is checked :

  • after the execution of the constructor, if it exists.
  • before destroying the object, even if no destructor exists.
  • before and after execution of a public dynamic method.

Class constraints are executed before pre-conditions and/or after post-conditions.
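
The check points listed above can be pictured with the following sketch, in which a private method stands in for the check the engine would run automatically. The class DbcAccount and its checkInvariant() method are hypothetical, the property type check is simplified to is_int(), and the check before object destruction is left out.

```php
<?php
/**
 * Hypothetical account class, used only for illustration.
 *
 * @invariant ($this->balance >= 0)
 */
class DbcAccount
{
    /** @var integer Current balance */
    private $balance = 0;

    public function __construct($initial)
    {
        $this->balance = $initial;
        $this->checkInvariant();           // after the constructor
    }

    public function withdraw($amount)
    {
        $this->checkInvariant();           // before a public dynamic method
        $this->balance -= $amount;
        $this->checkInvariant();           // after a public dynamic method
    }

    // Stand-in for the check the engine would perform on its own.
    private function checkInvariant()
    {
        if (!is_int($this->balance)) {     // property types first (simplified)
            throw new LogicException('balance must be an integer');
        }
        if (!($this->balance >= 0)) {      // then class assertions
            throw new LogicException('invariant ($this->balance >= 0) violated');
        }
    }
}

$account = new DbcAccount(10);
$account->withdraw(4);                     // balance is now 6, invariant holds
```

A later call such as withdraw(100) would leave the balance negative, so the check after the method would fail.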

Scope

These constraints are executed in the class scope ('$this' and 'self' can be used).

Inheritance

The same mechanism is used as with pre/post-conditions. Parent constraints are checked only if explicitly called using '@invariant @parent'.

Inline assertions

Syntax

These assertions can appear anywhere PHP code is valid.

// @assert <condition>

Scope

<condition> is executed where the assertion appears.

Nested calls

When a function or method is called from a DbC condition, its constraints are not checked.

Constraint violations

When a DbC condition fails, an E_ERROR is raised, containing the file and line number of the failing condition.

Backward Incompatible Changes

None

Proposed PHP Version(s)

PHP 7. Backporting to PHP 5 is possible if implemented as a separate extension.

RFC Impact

To SAPIs

None

To Existing Extensions

<TODO> Compatibility with Xdebug and other debugging tools ?

To Opcache

None

New Constants

None

php.ini Defaults

dbc.enforce : boolean

  • php.ini-development value: true
  • php.ini-production value: false

Open Issues

Unaffected PHP Functionality

When DbC is turned off, there's no change in PHP behavior.

Future Scope

  • Implement static-only class constraints (to be called before and after executing a static or dynamic public method)
  • Checking exceptions: Using the '@throws' keyword, check that thrown exceptions correspond to declared types.
  • Extend type syntax (define a syntax for ranges, enums, etc)
  • Extend DbC to internal functions
  • Extend DbC to interfaces (internal and userland)

Proposed Voting Choices

Required majority ? To be defined.

Patches and Tests

Dmitry Stogov volunteered for implementation.

Not sure this should be implemented in the PHP core. A Zend extension would probably be better, if possible. An additional benefit, in this case, would be to make the feature available to both PHP 7 and PHP 5.

We only need 3 hooks :

  • When a script file is read, before any parsing starts (contents won't be modified)
  • When a function/method starts, after arguments are received, but before executing the function's body.
  • When a function returns, before the function scope is deleted (we need access to the function's variables, to the arguments, and to the return value).

I don't know yet if these hooks are available in the current PHP engine. If this is the case, this RFC is useless but getting people's opinion is always valuable.

Implementation

References

Rejected Features

rfc/dbc.1423274634.txt.gz · Last modified: 2017/09/22 13:28 (external edit)