

PHP RFC: Native Design by Contract support

Introduction

For more than 10 years (roughly since PHP 5 was released), the PHP core community has seen a lot of discussions about strict vs loose typing, type hinting and related features.

To summarize years of flame wars: some developers argue that strict typing and type hinting make their source code cleaner and easier to debug, while others insist that these features remain optional and compatible with 'basic' loosely typed PHP syntax. And the debate generally dies in endless discussions about the concepts of 'number', 'int', 'float'... :).

With this RFC, we propose an alternative approach, already present in several languages, named 'design by contract' (reduced to 'DbC' in the rest of the document).

We won't detail the concept of DbC here, as we provide links in the reference section below. Just note that DbC is a way to define constraints on function arguments, return values, and class properties. The key point is that DbC checks run only during the development and validation phases. In production, DbC checks are turned off.

So, the most important points are:

  • DbC constraints can be extremely detailed: since they are disabled in production, their runtime cost is not a concern.
  • DbC cannot be used for checks that must always run, even in production. Validating user input, for instance, must remain outside DbC constraints.
  • The DbC and 'Test Driven Development' concepts are closely related, as DbC only catches violations on code paths that are actually exercised, so it relies heavily on the quality of test coverage.

Example

First, an example of a function defining input, inline and output constraints ('$>' means 'return value'):

//===========================================================================
/**
* Compute area of a triangle
*
* This function computes the area of a triangle using Heron's formula.
*
* @param float $a Length of 1st side
* @assert.in ($a >= 0)
* @param float $b Length of 2nd side
* @assert.in ($b >= 0)
* @param float $c Length of 3rd side
* @assert.in ($c >= 0)
* @assert.in ($a <= ($b+$c))
* @assert.in ($b <= ($a+$c))
* @assert.in ($c <= ($a+$b))
*
* @return float The triangle area
* @assert.out ($> >= 0)
*/
 
function triangleArea($a, $b, $c)
{
    $halfPerimeter = ($a + $b + $c) / 2;

    // @assert ($halfPerimeter >= 0)

    return sqrt($halfPerimeter
        * ($halfPerimeter - $a)
        * ($halfPerimeter - $b)
        * ($halfPerimeter - $c));
}
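
To make the annotations above more concrete, here is a minimal sketch of what the same contract looks like when written by hand. The ContractException class (named later in this RFC but not defined) and the dbcCheck() helper are assumptions made for illustration only; they are not part of the proposal's API.

```php
<?php
// Hypothetical exception class: the RFC names it but does not define it.
class ContractException extends Exception {}

// Hypothetical helper: throws when a contract condition is false.
function dbcCheck($condition, $message)
{
    if (!$condition) {
        throw new ContractException($message);
    }
}

function triangleAreaChecked($a, $b, $c)
{
    // Pre-conditions: '@param float' types first, then '@assert.in' lines.
    foreach (['a' => $a, 'b' => $b, 'c' => $c] as $name => $side) {
        dbcCheck(is_numeric($side), "\$$name must match 'float'");
        dbcCheck($side >= 0, "\$$name must be >= 0");
    }
    dbcCheck($a <= $b + $c, '$a <= ($b+$c)');
    dbcCheck($b <= $a + $c, '$b <= ($a+$c)');
    dbcCheck($c <= $a + $b, '$c <= ($a+$b)');

    $halfPerimeter = ($a + $b + $c) / 2;

    // Inline assertion: '// @assert ($halfPerimeter >= 0)'
    dbcCheck($halfPerimeter >= 0, '$halfPerimeter >= 0');

    $result = sqrt($halfPerimeter
        * ($halfPerimeter - $a)
        * ($halfPerimeter - $b)
        * ($halfPerimeter - $c));

    // Post-conditions: '@return float' and '@assert.out ($> >= 0)'.
    dbcCheck(is_numeric($result), "return value must match 'float'");
    dbcCheck($result >= 0, 'return value must be >= 0');

    return $result;
}
```

With DbC, none of these checks would appear in the function body, and all of them would disappear when enforcement is turned off.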

Another example, with a clone of str_replace():

//===========================================================================
/**
* Replace all occurrences of the search string with the replacement string
*
* This function returns a string or an array with all occurrences of search
* in subject replaced with the given replace value.
*
* @param string|array(string) $search The value being searched for (aka needle)
* @param string|array(string) $replace The replacement value that replaces found search values
* @param string|array(string) $subject The string or array being searched and replaced on
* @param.out int $count The number of replacements performed
* @assert.out ($count >= 0)
* @return string|array(string) A string or an array with the replaced values
*
* Ensure that returned value is the same type as input subject :
* @assert.out (is_array($>)===is_array($subject))
*/
 
function str_replace($search, $replace, $subject, &$count=null)
{
...

Note that we didn't provide any constraint on $count input, as this parameter is used for output only.

Proposal

DbC typically defines three constraint types :

  • pre-conditions: checked when entering a function/method. Generally check that passed arguments are valid.
  • post-conditions: checked when a function/method exits. Used to check the return type/value and the returned type/value of arguments passed by reference.
  • class invariants: constraints on class properties. In PHP, two subtypes exist: constraints on static properties and constraints on dynamic (instance) properties.
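
As a rough illustration of these constraint types, the sketch below writes a pre-condition and an instance invariant by hand. The Account class, its methods, and ContractException as defined here are hypothetical examples; with DbC, these checks would be declared in comments rather than coded in each method.

```php
<?php
// Hypothetical exception class: the RFC names it but does not define it.
class ContractException extends Exception {}

class Account
{
    private $balance = 0;

    // Instance invariant: the balance never goes below zero.
    private function checkInvariant()
    {
        if (!($this->balance >= 0)) {
            throw new ContractException('Invariant failed: balance >= 0');
        }
    }

    public function deposit($amount)
    {
        // Pre-condition: checked when entering the method.
        if (!($amount > 0)) {
            throw new ContractException('Pre-condition failed: $amount > 0');
        }
        $this->balance += $amount;
        // Invariant: checked when the method exits.
        $this->checkInvariant();
    }

    public function withdraw($amount)
    {
        $this->balance -= $amount;
        $this->checkInvariant();
    }

    public function balance()
    {
        return $this->balance;
    }
}
```

Here withdraw() has no pre-condition of its own, so an overdraft is caught by the invariant check when the method exits.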

Syntax

We propose to include the DbC directives in PHP comments. The function/method/class-related constraints will be included in phpdoc blocks (extending the phpdoc syntax), while inline assertions will be included in plain comments.

The benefits are :

  • As directives are exclusively contained in PHP comments, the source code remains executable on every past and future PHP interpreter (no compatibility break).
  • phpdoc blocks already contain arguments and return types. DbC will use this information. So, unchanged code will already benefit from DbC.
  • phpDocumentor can take advantage of the extensions DbC brings to the phpdoc syntax to generate more detailed documentation. There is no BC break here: even with the current version of phpDocumentor, DbC-specific keywords are ignored and the documentation is generated correctly.
  • PHP IDEs already use phpdoc blocks. So, it will be easier for them to understand DbC constraints.

Pre-conditions

These conditions are checked at the beginning of a function or method, after arguments have been received, but before the function body starts executing.

The pre-conditions are expressed in two forms : argument types, and explicit assertions. Argument types are used first and explicit assertions supplement argument types with additional conditions (like conditions between arguments).

Argument types are checked before explicit assertions, meaning that explicit assertions can assume correct types.

Argument types

Argument type syntax is an extension and formalization of pre-existing phpdoc argument types. phpdoc accepts almost any string as an argument type. DbC gives these types a precise meaning, reusing the types commonly used in phpdoc blocks.

Argument types are not present in the original DbC syntax (as implemented in Eiffel or D). This is a PHP-specific addition to enhance simplicity and readability. Argument types are just shortcuts, as they could be replaced by explicit assertions.

Readability is the key point here: just compare a type like 'string|array(string|integer)' with the PHP code needed to perform the same check!
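
For comparison, here is a sketch of a hand-written check for 'string|array(string|integer)'. It is deliberately simplified: it uses the native is_string() and is_int() tests, which are stricter than the DbC matching rules described later in this document, and the function name is hypothetical.

```php
<?php
// Hand-written (simplified) equivalent of 'string|array(string|integer)'.
function matchesStringOrArrayOfStringOrInt($value)
{
    if (is_string($value)) {
        return true;
    }
    if (!is_array($value)) {
        return false;
    }
    // Every element of the array must itself be a string or an integer.
    foreach ($value as $element) {
        if (!is_string($element) && !is_int($element)) {
            return false;
        }
    }
    return true;
}

var_dump(matchesStringOrArrayOfStringOrInt('abc'));      // bool(true)
var_dump(matchesStringOrArrayOfStringOrInt(['a', 1]));   // bool(true)
var_dump(matchesStringOrArrayOfStringOrInt(['a', 1.5])); // bool(false)
```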

Argument types are used to check :

  • arguments sent to a function
  • arguments passed by reference, when the function returns
  • the function's return value

Syntax

Argument types cannot contain whitespace.

Here is a pseudo-grammar of argument types :

phpdoc-line = "*", "@param", compound-type, $<argument-name> [, free-text]

compound-type = type, { "|", type }

type = "integer"
	| "float"
	| "string"
	| array-type
	| "callable"
	| object-type
	| resource-type
	| "null"
	| "scalar"
	| "mixed"
	| "boolean"

array-type = "array"
	| "array(", compound-type, ")"

object-type = "object"
	| "object(", class-name, ")"

resource-type = "resource"
	| "resource(", resource-name, ")"

Each type is detailed below.
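
The pseudo-grammar above is small enough to validate with a short recursive-descent parser. The following is a sketch only: the function names are hypothetical, synonyms such as 'int' are not handled because they are not part of the grammar, and the set of characters allowed in class and resource names (letters, digits, underscore, backslash) is an assumption.

```php
<?php
// Returns true when $spec is a syntactically valid compound-type.
function dbcIsValidType($spec)
{
    $pos = 0;
    return dbcParseCompound($spec, $pos) && $pos === strlen($spec);
}

// compound-type = type, { "|", type }
function dbcParseCompound($s, &$pos)
{
    do {
        if (!dbcParseType($s, $pos)) {
            return false;
        }
        $more = ($pos < strlen($s) && $s[$pos] === '|');
        if ($more) {
            $pos++; // consume "|"
        }
    } while ($more);
    return true;
}

// type = keyword, optionally followed by "(" complement ")"
function dbcParseType($s, &$pos)
{
    $simple = ['integer', 'float', 'string', 'callable', 'null',
               'scalar', 'mixed', 'boolean'];
    $withComplement = ['array', 'object', 'resource'];

    if (!preg_match('/\G[a-z]+/', $s, $m, 0, $pos)) {
        return false;
    }
    $name = $m[0];
    $isSimple = in_array($name, $simple, true);
    if (!$isSimple && !in_array($name, $withComplement, true)) {
        return false;
    }
    $pos += strlen($name);
    if ($isSimple || $pos >= strlen($s) || $s[$pos] !== '(') {
        return true; // no complement
    }
    $pos++; // consume "("
    if ($name === 'array') {
        if (!dbcParseCompound($s, $pos)) {
            return false;
        }
    } else {
        // class-name or resource-name (assumed character set)
        if (!preg_match('/\G[A-Za-z_\\\\][A-Za-z0-9_\\\\]*/', $s, $m, 0, $pos)) {
            return false;
        }
        $pos += strlen($m[0]);
    }
    if ($pos >= strlen($s) || $s[$pos] !== ')') {
        return false;
    }
    $pos++; // consume ")"
    return true;
}
```

For instance, dbcIsValidType('string|array(string|integer)') returns true, while 'string | null' is rejected because of the embedded whitespace.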

DbC types vs zval types

Before detailing DbC types, here is a table showing the matches between zval types and DbC types:

           Zval type
DbC type   IS_NULL  IS_LONG  IS_DOUBLE  IS_BOOL (1)  IS_ARRAY  IS_OBJECT  IS_STRING  IS_RESOURCE
integer    No       Yes      (2)        No           No        No         (3)        No
float      No       Yes      Yes        No           No        No         (4)        No
string     No       Yes      Yes        No           No        (6)        Yes        No
array      No       No       No         No           Yes       No         No         No
callable   No       No       No         No           (5)       No         (5)        No
object     No       No       No         No           No        Yes        No         No
resource   No       No       No         No           No        No         No         Yes
scalar     No       Yes      Yes        Yes          No        No         Yes        No
null       Yes      No       No         No           No        No         No         No
mixed      Yes      Yes      Yes        Yes          Yes       Yes        Yes        Yes
boolean    No       No       No         Yes          No        No         No         No

(1) IS_TRUE/IS_FALSE in PHP 7
(2) only if the fractional part is zero
(3) only if is_numeric() returns true for the string and the fractional part is zero
(4) only if is_numeric() returns true for the string
(5) See below for the conditions to match 'callable'
(6) only if the class defines a __toString() method

Note that this is much more restrictive than PHP's native type juggling.
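
The 'integer', 'float' and 'string' rows of the table can be sketched as plain PHP predicates. The function names are hypothetical, and rejecting infinite values for 'integer' is an assumption that the table does not address.

```php
<?php
// 'integer': IS_LONG, or a float / numeric string whose fractional part is zero.
function dbcIsInteger($value)
{
    if (is_int($value)) {
        return true;
    }
    if (is_float($value) || (is_string($value) && is_numeric($value))) {
        $number = (float) $value;
        // Assumption: INF and NAN are not accepted as integers.
        return is_finite($number) && floor($number) === $number;
    }
    return false;
}

// 'float': anything accepted by is_numeric().
function dbcIsFloat($value)
{
    return is_numeric($value);
}

// 'string': strings, numbers, and objects defining __toString().
function dbcIsString($value)
{
    return is_string($value)
        || is_int($value)
        || is_float($value)
        || (is_object($value) && method_exists($value, '__toString'));
}

var_dump(dbcIsInteger(3.0));   // bool(true)
var_dump(dbcIsInteger('4.2')); // bool(false)
var_dump(dbcIsFloat(true));    // bool(false)
var_dump(dbcIsString(1.5));    // bool(true)
```

Native type juggling would, for example, silently convert true to 1, whereas none of these predicates accepts a boolean.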

integer

An integer value, positive or negative.

Note: This type is NOT equivalent to is_int($arg), as is_int() only accepts the IS_LONG zval type.

Synonyms: 'int'

float

Any value that returns true through is_numeric().

Equivalent to 'is_numeric($arg)'.

Synonyms: 'numeric', 'num'

string

An entity that can be represented by a string. Numeric values are accepted as strings, as are objects whose class defines a __toString() method.

array

A PHP array.

Synonyms: 'arr'

Complements: Can be followed by a 'compound-type', enclosed in parentheses. This defines the acceptable types of the array values. This definition can be nested.

Examples:

* @param array $arr ...
* @param string|array(string) $... # Matches a string or an array of strings
* @param array(array(string|integer)) $... # A two-dimensional array containing only strings and integers

callable

A string or array considered as 'callable'. For more information, refer to the documentation of the 'is_callable()' PHP function.

Equivalent to 'is_callable($arg,true)'.
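
Because the second argument of is_callable() requests a syntax-only check, this type validates the form of the value, not the existence of the target. A small sketch (the wrapper name is hypothetical):

```php
<?php
// Hypothetical helper mirroring the rule above: syntax-only callable check.
function dbcIsCallable($value)
{
    return is_callable($value, true);
}

var_dump(dbcIsCallable('strlen'));                  // bool(true)
var_dump(dbcIsCallable('no_such_function'));        // bool(true): only the form is checked
var_dump(dbcIsCallable(['NoSuchClass', 'method'])); // bool(true)
var_dump(dbcIsCallable([1, 2]));                    // bool(false)
var_dump(dbcIsCallable(42));                        // bool(false)
```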

object

An instance object.

Synonyms: 'obj'

Complements: Can be followed by a class name, enclosed in parentheses. Match will occur if the object is of this class or has this class as one of its parents (equivalent to is_a()).

Examples:

* @param object $arg
* @param object(Exception) $e
* @param object(MongoClient)|null $conn
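
A minimal sketch of the corresponding check, assuming a hypothetical helper name; the optional class name is tested with is_a(), as described above.

```php
<?php
// Matches 'object' when $class is null, 'object(<class>)' otherwise.
function dbcIsObject($value, $class = null)
{
    if (!is_object($value)) {
        return false;
    }
    return $class === null || is_a($value, $class);
}

var_dump(dbcIsObject(new RuntimeException('x'), 'Exception')); // bool(true): parent class
var_dump(dbcIsObject(new stdClass(), 'Exception'));            // bool(false)
var_dump(dbcIsObject('Exception', 'Exception'));               // bool(false): not an object
```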

resource

A PHP resource.

Synonyms: 'rsrc'

Complements: Can optionally be followed by a resource type. A resource type is a string provided when defining a resource via zend_register_list_destructors_ex(). As we don't support whitespace in argument types, any whitespace present in the original resource type must be replaced with an underscore character ('_').

The easiest way to display the string corresponding to a resource type is to display an existing resource using var_dump().

Examples:

* @param resource(OpenSSL_key) $...
* @param resource(pgsql_link) $...

scalar
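
The resource check, including the whitespace-to-underscore rule, might look like the sketch below. The helper name is hypothetical; get_resource_type() returns the same string that var_dump() displays for a resource.

```php
<?php
// Matches 'resource' when $type is null, 'resource(<type>)' otherwise.
function dbcIsResource($value, $type = null)
{
    if (!is_resource($value)) {
        return false;
    }
    if ($type === null) {
        return true;
    }
    // Whitespace in the native type name is written as '_' in DbC types.
    $normalized = preg_replace('/\s/', '_', get_resource_type($value));
    return $normalized === $type;
}

$fp = fopen('php://memory', 'r+');
var_dump(dbcIsResource($fp));              // bool(true)
var_dump(dbcIsResource($fp, 'stream'));    // bool(true)
var_dump(dbcIsResource($fp, 'pgsql_link')); // bool(false)
fclose($fp);
```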

Shortcut for 'integer|float|boolean|string'.

Equivalent to 'is_scalar()'.

null

This corresponds exactly to the IS_NULL zval type.

Equivalent to 'is_null($arg)'.

Note that a number with a 0 value does not match 'null'.

Synonyms: 'void' (mostly used for return type)

Examples:

* @param string|null $...
* @param resource(pgsql_link) $...
* @return null

mixed

Accepts every zval type & value (catch-all).

Synonyms: 'any'

Complements: None

boolean

A boolean value (true or false).

In PHP 7, IS_BOOL is replaced with IS_TRUE and IS_FALSE.

Equivalent to 'is_bool($arg)'.

Synonyms: 'bool'

Complements: None

Optional arguments

When an optional argument is not set by the caller, its input/output types are not checked. This makes it possible to set a default value that does not match the argument's declared input type.

Example :

/**
* ...
* @param int $flag ...
* ...
*/
 
function myFunc(..., $flag=null)
{
if (is_null($flag)) {
	// Here, we are sure that the parameter was not set by the caller, as
	// a null value sent by the caller would be refused by DbC input check.
	...
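
One way to picture this rule is a hand-written check that only runs when the caller actually passed the argument. This is a sketch under stated assumptions: the function setFlag() and the ContractException class are hypothetical, and is_int() stands in for the broader DbC 'int' matching rule.

```php
<?php
// Hypothetical exception class: the RFC names it but does not define it.
class ContractException extends Exception {}

function setFlag($value, $flag = null)
{
    // Check $flag only when the caller supplied a second argument.
    if (func_num_args() >= 2 && !is_int($flag)) {
        throw new ContractException("\$flag must match 'int'");
    }

    if (is_null($flag)) {
        // Reached only when the argument was omitted (while checks are on).
        return 'default';
    }
    return 'flag=' . $flag;
}

var_dump(setFlag('x'));    // string(7) "default"
var_dump(setFlag('x', 3)); // string(6) "flag=3"
```

An explicit setFlag('x', null) throws, which is what lets the function body treat null as "not set".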

Input assertions

line = "*", "@assert.in", php-condition

Inheritance

Post-conditions

Returned type

line = "*", "@return", compound-type [, free-text]

Argument return type

This is the return type & value of the arguments passed by reference.

@param.out <compound-type> <name> <free-text>

Output assertions

@assert.out

Inheritance

Class-wide constraints

Static constraints

Syntax

@assert.static

Execution
Scope
Inheritance

Instance constraints

@assert.instance

Execution
Scope
Inheritance

Inline assertions

Syntax

// @assert <condition>

Scope

Exception thrown

ContractException

Backward Incompatible Changes

None

Proposed PHP Version(s)

PHP 7

RFC Impact

To SAPIs

None

To Existing Extensions

Compatibility with Xdebug ? <TODO>

To Opcache

<TODO>

New Constants

None

php.ini Defaults

dbc.enforce : boolean

  • php.ini-development value: true
  • php.ini-production value: false

Open Issues

Unaffected PHP Functionality

When DbC is off, there's no change in PHP behavior.

Future Scope

  • Exceptions: Using the '@throws' keyword, we can raise an error if the function throws an exception of an undeclared class.
  • Enforce phpdoc's @var property types. Static and dynamic property checks would be added to invariant checks.
  • Extend type syntax (define a syntax for ranges, enums, etc)

Proposed Voting Choices

Required majority ?

Patches and Tests

Dmitry Stogov volunteered for implementation.

It is not clear that this should be implemented in the PHP core. A Zend extension would probably be better, if possible. An additional benefit, in this case, would be to make the feature available for both PHP 7 and PHP 5.

We only need 3 hooks :

  • When a script file is read, before any parsing starts
  • When a function/method starts, after arguments are received, but before executing the function's body.
  • When a function returns, before the function scope is deleted. And we need access to the return value.
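
The second and third hooks can be emulated in userland with a wrapper, which may help to picture what the engine would do. This is only a sketch: withContract(), its parameters and the ContractException class are hypothetical, the $enforce flag stands in for the dbc.enforce setting, and the first hook (reading constraints from comments before parsing) is not emulated.

```php
<?php
// Hypothetical exception class: the RFC names it but does not define it.
class ContractException extends Exception {}

// Wraps $fn with an entry check ($pre) and an exit check ($post).
function withContract(callable $fn, callable $pre, callable $post, $enforce = true)
{
    return function (...$args) use ($fn, $pre, $post, $enforce) {
        // Entry hook: arguments received, body not yet executed.
        if ($enforce && !$pre(...$args)) {
            throw new ContractException('Pre-condition failed');
        }
        $result = $fn(...$args);
        // Exit hook: the return value is available to the check.
        if ($enforce && !$post($result, ...$args)) {
            throw new ContractException('Post-condition failed');
        }
        return $result;
    };
}

$checkedSqrt = withContract(
    'sqrt',
    function ($x) { return is_numeric($x) && $x >= 0; },
    function ($result, $x) { return $result >= 0; }
);

var_dump($checkedSqrt(9)); // float(3)
```

With $enforce set to false, the wrapper calls the function directly and no check runs, mirroring the production setting.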

I don't know yet whether these hooks are available in the current engine. If the implementation does not require any change to the PHP engine, this RFC is unnecessary, but getting people's opinions is still valuable.

Implementation

<TODO>

References

Rejected Features

rfc/dbc.1423245698.txt.gz · Last modified: 2017/09/22 13:28 (external edit)