rfc:pure-intersection-types

PHP RFC: Pure intersection types

Introduction

An “intersection type” requires a value to satisfy multiple type constraints instead of a single one.

Intersection types are currently not supported natively by the language. Instead, one must either use phpdoc annotations, and/or abuse typed properties [1] as can be seen in the following example:

class Test {
    private ?Traversable $traversable = null;
    private ?Countable $countable = null;
    /** @var Traversable&Countable */
    private $both = null;
 
    public function __construct($countableIterator) {
        $this->traversable =& $this->both;
        $this->countable =& $this->both;
        $this->both = $countableIterator;
    }
}

Supporting intersection types in the language allows us to move more type information from phpdoc into function signatures, with the usual advantages this brings:

  • Types are actually enforced, so mistakes can be caught early.
  • Because they are enforced, type information is less likely to become outdated or miss edge-cases.
  • Types are checked during inheritance, enforcing the Liskov Substitution Principle.
  • Types are available through Reflection.
  • The syntax is a lot less boilerplate-y than phpdoc.

Motivation

It is possible to emulate intersection types by creating a new interface which inherits from multiple ones, one such case is the built in SeekableIterator which extends the Iterator interface by adding a seek() method on it. However, an iterator can also be countable, an if a function needs to type against such a requirement the only possible way is to currently create a new interface:

interface CountableIterator extends Iterator, Countable {}

This works, but what if we want an iterator that is countable and seekable? We need to create another interface:

interface SeekableCountableIterator extends CountableIterator, SeekableIterator {}

As such, each new requirement necessitates the creation of various new interfaces taking into account all possible combinations.

Moreover, the class needs to implement the specific interface and cannot rely on just implementing the base interfaces, meaning the introduction of such interfaces need to be propagated to all relevant classes, something which can be error prone. See this non-example:

interface A {}
interface B {}
interface AB extends A, B {}
 
class Test implements A, B {}
 
function foo(AB $v) {
    var_dump($v);
}
 
foo(new Test());

Intersection types solve these issues.

Proposal

Add support for pure intersection types are specified using the syntax T1&T2&... and can be used in all positions where types are currently accepted:

class A {
    private Traversable&Countable $countableIterator;
 
    public function setIterator(Traversable&Countable $countableIterator): void {
        $this->countableIterator = $countableIterator;
    }
 
    public function getIterator(): Traversable&Countable {
        return $this->countableIterator;
    }
}

This means it would not be possible to mix intersection and union types together such as A&B|C, this is left as a future scope.

Supported types

Only class types (interfaces and class names) are supported by intersection types.

The rationale is that for nearly all standard types using them in an intersection type result in a type which can never be satisfied (e.g. int&string).

Usage of mixed in an intersection type is redundant as mixed&T corresponds to T, as such this is disallowed.

Similarly using iterable in an intersection results in a redundant invalid type, this can be seen by expanding the type expression iterable&T = (array|Traversable)&T = (array&T) | (Traversable&T) = Traversable&T

Although an intersection with callable can make sense (e.g. string&callable), we think it is unwise and points to a bug.

Similarly parent, self, and static are technically feasible and could be used as part of an intersection, but impose strange restrictions on a child class which the base class violates or the base class already satisfies the type requirements in which case it is redundant. Therefore those 3 types are also forbidden because they likely point to a design issue.

Duplicate and redundant types

To catch some simple bugs in intersection type declarations, redundant types that can be detected without performing class loading will result in a compile-time error. This includes:

  • Each name-resolved type may only occur once. Types like A&B&A result in an error.

This does not guarantee that the type is “minimal”, because doing so would require loading all used class types.

For example, if A and B are runtime class aliases, then A&B remains a legal intersection type, even though it could be reduced to either A or B. Similarly, if class B extends A {}, then A&B is also a legal intersection type, even though it could be reduced to just B.

function foo(): A&A {} // Disallowed
 
use A as B;
function foo(): A&B {} // Disallowed ("use" is part of name resolution)
 
class_alias('X', 'Y');
function foo(): X&Y {} // Allowed (redundancy is only known at runtime)

Type grammar

Due to a parser ambiguity with the declaration of by-ref parameter while using the current LR(1) parser, the grammar and lexer are modified to create different tokens for the & character depending if it is followed by a (variadic) variable or not.

The grammar thus looks as following:

type_expr:
        type    
    |   '?' type    
    |   union_type
    |   intersection_type
;


intersection_type:
        type T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG type
    |   intersection_type T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG type
;

Variance

Intersection types follow standard PHP variance rules that are already used for inheritance and type checking:

  • Return types are covariant (child must be subtype).
  • Parameter types are contravariant (child must be supertype).
  • Property types are invariant (child must be subtype and supertype).

The only change is in how intersection types interact with subtyping, with two additional rules:

  • A is a subtype of B_1&...&B_n if for all B_i, A is a subtype of B_i
  • A_1&...&A_n is a subtype of B if there exists an A_i such that A_i is a subtype of B

In the following, some examples of what is allowed and what isn't are given.

Property types

Property types are invariant, which means that types must stay the same during inheritance. However, the “same” type may be expressed in different ways.

Intersection types expand the possibilities in this area: For example A&B and B&A represent the same type. The following example shows a more complex case:

class A {}
class B extends A {}
 
class Test {
    public A&B $prop;
}
class Test2 extends Test {
    public B $prop;
}

In this example, the intersection A&B actually represents the same type as just B, and this inheritance is legal, despite the type not being syntactically the same.

Formally, we arrive at this result as follows: First, the parent type A&B is a subtype of B. Second, B is a subtype of A&B, because B is a subtype of A and B is a subtype of B.

Adding and removing intersection types

It is legal to add intersection types in return position and remove intersection types in parameter position:

class A {}
interface X {}
 
class Test {
    public function param1(A $param) {}
    public function param2(A&X $param) {}
 
    public function return1(): A&X {}
    public function return2(): A {}
}
 
class Test2 extends Test {
    public function param1(A&X $param) {}   // FORBIDDEN: Adding extra param type constraint
    public function param2(A $param) {}     // Allowed: Removing param type constraint
 
    public function return1(): A {}         // FORBIDDEN: Removing return type constraint
    public function return2(): A&X {}       // Allowed: Adding extra return type constraint
}

Variance of individual intersection members

Similarly, it is possible to restrict an intersection member in return position, or widen an intersection member in parameter position:

class A {}
class B extends A {}
interface X {}
 
class Test {
    public function param1(B&X $param) {}
    public function param2(A&X $param) {}
 
    public function return1(): A&X {}
    public function return2(): B&X {}
}
 
class Test2 extends Test {
    public function param1(A&X $param) {} // Allowed: Widening intersection member B -> A
    public function param2(B&X $param) {} // FORBIDDEN: Restricting intersection member A -> B
 
    public function return1(): B&X {}     // Allowed: Restricting intersection member A -> B
    public function return2(): A&X {}     // FORBIDDEN: Widening intersection member B -> A
}

Of course, the same can also be done with multiple intersection members at a time, and be combined with the addition/removal of types mentioned previously.

Variance of intersection type to concrete class type

As the primary use of intersection types is to ensure multiple interfaces are implemented, a concrete class or interface which implements all the interfaces present in the intersection is considered a subtype and thus can be used where co-variance is allowed.

interface X {}
interface Y {}
 
class TestOne implements X, Y {}
 
interface A
{
    public function foo(): X&Y;
}
 
 
interface B extends A
{
    public function foo(): TestOne;
}

Moreover, it is possible to use a union type of concrete classes/interface when each of the member of the union implement all of the interfaces in the intersection.

class TestTwo implements X, Y {}
 
interface C extends A
{
    public function foo(X&Y $param): TestOne|TestTwo;
}

The reason why this is possible is that a union of concrete classes/interfaces is less general then the set of possible classes which satisfy the intersection type.

Coercive typing mode

As standard types are not allowed in pure intersection types, no consideration for the coercive typing mode needs to done.

Property types and references

References to typed properties with intersection types follow the semantics outlined in the typed properties RFC:

If typed properties are part of the reference set, then the value is checked against each property type. If a type check fails, a TypeError is generated and the value of the reference remains unchanged.
interface X {}
interface Y {}
interface Z {}
 
class A implements X, Y, Z {}
class B implements X, Y {}
 
class Test {
    public X&Y $y;
    public X&Z $z;
}
$test = new Test;
$r = new A;
$test->y =& $r;
$test->z =& $r;
 
// Reference set: { $r, $test->y, $test->z }
// Types: { A, X&Y, X&Z }
 
$r = new B;
// TypeError: Cannot assign B to reference held by property Test::$z of type X&Z

Reflection

To support intersection types, a new class ReflectionIntersectionType is added:

class ReflectionIntersectionType extends ReflectionType {
    /** @return ReflectionType[] */
    public function getTypes();
 
    /* Inherited from ReflectionType */
    /** @return bool */
    public function allowsNull();
 
    /* Inherited from ReflectionType */
    /** @return string */
    public function __toString();
}

The getTypes() method returns an array of ReflectionTypes that are part of the intersection. The types may be returned in an arbitrary order that does not match the original type declaration. The types may also be subject to equivalence transformations.

For example, the type X&Y may return types in the order ["Y", "X"] instead. The only requirement on the Reflection API is that the ultimately represented type is equivalent.

The __toString() method returns a string representation of the type that constitutes a valid code representation of the type in a non-namespaced context. It is not necessarily the same as what was used in the original code.

Examples

// This is one possible output, getTypes() and __toString() could
// also provide the types in the reverse order instead.
function test(): A&B {}
$rt = (new ReflectionFunction('test'))->getReturnType();
var_dump(get_class($rt));    // "ReflectionIntersectionType"
var_dump($rt->allowsNull()); // false
var_dump($rt->getTypes());   // [ReflectionType("A"), ReflectionType("B")]
var_dump((string) $rt);      // "A&B"
 
function test2(): A&B&C {}
$rt = (new ReflectionFunction('test2'))->getReturnType();
var_dump(get_class($rt));    // "ReflectionIntersectionType"
var_dump($rt->allowsNull()); // false
var_dump($rt->getTypes());   // [ReflectionType("A"), ReflectionType("B"),
                             //  ReflectionType("C")]
var_dump((string) $rt); // "A&B&C"

Backward Incompatible Changes

This RFC does not contain any backwards incompatible changes.

However, existing ReflectionType based code might need to be adjusted in order to support processing of code that uses intersection types.

Proposed PHP Version

Next minor version, i.e. PHP 8.1.

Future Scope

The features discussed in the following are not part of this proposal.

Composite types (i.e. mixing union and intersection types)

While early prototyping [2] shows that supporting A&B|C without any grouping looks feasible, there are still many other considerations (e.g. Reflection), but namely the variance rules and checks, which would be dramatically increased and prone to error.

There is also the opinion that composite types should not rely on precedence of unions but be explicitly grouped together.

As such we consider a stepped approach by only allowing pure intersection first the best way forward.

Type Aliases

As types become increasingly complex, it may be worthwhile to allow reusing type declarations. There are two general ways in which this could work. One is a local alias, such as:

use Traversable&Countable as CountableIterator;
 
function foo(CountableIterator $x) {}

In this case CountableIterator is a symbol that is only visible locally and will be resolved to the original Traversable&Countable type during compilation.

The second possibility is an exported typedef:

namespace Foo;
type CountableIterator = Traversable&Countable;
 
// Usable as \Foo\CountableIterator from elsewhere

It should be noted that inclusion of this proposal will add extra considerations for type aliases as it would be possible to write composite types as if grouping was supported. However, the groundwork for supporting this is present in this proposal.

Proposed Voting Choices

As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted.

Add pure intersection types to PHP
Real name Yes No
alcaeus (alcaeus)  
alec (alec)  
asgrim (asgrim)  
ashnazg (ashnazg)  
bmajdak (bmajdak)  
bwoebi (bwoebi)  
carusogabriel (carusogabriel)  
colinodell (colinodell)  
crell (crell)  
cschneid (cschneid)  
didou (didou)  
dmitry (dmitry)  
galvao (galvao)  
girgias (girgias)  
jasny (jasny)  
kalle (kalle)  
kguest (kguest)  
kocsismate (kocsismate)  
lcobucci (lcobucci)  
levim (levim)  
mcmic (mcmic)  
ocramius (ocramius)  
pollita (pollita)  
ramsey (ramsey)  
reywob (reywob)  
santiagolizardo (santiagolizardo)  
seld (seld)  
sergey (sergey)  
sirsnyder (sirsnyder)  
tandre (tandre)  
trowski (trowski)  
weierophinney (weierophinney)  
wyrihaximus (wyrihaximus)  
Final result: 30 3
This poll has been closed.

Implementation

Acknowledgements

To Ilija Tovilo for resolving the parser conflict with by-ref parameters.

To Nikita Popov for reviewing and refactoring the variance code.

References

[1]: Slide 14 of Nikita Popov's talk “Typed Properties and more: What's coming in PHP 7.4?” https://image.slidesharecdn.com/presentationnikita-190519190251/95/typed-properties-and-more-whats-coming-in-php-74-14-638.jpg?cb=1558292620
[2]: Git PR with basic prototype for mixing intersection and union types https://github.com/Girgias/php-src/pull/8

rfc/pure-intersection-types.txt · Last modified: 2021/07/05 12:17 by girgias