PHP RFC: Pattern Matching

Introduction

This RFC introduces the beginning of a pattern matching syntax for PHP. To keep the initial implementation simple and reduce bikeshedding, it does not cover every possible pattern type, but it does lay out the mechanism by which pattern matching operates. The Future Scope section includes recommendations for continued improvement in future RFCs.

Pattern Matching as a language concept contains two parts: Matching a variable against a potentially complex data structure pattern, and optionally extracting values out of that variable into their own variables. In a sense it serves a similar purpose for complex data structures as regular expressions do for strings. When properly applied, it can lead to very compact but still readable code, especially when combined with conditional structures such as match().

Pattern matching is found in a number of languages, including Haskell, C#, ML, Rust, and Swift, among others. The syntax offered here is inspired primarily by C#, but is not intended as a direct port.

This RFC is part of the Enumerations Epic. It is a stepping stone toward full Enumerations but stands on its own as useful functionality.

Proposal

This RFC introduces a new keyword and binary operator: is. The is keyword indicates that its right hand side is a pattern against which its left hand side should be applied. The is operator is technically a comparison operator, and always returns a boolean true or false.

if($var is <pattern>) {

}

The left-hand side of is will be evaluated first until it is reduced to a single value (which could be an arbitrarily complex object or array). That value will then be compared to the pattern, and true or false returned.

While patterns may resemble other language constructs, whatever follows the is keyword is a pattern, not some other instruction.

is may be used in any context in which a boolean result is permissible. That includes variable assignment, if conditions, while conditions, match() expressions, etc.
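Since is does not exist yet, the closest runnable approximation uses today's type-checking functions. The sketch below shows the same two ideas in PHP 8.0: the left-hand side is reduced to a single value before it is tested, and the boolean result can drive a loop or be assigned. The next_token() helper and the sample queue are hypothetical, made up for illustration only.

```php
<?php
// Hypothetical stand-in for an arbitrary left-hand expression.
function next_token(array &$queue): mixed
{
    return array_shift($queue);
}

$queue = ['a', 'b', 3, 'c'];
$strings = [];

// Loop condition: roughly `while (next_token($queue) is string)`, except that
// today the value has to be captured in a variable before it can be tested.
while (is_string($token = next_token($queue))) {
    $strings[] = $token;
}

// Assignment: roughly `$stoppedOnInt = $token is int;`
$stoppedOnInt = is_int($token);
```

The loop stops at the first non-string value, which stays available in $token for the later check.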

Supported patterns

Type patterns

A pattern may be a type signature, including both class and primitive types as well as compound types. In this case, is will match the left hand side value against the specified type. That is, the following are all legal:

$foo is string;    // Equivalent to is_string($foo)
$foo is int|float; // Equivalent to is_int($foo) || is_float($foo)
$foo is Request;   // Equivalent to $foo instanceof Request
$foo is User|int;  // Equivalent to $foo instanceof User || is_int($foo)
$foo is ?array;    // Equivalent to is_array($foo) || is_null($foo)

A type match may be any syntax supported by a parameter type; in a sense, $foo is pattern is equivalent to “would $foo pass a type check if passed to a parameter with this type specification.” As more complex type checks become allowed (such as intersection types, type aliases, etc.) they will become valid in a pattern as well.
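As a rough illustration of these semantics, the following userland sketch emulates a type pattern for simple cases in PHP 8.0. The matches_type() function is hypothetical and is not part of this RFC; it assumes the pattern is given as a string, and it handles only a handful of primitive names, class names, unions, and the ?T shorthand. A real implementation would live in the engine and reuse the existing parameter type check.

```php
<?php
// Hypothetical helper: emulates `$value is <type pattern>` for simple unions.
function matches_type(mixed $value, string $pattern): bool
{
    // `?T` is shorthand for `T|null`.
    if (str_starts_with($pattern, '?')) {
        $pattern = substr($pattern, 1) . '|null';
    }

    // Primitive type names mapped to their existing check functions.
    $checks = [
        'int'    => 'is_int',
        'float'  => 'is_float',
        'string' => 'is_string',
        'bool'   => 'is_bool',
        'array'  => 'is_array',
        'null'   => 'is_null',
    ];

    foreach (explode('|', $pattern) as $type) {
        $key = strtolower($type);
        if (isset($checks[$key])) {
            if ($checks[$key]($value)) {
                return true;
            }
        } elseif ($value instanceof $type) {
            // Anything else is treated as a class or interface name.
            return true;
        }
    }

    return false;
}
```

Note that the check is strict: the string '5' does not match int|float, which mirrors the is_int() and is_float() equivalences listed above.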

Literal patterns

Any literal may be a pattern. This is a degenerate case and not generally useful, but is included for consistency when used with match() (see below).

$foo is 5;         // Equivalent to $foo === 5
$foo is 'yay PHP'; // Equivalent to $foo === 'yay PHP'
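Because the stated equivalent is ===, a literal pattern would not apply PHP's loose comparison rules. A minimal sketch of the difference, using only existing operators (the variable names are illustrative):

```php
<?php
$foo = '5';

// Loose comparison coerces the numeric string before comparing.
$loose = ($foo == 5);

// Identity comparison also checks the type; this is the behavior
// the RFC describes for `$foo is 5`.
$strict = ($foo === 5);
```

Here $loose is true while $strict is false, so under the proposal '5' is 5 would evaluate to false.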

Global constants are NOT permitted in a pattern. They cannot be disambiguated from a class name, and are of minimal if any use in practice.

match() enhancement

Pattern matching is frequently used in conjunction with branching structures, in particular with enumerations. To that end, this RFC also enhances the match() structure. Specifically, if the is keyword is used in match() then match() will perform a pattern match rather than an identity comparison.

That is, this code:

$result = match ($somevar) is {
    Foo => 'foo',
    Bar => 'bar',
    Baz|Beep => 'baz',
};

is equivalent to the following:

$result = match (true) {
    $somevar is Foo => 'foo',
    $somevar is Bar => 'bar',
    $somevar is Baz|Beep => 'baz',
};
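For comparison, the same dispatch can be written and run in PHP 8.0 today by spelling out each type pattern with instanceof. This is only a sketch: the four empty classes are hypothetical stand-ins for Foo, Bar, Baz and Beep, and describe() is an illustrative wrapper, not part of the proposal.

```php
<?php
// Hypothetical marker classes standing in for the types used above.
final class Foo {}
final class Bar {}
final class Baz {}
final class Beep {}

function describe(mixed $somevar): string
{
    return match (true) {
        $somevar instanceof Foo => 'foo',
        $somevar instanceof Bar => 'bar',
        // The union pattern Baz|Beep becomes two conditions on one arm.
        $somevar instanceof Baz, $somevar instanceof Beep => 'baz',
    };
}
```

As with any match() without a default arm, a value that matches none of the arms raises an UnhandledMatchError; presumably the pattern form would behave the same way, though the RFC text does not say so explicitly.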

Backward Incompatible Changes

A new keyword is added, is. That precludes global constants named is.
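To make the break concrete, the snippet below is legal in PHP 8.0, where is is an ordinary identifier. Assuming is becomes a reserved keyword as described, this declaration would no longer be accepted. The constant's value is arbitrary.

```php
<?php
// Legal in PHP 8.0; would conflict with the proposed `is` keyword.
const is = 'legal today';

echo is, PHP_EOL; // prints "legal today"
```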

No other BC breaks are expected.

Proposed PHP Version(s)

PHP 8.next (aka 8.1).

RFC Impact

Open Issues

Do any other patterns need to be included in the initial RFC?

Future Scope

Numerous other, more robust (and complex) patterns can be supported in the future. This RFC keeps to the MVP implementation and most common cases. The following additional patterns are possible future additions for other RFCs. (Please don't bikeshed them here; they are shown as an example of where pattern matching can extend to in the future.)

Array structure pattern

$arr is ['a' => 'A', 'b' => $b];
 
// Equivalent to:
is_array($arr) && $arr['a'] === 'A' && $arr['b'] === $b;

Range pattern

$foo is 0..=10;
 
// Equivalent to:
$foo >= 0 && $foo <= 10;
 
$foo is 0..<10;
 
// Equivalent to:
$foo >= 0 && $foo < 10;
 
$foo is >10;
 
// Equivalent to:
$foo > 10;

Boolean pattern combination

$foo is 1 or 2;
 
// Equivalent to:
$foo === 1 || $foo === 2;
 
$foo is User or 1..=5;
 
// Equivalent to:
$foo instanceof User || ($foo >= 1 && $foo <= 5);

Regex pattern

$foo is /^http:\/\/%$domain/;
 
// Equivalent to:
$matches = [];
preg_match('/^http:\/\/%$domain/', $foo, $matches);
$domain == $matches[0];

Proposed Voting Choices

This is a simple up-or-down vote, requiring 2/3 Yes to pass.

Patches and Tests

Links to any external patches and tests go here.

If there is no patch, make it clear who will create a patch, or whether a volunteer to help with implementation is needed.

Make it clear if the patch is intended to be the final patch, or is just a prototype.

For changes affecting the core language, you should also provide a patch for the language specification.

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

Links to external references, discussions or RFCs

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/pattern-matching.1605138324.txt.gz · Last modified: 2020/11/11 23:45 by crell