PHP RFC: Match expression
- Date: 2020-04-12
- Author: Ilija Tovilo, tovilo.ilija@gmail.com
- Status: Under Discussion
- Target Version: PHP 8.0
- Implementation: https://github.com/php/php-src/pull/5371
- Supersedes: https://wiki.php.net/rfc/switch_expression
Proposal
The switch statement is a fundamental control structure in almost every programming language. Unfortunately, in PHP it has some long-standing issues that make it hard to use correctly, namely:
- Type coercion
- No return value
- Fallthrough
- Inexhaustiveness
This RFC proposes a new control structure called match to resolve these issues.
match ($condition) {
    1 => {
        foo();
        bar();
    },
    2 => baz(),
}

$expressionResult = match ($condition) {
    1, 2 => foo(),
    3, 4 => bar(),
    default => baz(),
};
Issues
We're going to take a look at each issue and how the new match expression resolves it.
Type coercion
The switch statement loosely compares (==) the given value to the case values. This can lead to some very surprising results.
switch ('foo') {
    case 0:
        echo "Oh no!\n";
        break;
}
The match expression uses strict comparison (===) instead. The comparison is strict regardless of strict_types.
match ('foo') {
    0 => {
        echo "Never reached\n";
    },
}
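The difference between the two comparison modes can be observed in current PHP without match. The sketch below is illustrative only: the helper names looseLookup() and strictLookup() are hypothetical, and the string '1' is used instead of 'foo' because the result of '1' == 1 does not depend on how a given PHP version compares non-numeric strings to integers.

```php
<?php
// Hypothetical helper: switch compares with ==, so the string '1' matches case 1.
function looseLookup($value): string
{
    switch ($value) {
        case 1:
            return 'matched case 1';
        default:
            return 'no match';
    }
}

// Hypothetical helper: emulates the strict (===) comparison proposed for match.
function strictLookup($value): string
{
    if ($value === 1) {
        return 'matched case 1';
    }

    return 'no match';
}

echo looseLookup('1'), "\n";  // matched case 1
echo strictLookup('1'), "\n"; // no match
```

Only the integer 1 satisfies the strict variant, which is the behavior this RFC proposes for every match arm.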
No return value
It is very common that the switch produces some value that is used afterwards.
switch (1) {
    case 0:
        $y = 'Foo';
        break;
    case 1:
        $y = 'Bar';
        break;
    case 2:
        $y = 'Baz';
        break;
}

echo $y;
//> Bar
It is easy to forget to assign $y in one of the cases. It is also visually unintuitive to find $y declared in a deeper nested scope. match is an expression that evaluates to the result of the executed arm. This removes a lot of boilerplate and makes it impossible to forget to assign a value in an arm.
echo match (1) {
    0 => 'Foo',
    1 => 'Bar',
    2 => 'Baz',
};
//> Bar
Fallthrough
The switch fallthrough has been a large source of bugs in many languages. Each case must explicitly break out of the switch statement or the execution will continue into the next case even if the condition is not met.
switch ($pressedKey) {
    case Key::ENTER:
        save();
        // Oops, forgot the break
    case Key::DELETE:
        delete();
        break;
}
This was intended to be a feature so that multiple conditions can execute the same block of code. It is often hard to tell whether a missing break was the author's intention or a mistake.
switch ($x) {
    case 1:
    case 2:
        // Same for 1 and 2
        break;
    case 3:
        // Only 3
    case 4:
        // Same for 3 and 4
        break;
}
The match expression resolves this problem by adding an implicit break after every arm. Multiple conditions can be comma-separated to execute the same block of code. There's no way to achieve the same result as 3 and 4 in the example above without an additional if statement. This is a little bit more verbose but makes the intention very obvious.
match ($x) {
    1, 2 => {
        // Same for 1 and 2
    },
    3, 4 => {
        if ($x === 3) {
            // Only 3
        }
        // Same for 3 and 4
    },
}
Inexhaustiveness
Another large source of bugs is not handling all the possible cases supplied to the switch statement.
switch ($configuration) {
    case Config::FOO:
        // ...
        break;
    case Config::BAR:
        // ...
        break;
}
This will go unnoticed until the program crashes in a weird way, causes strange behavior or, even worse, becomes a security hole. Many languages can check if all the cases are handled at compile time or force you to write a default case if they can't. For a dynamic language like PHP the only alternative is throwing an error. This is exactly what the match expression does. It throws an UnhandledMatchError if the condition isn't met for any of the arms.
match ($x) {
    1 => ...,
    2 => ...,
}

// $x can never be 3
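For comparison, the same safety net has to be written by hand with switch. A minimal sketch, assuming a hypothetical describeConfig() function whose integer cases stand in for Config::FOO and Config::BAR, and using the SPL UnexpectedValueException in place of the proposed UnhandledMatchError:

```php
<?php
// Hypothetical function; the values 1 and 2 stand in for Config::FOO and Config::BAR.
function describeConfig(int $configuration): string
{
    switch ($configuration) {
        case 1:
            return 'foo';
        case 2:
            return 'bar';
        default:
            // Without this default, an unexpected value would silently fall out of the switch.
            throw new UnexpectedValueException("Unhandled configuration: $configuration");
    }
}

try {
    describeConfig(3);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n"; // Unhandled configuration: 3
}
```

The default branch is easy to omit, which is the gap the proposed implicit error is meant to close.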
Blocks
Sometimes passing a single expression to a match arm isn't enough, either because you need to use a statement or the code is just too long for a single expression. In those cases you can pass a block to the arm that can contain a list of statements.
match ($x) {
    0 => {
        foo();
        bar();
        baz();
    },
}
It is not possible to return a value from the block. Thus blocks are only allowed when using the match expression as a statement. The following code will lead to a compilation error.
$x = match ($x) {
    0 => { /* How do I return a value? */ },
};
//> Fatal error: Match expressions that utilize the result value can't contain blocks

// The same goes for
foo(match ($x) {
    0 => { ... },
});
echo 1 + match ($x) {
    0 => { ... },
};
// etc.
Semicolon
When using match as part of some other expression it is necessary to terminate the statement with a semicolon.
$x = match ($y) { ... };
The same would usually be true if the match expression were used as a standalone expression.
match ($y) { ... };
However, to make the match expression more similar to other statements like if and switch it is allowed to drop the semicolon in this case only.
match ($y) { ... }
This introduces some ambiguities with prefix operators that are also binary operators, namely + and -.
match ($y) { ... }
-1;

// Could be parsed as

// 1
match ($y) { ... };
-1;

// 2
match ($y) { ... } - 1;
When match appears as the first element of a statement it will always be parsed as option 1.
break/continue
Just like with switch, you can use break to break out of the executed arm.
match ($x) {
    $y => {
        if ($condition) {
            break;
        }

        // Not executed if $condition is true
    },
}
As with switch, continue is an alias for break and will trigger a warning.
return
return behaves the same as in any other context. It will return from the function.
function foo($x) {
    match ($x) {
        1 => {
            return;
        },
    }

    // Not executed if $x is 1
}
Pattern matching
I have experimented with pattern matching for this RFC. Realistically, it could sometimes save a few keystrokes. In my opinion, this does not justify the significant complexity it would add to the language at the moment. It would be mostly useful for algebraic data types, which PHP currently does not have.
// With pattern matching
match ($value) {
    let $a => ..., // Identifier pattern
    let 0..<10 => ..., // Range pattern
    let is string => ..., // Type pattern
    let [1, 2, $c] => ..., // Array pattern
    let Foo { foo: 1, getBar(): 2 } => ..., // Object pattern
    let $str @ is string if $str !== '' => ..., // Guard
}

// Without pattern matching
match (true) {
    true => $value ..., // Identifier pattern
    $value >= 0 && $value < 10 => ..., // Range pattern
    is_string($value) => ..., // Type pattern
    count($value) === 3
        && isset($value[0]) && $value[0] === 1
        && isset($value[1]) && $value[1] === 2
        && isset($value[2]) => $value[2] ..., // Array pattern
    $value instanceof Foo
        && $value->foo === 1
        && $value->getBar() === 2 => ..., // Object pattern
    is_string($str) && $str !== '' => ..., // Guard
}
While some patterns are significantly shorter (namely the array pattern), code like that is relatively rare. At the moment the argument for such a big language change is pretty weak. If the situation ever changes we can always add pattern matching at a later point in time.
"Why don't you just use x"
if statements
if ($x === 1) {
    $y = ...;
} elseif ($x === 2) {
    $y = ...;
} elseif ($x === 3) {
    $y = ...;
}
Needless to say this is incredibly verbose and there's a lot of repetition. It also can't make use of the jumptable optimization. You must also not forget to write an else statement to catch unexpected values.
Hash maps
$y = [
    1 => ...,
    2 => ...,
][$x];
This code will execute every single “arm”, not just the one that is finally returned. It will also build a hash map in memory every time it is executed. And again, you must not forget to handle unexpected values.
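The eager evaluation is easy to observe. The sketch below is illustrative only: the CallLog class and the arm() helper are hypothetical names introduced to record which array elements are evaluated before the lookup happens.

```php
<?php
// Hypothetical call recorder used only for this demonstration.
final class CallLog
{
    public static $calls = [];
}

// Hypothetical helper: records its argument and returns it.
function arm(string $name): string
{
    CallLog::$calls[] = $name;

    return $name;
}

$x = 2;

// Every element is evaluated while the array is built, before [$x] selects one.
$y = [
    1 => arm('one'),
    2 => arm('two'),
    3 => arm('three'),
][$x];

echo $y, "\n";                             // two
echo implode(', ', CallLog::$calls), "\n"; // one, two, three
```

All three helpers run even though only the second result is used, so any side effects or expensive work in unused elements still happen.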
Nested ternary operators
$y = $x === 1 ? ... : ($x === 2 ? ... : ($x === 3 ? ... : 0));
The parentheses make it hard to read, it's easy to make mistakes, and there is no jumptable optimization. Adding more cases will make the situation worse.
Backward Incompatible Changes
match was added as a keyword (reserved_non_modifiers). This means it can't be used in the following contexts anymore:
- namespaces
- class names
- function names
- global constants
Note that it will continue to work in method names and class constants.
Proposed PHP Version(s)
The proposed version is PHP 8.
Proposed Voting Choices
As this is a language change, a 2/3 majority is required. The vote is a straight Yes/No vote for accepting the RFC and merging the patch.