The switch statement is a fundamental control structure in almost every programming language. Unfortunately, in PHP it has some long-standing issues that make it hard to use correctly, namely loose type comparison, the lack of a return value, fallthrough, and inexhaustiveness.
This RFC proposes a new control structure called match to resolve these issues.
match ($condition) {
    1 => {
        foo();
        bar();
    },
    2 => baz(),
}

$expressionResult = match ($condition) {
    1, 2 => foo(),
    3, 4 => bar(),
    default => baz(),
};
We're going to take a look at each issue and how the new match expression resolves it.
The switch statement loosely compares (==) the given value to the case values. This can lead to some very surprising results.
switch ('foo') {
    case 0:
        echo "Oh no!\n";
        break;
}
The match expression uses strict comparison (===) instead. The comparison is strict regardless of strict_types.
match ('foo') {
    0 => {
        echo "Never reached\n";
    },
}
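The difference between the two comparison modes can be sketched with a pair of helpers. This is only an illustration: the function names are hypothetical, it assumes a runtime where match is available with single-expression arms (PHP 8.0 or later), and it uses numeric strings rather than 'foo' and 0 because the result of 'foo' == 0 depends on the PHP version.

```php
<?php
// Hypothetical helper: classify a value with switch, which compares loosely (==).
function classifyWithSwitch(mixed $value): string
{
    switch ($value) {
        case '10':
            return 'ten';
        default:
            return 'other';
    }
}

// Hypothetical helper: the same arms with match, which compares strictly (===).
function classifyWithMatch(mixed $value): string
{
    return match ($value) {
        '10' => 'ten',
        default => 'other',
    };
}

// '1e1' and '10' are both numeric strings, so == compares them as numbers.
echo classifyWithSwitch('1e1'), "\n"; // ten
echo classifyWithMatch('1e1'), "\n";  // other

// An integer 10 is loosely equal, but not identical, to the string '10'.
echo classifyWithSwitch(10), "\n"; // ten
echo classifyWithMatch(10), "\n";  // other
```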
It is very common that the switch produces some value that is used afterwards.
switch (1) {
    case 0:
        $y = 'Foo';
        break;
    case 1:
        $y = 'Bar';
        break;
    case 2:
        $y = 'Baz';
        break;
}

echo $y;
//> Bar
It is easy to forget to assign $y in one of the cases. It is also visually unintuitive to find $y declared in a deeper nested scope.
match is an expression that evaluates to the result of the executed arm. This removes a lot of boilerplate and makes it impossible to forget assigning a value in an arm.
echo match (1) {
    0 => 'Foo',
    1 => 'Bar',
    2 => 'Baz',
};
//> Bar
The switch fallthrough has been a large source of bugs in many languages. Each case must explicitly break out of the switch statement or the execution will continue into the next case even if the condition is not met.
switch ($pressedKey) {
    case Key::RETURN_:
        save();
        // Oops, forgot the break
    case Key::DELETE:
        delete();
        break;
}
This was intended to be a feature so that multiple conditions can execute the same block of code. It is often hard to tell whether a missing break was the author's intention or a mistake.
switch ($x) {
    case 1:
    case 2:
        // Same for 1 and 2
        break;
    case 3:
        // Only 3
    case 4:
        // Same for 3 and 4
        break;
}
The match expression resolves this problem by adding an implicit break after every arm. Multiple conditions can be comma-separated to execute the same block of code. There's no way to achieve the same result as 3 and 4 in the example above without an additional if statement. This is a little bit more verbose but makes the intention very obvious.
match ($x) {
    1, 2 => {
        // Same for 1 and 2
    },
    3, 4 => {
        if ($x === 3) {
            // Only 3
        }
        // Same for 3 and 4
    },
}
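A runnable sketch of comma-separated conditions and the implicit break follows. It avoids the block arms proposed here and relies only on single-expression arms (assuming PHP 8.0 or later); the function name and the returned strings are hypothetical.

```php
<?php
// Hypothetical function: each arm is a single expression, and no arm falls
// through into the next one.
function describe(int $x): string
{
    return match ($x) {
        1, 2 => 'one or two',
        // The extra work for 3 has to be spelled out explicitly.
        3, 4 => ($x === 3 ? 'only three, then ' : '') . 'three or four',
        default => 'something else',
    };
}

echo describe(1), "\n"; // one or two
echo describe(3), "\n"; // only three, then three or four
echo describe(4), "\n"; // three or four
```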
Another large source of bugs is not handling all the possible cases supplied to the switch statement.
switch ($configuration) {
    case Config::FOO:
        // ...
        break;
    case Config::BAR:
        // ...
        break;
}
This will go unnoticed until the program crashes in a weird way, causes strange behavior or, even worse, becomes a security hole. Many languages can check if all the cases are handled at compile time or force you to write a default case if they can't. For a dynamic language like PHP the only alternative is throwing an error. This is exactly what the match expression does. It throws an UnhandledMatchError if the condition isn't met for any of the arms.
match ($x) {
    1 => ...,
    2 => ...,
}

// $x can never be 3
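How the error surfaces at runtime can be sketched as follows, assuming the UnhandledMatchError class named above is available (PHP 8.0 or later). The functions label() and safeLabel() are hypothetical names used only for illustration.

```php
<?php
// Hypothetical function with no default arm: any value other than 1 or 2
// is unhandled.
function label(int $x): string
{
    return match ($x) {
        1 => 'one',
        2 => 'two',
    };
}

// Hypothetical wrapper that turns the error into a fallback value.
function safeLabel(int $x): string
{
    try {
        return label($x);
    } catch (\UnhandledMatchError $e) {
        return 'unhandled';
    }
}

echo safeLabel(1), "\n"; // one
echo safeLabel(3), "\n"; // unhandled
```

In real code, letting the error propagate is usually the point: the unexpected value is reported where it occurs instead of being silently ignored.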
Sometimes passing a single expression to a match arm isn't enough, either because you need to use a statement or the code is just too long for a single expression. In those cases you can pass a block to the arm.
match ($x) {
    0 => {
        foo();
        bar();
        baz();
    },
}
Originally this RFC included a way to return a value from a block by omitting the semicolon of the last expression. This syntax is borrowed from Rust 1). Due to memory management difficulties and a lot of negative feedback on the syntax this is no longer a part of this proposal and will be discussed in a separate RFC.
// Original proposal
$y = match ($x) {
    0 => {
        foo();
        bar();
        baz() // This value is returned
    },
};

// Alternative syntax, <=
$y = match ($x) {
    0 => {
        foo();
        bar();
        <= baz();
    },
};

// Alternative syntax, separate keyword
$y = match ($x) {
    0 => {
        foo();
        bar();
        pass baz();
    },
};

// Alternative syntax, automatically return last expression regardless of semicolon
$y = match ($x) {
    0 => {
        foo();
        bar();
        baz();
    },
};
For the time being using blocks in match expressions that use the return value in any way results in a compilation error:
$y = match ($x) {
    0 => {},
};
//> Match that is not used as a statement can't contain blocks

foo(match ($x) {
    0 => {},
});
//> Match that is not used as a statement can't contain blocks

1 + match ($x) {
    0 => {},
};
//> Match that is not used as a statement can't contain blocks

//etc.

// Only allowed form
match ($x) {
    0 => {},
}
When using match as part of some other expression it is necessary to terminate the statement with a semicolon.
$x = match ($y) { ... };
The same would usually be true if the match expression were used as a standalone expression.
match ($y) { ... };
However, to make the match expression more similar to other statements like if and switch, we could allow dropping the semicolon in this case only.
match ($y) { ... }
This introduces an ambiguity with the + and - unary operators.
match ($y) { ... }
-1;

// Could be parsed as

// 1
match ($y) { ... };
-1;

// 2
match ($y) { ... } - 1;
A match that appears as the first element of a statement would always be parsed as option 1 because there are no legitimate use cases for binary operations at a statement level. All other cases work as expected.
// These work fine
$x = match ($y) { ... } - 1;
foo(match ($y) { ... } - 1);
$x[] = fn($y) => match ($y) { ... };
// etc.
This is also how Rust solves this ambiguity 2).
match true { _ => () } - 1;

// warning: unused unary operation that must be used
//  --> src/main.rs:2:28
//   |
// 2 |     match true { _ => () } - 1;
//   |                            ^^^
//   |
Because there was some controversy around this feature it was moved to a secondary vote.
It has been suggested to make an omitted condition equivalent to (true).
match {
    $age >= 30 => {},
    $age >= 20 => {},
    $age >= 10 => {},
    default => {},
}

// Equivalent to

match (true) {
    $age >= 30 => {},
    $age >= 20 => {},
    $age >= 10 => {},
    default => {},
}
The keyword match could be a bit misleading here. A potential gotcha is passing truthy values to the match, which will not work as intended. But of course this issue remains regardless of whether (true) is dropped or not.
match {
    preg_match(...) => {}, // preg_match returns 1 which is NOT identical (===) to true
}
Because I have no strong opinion on this it will be moved to a secondary vote.
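The truthy-value gotcha can be reproduced with the explicit (true) form. The sketch below assumes single-expression arms (PHP 8.0 or later); the function names and the regular expression are hypothetical and chosen only to show the difference between an int result and a bool comparison.

```php
<?php
// Hypothetical function: the arm never matches, because preg_match() returns
// the int 1 on success and 1 !== true.
function startsWithDigitNaive(string $s): string
{
    return match (true) {
        preg_match('/^\d/', $s) => 'digit',
        default => 'no digit',
    };
}

// Hypothetical function: comparing the result explicitly yields a bool,
// which can be identical to true.
function startsWithDigit(string $s): string
{
    return match (true) {
        preg_match('/^\d/', $s) === 1 => 'digit',
        default => 'no digit',
    };
}

echo startsWithDigitNaive('7up'), "\n"; // no digit
echo startsWithDigit('7up'), "\n";      // digit
```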
A match condition can be any arbitrary expression. Analogous to switch, each condition will be checked from top to bottom until the first one matches. If a condition matches, the remaining conditions won't be evaluated.
match ($x) {
    foo() => ...,
    $this->bar() => ..., // bar() isn't called if foo() matched with $x
    $this->baz => ...,
    // etc.
}
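The top-to-bottom, short-circuiting evaluation can be observed by recording which conditions run. This is a sketch under the assumption of single-expression arms (PHP 8.0 or later); condition() is a hypothetical helper that logs its own invocation.

```php
<?php
$calls = [];

// Hypothetical helper: records that it was evaluated, then returns $value.
function condition(string $name, int $value, array &$calls): int
{
    $calls[] = $name;
    return $value;
}

$result = match (2) {
    condition('first', 1, $calls) => 'a',  // evaluated, does not match
    condition('second', 2, $calls) => 'b', // evaluated, matches
    condition('third', 3, $calls) => 'c',  // never evaluated
};

echo $result, "\n";               // b
echo implode(',', $calls), "\n";  // first,second
```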
Just like with the switch, you can use break to break out of the executed arm.
match ($x) {
    $y => {
        if ($condition) {
            break;
        }

        // Not executed if $condition is true
    },
}
Unlike the switch, using continue targeting the match expression will trigger a compilation error.
match ($i) {
    default => {
        continue;
    },
}
//> Fatal error: "continue" targeting match is disallowed. Did you mean to use "break" or "continue 2"?
Like with the switch, you can use goto to jump out of match expressions.
match ($a) {
    1 => {
        match ($b) {
            2 => {
                goto end_of_match;
            },
        }
    },
}

end_of_match:
It is not allowed to jump into match expressions.
goto match_arm;

match ($b) {
    1 => {
        match_arm:
    },
}

//> Fatal error: 'goto' into loop, switch or match is disallowed
return behaves the same as in any other context. It will return from the function.
function foo($x) {
    match ($x) {
        1 => {
            return;
        },
    }

    // Not executed if $x is 1
}
As mentioned above block expressions will be discussed in a separate RFC. We'll also use this opportunity to think about blocks in arrow functions.
I have experimented with pattern matching 3) and decided not to include it in this RFC. Pattern matching is a complex topic and requires a lot of thought. Each pattern should be discussed in detail in a separate RFC.
// With pattern matching
match ($value) {
    let $a => ..., // Identifier pattern
    let 'foo' => ..., // Scalar pattern
    let 0..<10 => ..., // Range pattern
    let is string => ..., // Type pattern
    let [1, 2, $c] => ..., // Array pattern
    let Foo { foo: 1, getBar(): 2 } => ..., // Object pattern
    let $str @ is string if $str !== '' => ..., // Guard

    // Algebraic data types if we ever get them
    let Ast::BinaryExpr($lhs, '+', $rhs) => ...,
}

// Without pattern matching
match (true) {
    true => $value ..., // Identifier pattern
    'foo' => ..., // Scalar pattern
    $value >= 0 && $value < 10 => ..., // Range pattern
    is_string($value) => ..., // Type pattern
    count($value) === 3
        && isset($value[0])
        && $value[0] === 1
        && isset($value[1])
        && $value[1] === 2
        && isset($value[2])
        => $value[2] ..., // Array pattern
    $value instanceof Foo
        && $value->foo === 1
        && $value->getBar() === 2
        => ..., // Object pattern
    is_string($str) && $str !== '' => ..., // Guard
}
Some people have suggested allowing explicit fallthrough to the next arm. This is, however, not a part of this RFC.
match ($x) {
    1 => {
        foo();
        fallthrough;
    },
    2 => {
        bar();
    },
}

// 1 calls foo() and bar()
// 2 only calls bar()
This would require a few sanity checks with pattern matching.
match ($x) {
    $a => { fallthrough; },
    $b => { /* $b is undefined */ },
}
if ($x === 1) {
    $y = ...;
} elseif ($x === 2) {
    $y = ...;
} elseif ($x === 3) {
    $y = ...;
}
Needless to say this is incredibly verbose and there's a lot of repetition. It also can't make use of the jumptable optimization. You must also not forget to write an else statement to catch unexpected values.
$y = [ 1 => ..., 2 => ..., ][$x];
This code will execute every single “arm”, not just the one that is finally returned. It will also build a hash map in memory every time it is executed. And again, you must not forget to handle unexpected values.
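The eager evaluation of the array literal can be made visible by logging each "arm". The sketch below contrasts it with match, assuming single-expression arms (PHP 8.0 or later); arm() and both log variables are hypothetical names introduced for illustration.

```php
<?php
// Hypothetical helper: records that an "arm" was evaluated and returns a value.
function arm(string $name, array &$log): string
{
    $log[] = $name;
    return strtoupper($name);
}

$x = 1;

// Array literal: every value is computed before the lookup happens.
$arrayLog = [];
$y = [
    1 => arm('foo', $arrayLog),
    2 => arm('bar', $arrayLog),
][$x];

// match: only the selected arm is computed.
$matchLog = [];
$z = match ($x) {
    1 => arm('foo', $matchLog),
    2 => arm('bar', $matchLog),
};

echo $y, ' ', implode(',', $arrayLog), "\n"; // FOO foo,bar
echo $z, ' ', implode(',', $matchLog), "\n"; // FOO foo
```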
$y = $x === 1 ? ... : ($x === 2 ? ... : ($x === 3 ? ... : 0));
The parentheses make it hard to read, it's easy to make mistakes, and there is no jumptable optimization. Adding more cases will make the situation worse.
match was added as a keyword (reserved_non_modifiers). This means it can't be used in the following contexts anymore:
Note that it will continue to work in method names and class constants.
Voting starts 2020-04-25 and ends 2020-05-09.
As this is a language change, a 2/3 majority is required.
Secondary vote (choice with the most votes is picked):
Secondary vote (choice with the most votes is picked):