
PHP RFC: Match expression v2

Proposal

This RFC proposes adding a new match expression that is similar to switch but with safer semantics and the ability to return values.

From the Doctrine query parser:

// Before
switch ($this->lexer->lookahead['type']) {
    case Lexer::T_SELECT:
        $statement = $this->SelectStatement();
        break;
 
    case Lexer::T_UPDATE:
        $statement = $this->UpdateStatement();
        break;
 
    case Lexer::T_DELETE:
        $statement = $this->DeleteStatement();
        break;
 
    default:
        $this->syntaxError('SELECT, UPDATE or DELETE');
        break;
}
 
// After
$statement = match ($this->lexer->lookahead['type']) {
    Lexer::T_SELECT => $this->SelectStatement(),
    Lexer::T_UPDATE => $this->UpdateStatement(),
    Lexer::T_DELETE => $this->DeleteStatement(),
    default => $this->syntaxError('SELECT, UPDATE or DELETE'),
};

Differences to switch

Return value

It is very common for a switch to produce a value that is used afterwards.

switch (1) {
    case 0:
        $result = 'Foo';
        break;
    case 1:
        $result = 'Bar';
        break;
    case 2:
        $result = 'Baz';
        break;
}
 
echo $result;
//> Bar

It is easy to forget to assign $result in one of the cases. It is also visually unintuitive to find $result first assigned in a more deeply nested scope. match is an expression that evaluates to the result of the executed arm. This removes a lot of boilerplate and makes it impossible to forget to assign a value in an arm.

echo match (1) {
    0 => 'Foo',
    1 => 'Bar',
    2 => 'Baz',
};
//> Bar
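
Because match is an expression, it can appear anywhere a value is expected, for example in a return statement. The following is a minimal sketch, assuming PHP 8.0 or later; the function name statusLabel is hypothetical and only serves as an illustration.

```php
<?php
// Hypothetical helper: maps a numeric code to a label.
function statusLabel(int $code): string
{
    // The value of the selected arm becomes the function's return value.
    return match ($code) {
        0 => 'Foo',
        1 => 'Bar',
        2 => 'Baz',
    };
}

echo statusLabel(1), "\n";
//> Bar
```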

No type coercion

The switch statement loosely compares (==) the given value to the case values. This can lead to some very surprising results.

switch ('foo') {
    case 0:
        $result = "Oh no!\n";
        break;
    case 'foo':
        $result = "This is what I expected\n";
        break;
}
echo $result;
//> Oh no!

The match expression uses strict comparison (===) instead. The comparison is strict regardless of strict_types.

echo match ('foo') {
    0 => "Oh no!\n",
    'foo' => "This is what I expected\n",
};
//> This is what I expected
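
The difference between the two comparison modes can also be seen with a numeric string. The sketch below assumes PHP 8.0 or later; the variable names $input, $viaSwitch and $viaMatch are hypothetical.

```php
<?php
// Hypothetical input: a numeric string, e.g. taken from a query parameter.
$input = '1';

// switch compares loosely (==), so the string '1' matches the integer 1.
switch ($input) {
    case 1:
        $viaSwitch = 'matched int 1';
        break;
    default:
        $viaSwitch = 'no match';
        break;
}

// match compares strictly (===), so the string '1' does not match the integer 1.
$viaMatch = match ($input) {
    1 => 'matched int 1',
    default => 'no match',
};

echo $viaSwitch, "\n";
//> matched int 1
echo $viaMatch, "\n";
//> no match
```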

No fallthrough

Fallthrough in switch has been a large source of bugs in many languages. Each case must explicitly break out of the switch statement; otherwise execution continues into the next case, even if that case's condition is not met.

switch ($pressedKey) {
    case Key::RETURN_:
        save();
        // Oops, forgot the break
    case Key::DELETE:
        delete();
        break;
}

The match expression resolves this problem by adding an implicit break after every arm.

match ($pressedKey) {
    Key::RETURN_ => save(),
    Key::DELETE => delete(),
};

Multiple conditions can be separated by commas so that they share the same arm.

echo match ($x) {
    1, 2 => 'Same for 1 and 2',
    3, 4 => 'Same for 3 and 4',
};
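
Comma-separated conditions combine naturally with a default arm. The following sketch assumes PHP 8.0 or later; the function classify and its categories are hypothetical.

```php
<?php
// Hypothetical helper: groups HTTP status codes into coarse categories.
function classify(int $status): string
{
    return match ($status) {
        200, 201, 204 => 'success',
        301, 302 => 'redirect',
        404, 410 => 'missing',
        // Reached only when none of the conditions above is identical to $status.
        default => 'other',
    };
}

echo classify(302), "\n";
//> redirect
echo classify(500), "\n";
//> other
```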

Exhaustiveness

Another large source of bugs is failing to handle every value that can be passed to the switch statement.

switch ($operator) {
    case BinaryOperator::ADD:
        $result = $lhs + $rhs;
        break;
}
 
// Forgot to handle BinaryOperator::SUBTRACT

This will go unnoticed until the program crashes in a weird way, causes strange behavior or, even worse, becomes a security hole. match throws an UnhandledMatchError if none of the arms matches the given value. This allows mistakes to be caught early on.

$result = match ($operator) {
    BinaryOperator::ADD => $lhs + $rhs,
};
 
// Throws when $operator is BinaryOperator::SUBTRACT
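
The error can be observed, and caught if needed, like any other Error. The sketch below assumes PHP 8.0 or later; the BinaryOperator class with string constants and the function evaluate are hypothetical stand-ins for the names used above.

```php
<?php
// Hypothetical stand-in for the BinaryOperator constants used above.
final class BinaryOperator
{
    public const ADD = 'add';
    public const SUBTRACT = 'subtract';
}

function evaluate(string $operator, int $lhs, int $rhs): int
{
    return match ($operator) {
        BinaryOperator::ADD => $lhs + $rhs,
        // BinaryOperator::SUBTRACT is deliberately left unhandled.
    };
}

echo evaluate(BinaryOperator::ADD, 2, 3), "\n";
//> 5

try {
    evaluate(BinaryOperator::SUBTRACT, 2, 3);
} catch (\UnhandledMatchError $e) {
    echo 'Caught: ', get_class($e), "\n";
    //> Caught: UnhandledMatchError
}
```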

Miscellaneous

Arbitrary expressions

A match condition can be any arbitrary expression. Analogous to switch, each condition will be checked from top to bottom until the first one matches. If a condition matches, the remaining conditions won’t be evaluated.

$result = match ($x) {
    foo() => ...,
    $this->bar() => ..., // bar() isn't called if foo() matched with $x
    $this->baz => ...,
    // etc.
};
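
The evaluation order can be made visible by recording which conditions were actually evaluated. This is a minimal sketch, assuming PHP 8.0 or later; the helper functions first, second and third and the $calls log are hypothetical.

```php
<?php
// Hypothetical helpers that record their own invocation in $calls.
$calls = [];

function first(array &$calls): int
{
    $calls[] = 'first';
    return 1;
}

function second(array &$calls): int
{
    $calls[] = 'second';
    return 2;
}

function third(array &$calls): int
{
    $calls[] = 'third';
    return 3;
}

$result = match (2) {
    first($calls) => 'one',    // evaluated, 1 !== 2
    second($calls) => 'two',   // evaluated, 2 === 2, so this arm is selected
    third($calls) => 'three',  // never evaluated
};

echo $result, "\n";
//> two
echo implode(', ', $calls), "\n";
//> first, second
```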

Future scope

Blocks

In this RFC the body of a match arm must be an expression. Blocks for match and arrow functions will be discussed in a separate RFC.

Pattern matching

I have experimented with pattern matching and decided not to include it in this RFC. Pattern matching is a complex topic and requires a lot of thought. Each pattern should be discussed in detail in a separate RFC.

Allow dropping (true)

$result = match { ... };
// Equivalent to
$result = match (true) { ... };
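
The explicit match (true) form already works under this RFC: each condition is an ordinary expression, and the first one that is identical to true selects its arm. The sketch below assumes PHP 8.0 or later; the function ageGroup and its thresholds are hypothetical.

```php
<?php
// Hypothetical helper: buckets an age into a category.
function ageGroup(int $age): string
{
    return match (true) {
        // Conditions are checked top to bottom; each must be exactly true to match.
        $age < 13 => 'child',
        $age < 20 => 'teenager',
        default => 'adult',
    };
}

echo ageGroup(15), "\n";
//> teenager
```

Since the comparison is strict, a condition that evaluates to a truthy non-boolean value such as 1 would not match true.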

Backward Incompatible Changes

match was added as a keyword (reserved_non_modifiers). This means it can’t be used in the following contexts anymore:

  • namespaces
  • class names
  • function names
  • global constants

Note that it will continue to work in method names and class constants.
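
The contexts that keep working can be illustrated as follows. This is a sketch assuming PHP 8.0 or later; the Route class is hypothetical.

```php
<?php
// Hypothetical class: "match" remains usable as a method name and a class constant.
final class Route
{
    public const MATCH = 'match';

    public function match(string $path): bool
    {
        return $path === '/';
    }
}

$route = new Route();
var_dump($route->match('/'));
//> bool(true)
echo Route::MATCH, "\n";
//> match

// By contrast, declaring a free function as `function match() {}` is a parse error.
```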

Syntax comparison

Vote

Voting starts 2020-06-19 and ends 2020-07-03.

As this is a language change, a 2/3 majority is required.

Add match expressions to the language?
Real name Yes No
alec (alec)  
as (as)  
ashnazg (ashnazg)  
beberlei (beberlei)  
bmajdak (bmajdak)  
brzuchal (brzuchal)  
carusogabriel (carusogabriel)  
cpriest (cpriest)  
danack (danack)  
derick (derick)  
dm (dm)  
dmitry (dmitry)  
duodraco (duodraco)  
galvao (galvao)  
ilutov (ilutov)  
jasny (jasny)  
jbnahan (jbnahan)  
kalle (kalle)  
kelunik (kelunik)  
kguest (kguest)  
kocsismate (kocsismate)  
lcobucci (lcobucci)  
marandall (marandall)  
mariano (mariano)  
mgocobachi (mgocobachi)  
nicolasgrekas (nicolasgrekas)  
ocramius (ocramius)  
pajoye (pajoye)  
pollita (pollita)  
ralphschindler (ralphschindler)  
rdohms (rdohms)  
reywob (reywob)  
santiagolizardo (santiagolizardo)  
seld (seld)  
sergey (sergey)  
sirsnyder (sirsnyder)  
stas (stas)  
tandre (tandre)  
theodorejb (theodorejb)  
theseer (theseer)  
thijs (thijs)  
till (till)  
weierophinney (weierophinney)  
wyrihaximus (wyrihaximus)  
yunosh (yunosh)  
Final result: 43 Yes, 2 No
This poll has been closed.
rfc/match_expression_v2.txt · Last modified: 2020/07/09 22:06 by ilutov