rfc:nullsafe_operator

PHP RFC: Nullsafe operator

Introduction

This RFC proposes a new operator nullsafe operator ?-> with full short circuiting.

Proposal

It is fairly common to only want to call a method or fetch a property on the result of an expression if it is not null.

Currently in PHP, checking for null leads to deeper nesting and repetition:

$country =  null;
 
if ($session !== null) {
    $user = $session->user;
 
    if ($user !== null) {
        $address = $user->getAddress();
 
        if ($address !== null) {
            $country = $address->country;
        }
    }
}
 
// do something with $country

With the nullsafe operator ?-> this code could instead be written as:

$country = $session?->user?->getAddress()?->country;
 
// do something with $country

When the left hand side of the operator evaluates to null the execution of the entire chain will stop and evalute to null. When it is not null it will behave exactly like the normal -> operator.

Short circuiting

Introduction

Short circuiting refers to skipping the evaluation of an expression based on some given condition. Two common examples are the operators && and ||. There are three ways the nullsafe operator ?-> could implement short circuiting. We’ll look at the same code snippet for every option.

null?->foo(bar())->baz();

1. Short circuiting for neither method arguments nor chained method calls

This complete lack of short circuiting is currently only found in Hack. Both the function bar() and the method baz() are called. baz() will cause a “Call to a member function on null” error. Evaluating method arguments makes it the most surprising of the three options. This was the primary criticism of the last RFC.

2. Short circuiting for method arguments but not chained method calls

This is what would normally be considered lack of short circuiting. The function bar() is not called, the method baz() is. baz() will cause a “Call to a member function on null” error.

3. Short circuiting for both method arguments and chained method calls

We’ll refer to this as full short circuiting. Neither the function bar() nor the method baz() are called. There will be no “Call to a member function on null” error.

Proposal

This RFC proposes full short circuiting. When the evaluation of one element in the chain fails the execution of the entire chain is aborted and the entire chain evaluates to null. The following elements are considered part of the chain.

  • Array access ([])
  • Property access (->)
  • Nullsafe property access (?->)
  • Static property access (::)
  • Method call (->)
  • Nullsafe method call (?->)
  • Static method call (::)
  • Assignment (=, +=, ??=, = &, etc.)
  • Post/pre increment (++, --)

The following elements will cause new sub-chains.

  • Right hand side of an assignment
  • Arguments in a function call
  • The expression in [] of an array access
  • The expression in {} when accessing properties (->{})

Chains are automatically inferred. Only the closest chain will terminate. The following examples will try to illustrate.

   $foo = $a?->b();
// --------------- chain 1
//        -------- chain 2
// If $a is null chain 2 is aborted, method b() isn't called, null is assigned to $foo
 
   $a?->b($c->d());
// --------------- chain 1
//        -------  chain 2
// If $a is null chain 1 is aborted, method b() isn't called, the expression `$c->d()` is not evaluated
 
   $a->b($c?->d());
// --------------- chain 1
//       --------  chain 2
// If $c is null chain 2 is aborted, method d() isn't called, null is passed to `$a->b()`
 
   $foo?->bar = $a->b();
// -------------------- chain 1
//              ------- chain 2
// If $foo is null chain 1 is aborted, `$a->b()` is not evaluated, the assignment is skipped
 
   $foo?->bar++;
// ------------ chain 1
// If $foo is null, chain 1 is aborted, ++ is skipped

Rationale

1. It avoids surprises

$foo = null;
$foo?->bar(expensive_function());

The evaluation of expensive_function() is undesirable if $foo is null as its result will simply be discarded. If the function has side effects it could also lead to surpsises.

2. You can see which methods/properties return null

$foo = null;
$foo?->bar()->baz();

Without short circuiting every subsequent method call and property access in the chain will require using the nullsafe operator or you will get a “Call to a member function on null” error. With short circuiting this isn’t necessary which makes it more obvious which methods/properties might return null.

3. Allows for nullsafe operator in write context

$foo = null;
$foo?->bar = 'bar';
var_dump($foo);
 
// Without short circuiting:
// Fatal error: Can't use nullsafe result value in write context
 
// With short circuiting:
// NULL

Without short circuiting the assignment to a nullsafe property would be illegal because it produces an r-value (a value that cannot be assigned to). With short circuiting if a nullsafe operation on the left hand side of the assignment fails the assignment is simply skipped.

4. Mixing with other operators

$foo = null;
$baz = $foo?->bar()['baz'];
var_dump($baz);
 
// Without short circuiting:
// Notice: Trying to access array offset on value of type null
// NULL
 
// With short circuiting
// NULL

Since with short circuiting the array access ['baz'] will be completely skipped no notice is emitted.

Other languages

Lets look the most popular high-level programming languages (according to the Stack Overflow 2020 survey) and our sister language Hack to see how the nullsafe operator is implemented.

Language Has nullsafe operatorSymbolHas short circuiting
JavaScript ?.
Python
Java
C# ?.
TypeScript ?.
Kotlin ?.
Ruby &.
Swift ?.
Rust
Objective-C ✗*
Dart ?.
Scala ✗†
Hack ?→ ✗‡

* In Object-C accessing properties and calling methods on nil is always ignored
† Possible via DSL
‡ Hack evaluates method arguments even if the left hand side of ?-> is null

8/13 languages have a nullsafe operator. 4/8 of those implement the nullsafe operator with short circuiting.

Syntax choice

The ? in ?-> denotes the precise place in the code where the short circuiting occurs. It closesly resembles the syntax of every other language that implements a nullsafe operator.

References

Taking the reference of a nullsafe chain is not allowed. This is because references require l-values (memory locations, like variables or properties) but the nullsafe operator can sometimes return the r-value null.

$x = &$foo?->bar;
 
// Could loosely be translated to
 
if ($foo !== null) {
    $x = &$foo->bar;
} else {
    $x = &null;
    // Only variables should be assigned by reference
}

For this reason, the following examples are disallowed.

// 1
$x = &$foo?->bar;
// Compiler error: Cannot take reference of a nullsafe chain
 
// 2
takes_ref($foo?->bar);
// Error: Cannot pass parameter 1 by reference
 
// 3
function &return_by_ref($foo) {
    return $foo?->bar;
    // Compiler error: Cannot take reference of a nullsafe chain
}

Example 2 is a runtime error because we cannot know at compile time if the given parameter allows passing values by reference.

Backward Incompatible Changes

There are no known backward incompatible changes in this RFC.

Future Scope

Since PHP 7.4 a notice is emitted on array access on null (null["foo"]). Thus the operator ?[] could also be useful ($foo?["foo"]). Unfortunately, this code introduces a parser ambiguity because of the ternary operator and short array syntax ($foo?["foo"]:["bar"]). Because of this complication the ?[] operator is not part of this RFC.

A nullsafe function call syntax ($callableOrNull?()) is also outside of scope for this RFC.

Vote

Voting starts 2020-07-15 and ends 2020-07-29.

Add nullsafe operator to the language?
Real name Yes No
Final result: 0 0
This poll has been closed.
rfc/nullsafe_operator.txt · Last modified: 2020/07/14 08:36 by ilutov