
PHP RFC: Arrow Functions

This RFC is an alternative to Bob Weinand's Short Closures RFC, which contains a lot of relevant information. This RFC references the short_closures RFC several times, so readers should be familiar with it.

Introduction

Anonymous functions and closures can be verbose even though sometimes they are quite simple and contain only a single expression. Additionally, importing variables into the closure's scope is manual and is painful overhead for single-expression closures. In practice these single-expression closures are common. This RFC proposes a more concise syntax for this pattern.

As an example of the declaration overhead, consider this function that I found online:

function array_values_from_keys($arr, $keys) {
    return array_map(function ($x) use ($arr) { return $arr[$x]; }, $keys);
}

The closure performs a single operation, $arr[$x], which is 8 characters long, but it requires 30 other characters (excluding whitespace). This means that roughly 79% of the closure's code is overhead (30/38). In this RFC these extra characters are called 'boilerplate'. This RFC proposes arrow functions, which reduce the amount of boilerplate by providing a shorter syntax and by implicitly importing used variables from the outer scope. With arrow functions, the example reduces to the following:

function array_values_from_keys($arr, $keys) {
    return array_map(fn($x) => $arr[$x], $keys);
}

This reduces the amount of boilerplate from 30 characters down to 8.

See more examples in the Examples section. The longer examples may be helpful to those struggling to understand why the RFC authors care about saving symbols and improving clarity in each closure.

Many other languages have ways to write closures in a succinct form as well. TODO: decide how much summary of this topic should be given here. At minimum provide links to closure documentation for a few other relevant and major languages?

Proposal

Arrow functions have the following form:

fn(parameter_list) => expr

The expr is a single expression. This expression will be evaluated and then the result will be returned:

$mul2 = fn($x) => $x * 2;
 
$mul2(3); // evaluates to 6

When a variable used in the expression is defined in the parent scope, it is implicitly captured by value. In the following example the functions identified by $versionA and $versionB are exactly equivalent:

$y = 1;
 
$versionA = fn($x) => $x + $y;
 
$versionB = function ($x) use ($y) {
    return $x + $y;
};
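
Because capture is by value, the equivalence above also covers later changes to the outer variable. The following is a minimal sketch, assuming a PHP build that includes the fn syntax proposed here; the names $addY and $addYLong are hypothetical and used only for illustration:

```php
<?php
$y = 1;

// $y is copied into the arrow function at the moment it is created.
$addY = fn($x) => $x + $y;

// Equivalent long form with an explicit use clause (hypothetical name).
$addYLong = function ($x) use ($y) {
    return $x + $y;
};

// Reassigning $y afterwards does not affect either closure.
$y = 100;

$short = $addY(1);     // 2, not 101
$long  = $addYLong(1); // 2, not 101
```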

Note that the search for variables to close over descends into nested arrow functions and into the use clauses of nested anonymous functions. This functionality is not expected to be common, but it is supported.
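
As a rough illustration of that nested lookup, assuming a PHP build that includes the proposed fn syntax (the variable names $outer and $outerWithUse are hypothetical):

```php
<?php
$z = 10;

// The inner arrow function uses $z, so the outer arrow function
// has to capture $z as well in order to pass it along.
$outer = fn($x) => fn($y) => $x + $y + $z;

// Here $z appears only in the use clause of the nested anonymous
// function; the outer arrow function still captures it.
$outerWithUse = fn($x) => function ($y) use ($x, $z) {
    return $x + $y + $z;
};

$a = $outer(1)(2);        // 13
$b = $outerWithUse(1)(2); // 13
```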

Arrow functions are similar to those found in ECMAScript 2015 (ES6) and to lambda expressions in C#.

Type Declarations

This RFC supports type declarations for parameters and return types. Their absence was raised multiple times on the mailing list during discussion of the short closures RFC as something that bothered voters. Therefore this RFC permits them, but the authors discourage their general use in arrow functions.

Here are some examples to show the syntax:

fn (array $x) => $x
fn (): int => 42
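
A short sketch of both kinds of declaration in use, assuming a PHP build that includes the proposed fn syntax; $countItems is a hypothetical name:

```php
<?php
// Parameter type and return type declared together.
$countItems = fn(array $xs): int => count($xs);

$n = $countItems([1, 2, 3]); // 3

// The parameter type is enforced as for any other function:
// passing a string where an array is required throws a TypeError.
$rejected = false;
try {
    $countItems('not an array');
} catch (TypeError $e) {
    $rejected = true;
}
```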

References

Parameters can be passed by reference and values can be returned by reference. As noted in the Proposal section, implicitly bound variables are bound by value and not by reference. References go in the usual places:

fn &(array &$xs) => $xs
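
The difference between a by-reference parameter and an implicitly bound variable can be sketched as follows, assuming a PHP build that includes the proposed fn syntax (all variable names here are hypothetical):

```php
<?php
// By-reference parameter: the caller's variable is modified.
$increment = fn(int &$n) => ++$n;

$counter = 1;
$increment($counter); // $counter is now 2

// Implicitly bound variable: captured by value, so the assignment
// only changes the closure's private copy of $total.
$total = 0;
$addToTotal = fn($x) => $total += $x;

$returned = $addToTotal(5); // returns 5, the outer $total stays 0
```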

Static Arrow Functions

The implementation currently supports static closures, for example static fn($x) => static::get($x). Although this is supported, it is uncertain whether it should be included in the final version. Supporting it in the implementation allows testers to evaluate its usefulness and value.
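
To make the inline example concrete, here is a minimal sketch assuming a PHP build that includes the proposed fn syntax. The Registry class and its members are hypothetical and exist only to give static::get() something to call:

```php
<?php
class Registry // hypothetical class for illustration
{
    private static $items = ['answer' => 42];

    public static function get($key)
    {
        return static::$items[$key];
    }

    public static function getter()
    {
        // A static arrow function has no $this bound to it, but it
        // can still use static:: to reach the enclosing class.
        return static fn($key) => static::get($key);
    }
}

$get = Registry::getter();
$value = $get('answer'); // 42
```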

Ambiguities

Arrow functions introduce no ambiguities, even when they appear as the value in an array key definition or in a yield expression that provides a key. The fn prefix is what removes these ambiguities.
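
Both positions can be sketched as follows, assuming a PHP build that includes the proposed fn syntax. In each case the first => belongs to the key and the second to the arrow function; $handlers and keyedCallbacks() are hypothetical names:

```php
<?php
// Arrow functions as array values next to explicit keys.
$handlers = [
    'double' => fn($x) => $x * 2,
    'negate' => fn($x) => -$x,
];

// An arrow function as the value of a keyed yield.
function keyedCallbacks()
{
    yield 'increment' => fn($x) => $x + 1;
}

$doubled = $handlers['double'](4); // 8

foreach (keyedCallbacks() as $name => $callback) {
    $incremented = $callback(4); // 5, with $name === 'increment'
}
```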

Backward Incompatible Changes

Unfortunately the fn keyword must be a full keyword and not just a reserved function name; this is to break the ambiguities with => for array and yield keys.

Ilija Tovilo analyzed the top 1,000 PHP repositories on GitHub to find usages of fn. The gist provides more information, but the rough findings are that all known existing usages of fn are in tests except one case where it is a namespace segment.

Patches and Tests

An implementation with tests can be found here: https://github.com/morrisonlevi/php-src/tree/arrow_functions. There are no known issues with it at this time; please build and test it.

Voting

Voting will be a simple Yes/No that requires 2/3 or more of the votes to be “Yes” to be accepted.


Accept arrow functions? (2/3 required)
Real name Yes No
Final result: 0 0
This poll has been closed.

Examples

Taken from silexphp/Pimple:

$extended = function ($c) use ($callable, $factory) {
    return $callable($factory($c), $c);
};
 
// with arrow function:
$extended = fn($c) => $callable($factory($c), $c);

This reduces the amount of boilerplate from 44 characters down to 8.


Taken from Doctrine DBAL:

$this->existingSchemaPaths = array_filter($paths, function ($v) use ($names) {
    return in_array($v, $names);
});
 
// with arrow function
$this->existingSchemaPaths = array_filter($paths, fn($v) => in_array($v, $names));

This reduces the amount of boilerplate from 32 characters down to 8.


The complement function as found in many libraries:

function complement(callable $f) {
    return function (... $args) use ($f) {
        return !$f(... $args);
    };
}
 
// with arrow function:
function complement(callable $f) {
    return fn(... $args) => !$f(... $args);
}
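
A possible use of the arrow-function version of complement, assuming a PHP build that includes the proposed fn syntax; $isEven and $odds are hypothetical names:

```php
<?php
function complement(callable $f) {
    return fn(...$args) => !$f(...$args);
}

// Hypothetical predicate used only for this illustration.
$isEven = fn($n) => $n % 2 === 0;

// Keep the values for which $isEven returns false.
$odds = array_values(array_filter([1, 2, 3, 4], complement($isEven))); // [1, 3]
```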

The following example was given to me by tpunt:

$result = Collection::from([1, 2])
    ->map(function ($v) {
        return $v * 2;
    })
    ->reduce(function ($tmp, $v) {
        return $tmp + $v;
    }, 0);
 
echo $result; //6
 
// with arrow functions:
$result = Collection::from([1, 2])
    ->map(fn($v) => $v * 2)
    ->reduce(fn($tmp, $v) => $tmp + $v, 0);
 
echo $result; //6
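
The Collection class above comes from a userland library. The same pipeline can be sketched with built-in array functions instead, assuming a PHP build that includes the proposed fn syntax ($doubled is a hypothetical intermediate name):

```php
<?php
// map step: double each value
$doubled = array_map(fn($v) => $v * 2, [1, 2]);

// reduce step: sum the doubled values, starting from 0
$result = array_reduce($doubled, fn($tmp, $v) => $tmp + $v, 0); // 6
```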

Future Scope: Multi-Statement Bodies

Some languages permit multi-statement closures with a syntax like:

(parameter_list) => {
    stmt1;
    stmt2;
    //…
}

In this case nothing would be automatically returned. This feature was included in the short closures RFC but there were two primary complaints about it:

  • If you are specifying multiple statements, doesn't that work against the purpose of being concise and short?
  • Auditing the implicitly bound variables becomes more difficult as the number of statements increases.

This RFC omitted this feature for these reasons. If arrow functions are accepted and become more common it may make sense to revisit this feature.

rfc/arrow_functions.txt · Last modified: 2018/06/28 14:35 by levim