
This is an old revision of the document!


PHP RFC: Arrow Functions

This RFC is an alternative proposal to Bob Weinand's Short Closures RFC which contains a lot of relevant information. This RFC will reference the short_closures RFC several times so readers should be familiar with it.

Introduction

Anonymous functions and closures can be verbose even when they are quite simple and contain only a single expression. Additionally, variables must be imported into the closure's scope manually, which is painful overhead for single-expression closures. This RFC proposes a shorter syntax for anonymous functions and closures and makes it easier for these new closures to capture values/variables.

As an example of the declaration overhead, consider this function that I found online:

function array_values_from_keys($arr, $keys) {
    return array_map(function($x) use ($arr) { return $arr[$x]; }, $keys);
}

The closure performs a single operation $arr[$x] and is 8 characters but requires 30 other characters (excluding whitespace). This means that roughly 79% of the closure's code is overhead (30/38). For this RFC these extra characters are called 'boilerplate'. This RFC proposes arrow functions to reduce the amount of boilerplate by having a shorter syntax and importing used variables from the outer scope implicitly. Using all possible features proposed by this RFC this would reduce to the following:

function array_values_from_keys($arr, $keys) {
    return array_map($x => $arr[$x], $keys);
}

This reduces the amount of boilerplate from 30 characters down to 4.
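
The counts for the original closure can be checked mechanically. The following is a minimal sketch; the helper name count_non_whitespace is hypothetical and not part of this RFC:

```php
<?php
// Hypothetical helper: number of characters in $code once all whitespace is removed.
function count_non_whitespace(string $code): int {
    return strlen(preg_replace('/\s+/', '', $code));
}

// The closure from array_values_from_keys() and the one expression it evaluates.
$closure    = 'function($x) use ($arr) { return $arr[$x]; }';
$expression = '$arr[$x]';

$total       = count_non_whitespace($closure);                // 38
$boilerplate = $total - count_non_whitespace($expression);    // 38 - 8 = 30

printf(
    "%d of %d non-whitespace characters (%.0f%%) are boilerplate\n",
    $boilerplate,
    $total,
    100 * $boilerplate / $total
);
// prints: 30 of 38 non-whitespace characters (79%) are boilerplate
```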

See more examples in the Examples section. The longer examples may be helpful to those who wonder why the RFC authors care about saving characters and improving the clarity of each closure.

Many other languages have ways to write closures in a succinct form as well. TODO: decide how much summary of this topic should be given here. At minimum provide links to closure documentation for a few other relevant and major languages?

Proposal

Arrow functions have a few forms:

(parameter_list) => expr
singleParam => expr
() => expr

The expr is a single expression in all cases. This expression will be evaluated and then the result will be returned:

$mul2 = ($x) => $x * 2;
 
$mul2(3); // evaluates to 6

If there is only a single parameter for the arrow function then the parentheses around the parameter list can be omitted:

$mul2 = $x => $x * 2;
 
$mul2(3); // evaluates to 6

If there are no parameters then the parentheses are required, as one would expect:

$lazy_factory = () => gen_object();

When a variable in the expression is defined in the parent scope it will be implicitly captured by value. In the following example the functions identified by $versionA and $versionB are exactly equivalent:

$y = 1;
 
$versionA = $x => $x + $y;
 
$versionB = function($x) use($y) {
    return $x + $y;
};
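
Because the capture is by value, a later change to $y in the parent scope is not visible inside the closure. The sketch below shows this with the existing closure syntax of $versionB, which runs on current PHP; under this proposal $versionA would be expected to behave the same way:

```php
<?php
$y = 1;

// Existing syntax: $y is copied into the closure when the closure is created.
$versionB = function ($x) use ($y) {
    return $x + $y;
};

// Reassigning $y afterwards does not affect the captured copy.
$y = 100;

echo $versionB(1), "\n"; // prints 2, not 101
```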

Note that searching for variables to close over will descend into nested arrow functions and use sections of inline functions. This functionality is not expected to be common but is supported.
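
As an illustration of the nested case, consider an arrow function that returns another arrow function. The RFC does not spell out the desugaring, so the expansion below is an assumption based on the equivalence shown above: the outer function has to capture $z only because the inner one uses it.

```php
<?php
$z = 10;

// Proposed syntax (not valid in current PHP):
//     $outer = $x => $y => $x + $y + $z;
//
// Assumed equivalent with existing closures. $z is never used directly by
// the outer function, but it must be captured there so that the inner
// function can capture it in turn.
$outer = function ($x) use ($z) {
    return function ($y) use ($x, $z) {
        return $x + $y + $z;
    };
};

echo $outer(1)(2), "\n"; // prints 13
```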

Arrow functions are similar to those found in ECMAScript 2015 (ES6) and lambda expressions from C#.

Type Declarations

This RFC does not permit type declarations for parameters and return types. This issue was noted multiple times on the mailing list during the short closures RFC as something that bothered voters. However, the main purpose of arrow functions is to focus on the functionality by removing boilerplate. Arrow functions have very small bodies because they are single expressions. This makes them easy to audit and type information is not expected to be helpful.

Ambiguities

Arrow functions have ambiguities with array key definitions and yield expressions that provide a key.

// Does this mean:
//   1. Create an array key of `$x` and a value with `$x * 2`
//   2. Create an array with one value that is an anonymous function
[$x => $x * 2]
 
// Does this mean:
//   1. Yield a key of `$x` and a value with `$x * 2`
//   2. Yield an anonymous function
yield $x => $x * 2;

These ambiguities are solved by preferring the existing meanings. To obtain arrow function semantics, wrap the arrow function in parentheses. For example:

// Create an array key of `$x` and a value with `$x * 2`
[$x => $x * 2];
 
// Create an array with one member that is an arrow function
[($x => $x * 2)];
 
// Yield a key of `$x` and a value with `$x * 2`
yield $x => $x * 2;
 
// Yield an anonymous function
yield ($x => $x * 2);
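
Both existing meanings are valid PHP today, and preferring them keeps such code behaving as it does now. A short sketch of the current behavior follows; the generator name keyed_double is hypothetical:

```php
<?php
$x = 3;

// Existing meaning of the array form: key $x, value $x * 2.
$array = [$x => $x * 2];
var_dump($array === [3 => 6]); // bool(true)

// Existing meaning of the yield form: yield key $x with value $x * 2.
function keyed_double($x) {
    yield $x => $x * 2;
}

foreach (keyed_double(3) as $key => $value) {
    echo "$key => $value\n"; // prints "3 => 6"
}
```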

Backward Incompatible Changes

There are no backwards incompatible changes.

Patches and Tests

An old implementation with tests can be found here: https://github.com/morrisonlevi/php-src/tree/arrow_functions. This patch was feature-complete for an old version of this RFC that used ^ to prefix the function expressions. The implementation can still be used but the portion that deals with the grammar must be rewritten.

PHP Version

This RFC targets PHP 7.NEXT, currently version 7.2.

Voting

Voting will be a simple Yes/No that requires 2/3 or more of the votes to be “Yes” to be accepted.


Accept arrow functions? (2/3 required)
Real name Yes No
Final result: 0 0
This poll has been closed.

Examples

Snippets

Taken from silexphp/Pimple:

$extended = function ($c) use ($callable, $factory) {
    return $callable($factory($c), $c);
};
 
// with arrow function:
$extended = $c => $callable($factory($c), $c);

This reduces the amount of boilerplate from 44 characters down to 4.


Taken from Doctrine DBAL:

$this->existingSchemaPaths = array_filter($paths, function ($v) use ($names) {
    return in_array($v, $names);
});
 
// with arrow function
$this->existingSchemaPaths = array_filter($paths, $v => in_array($v, $names));

This reduces the amount of boilerplate from 32 characters down to 4.


The complement function as found in many libraries:

function complement(callable $f) {
    return function(... $args) use ($f) {
        return !$f(... $args);
    };
}
 
// with arrow function:
function complement(callable $f) {
    return (... $args) => !$f(... $args);
}
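
For readers unfamiliar with complement, here is a brief usage sketch written with the existing closure syntax so that it runs on current PHP. The predicate names $isEven and $isOdd are hypothetical:

```php
<?php
// complement() as defined above, using existing closure syntax.
function complement(callable $f) {
    return function (...$args) use ($f) {
        return !$f(...$args);
    };
}

// Hypothetical predicate and its negation.
$isEven = function ($n) {
    return $n % 2 === 0;
};
$isOdd = complement($isEven);

var_dump($isOdd(3)); // bool(true)

// array_filter() preserves keys, so re-index the result for display.
$odds = array_values(array_filter([1, 2, 3, 4], $isOdd));
echo implode(', ', $odds), "\n"; // prints "1, 3"
```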

Longer Examples

Taken from Pinq's example in their README.md with only some slight modifications:

$youngPeopleDetails = $people
    ->where(function ($row) use($maxAge) { return $row['age'] <= $maxAge; })
    ->orderByAscending(function ($row) { return $row['firstName']; })
    ->thenByAscending(function ($row) { return $row['lastName']; })
    ->take(50)
    ->indexBy(function ($row) { return $row['phoneNumber']; })
    ->select(function ($row) { 
        return [
            'fullName'    => $row['firstName'] . ' ' . $row['lastName'],
            'address'     => $row['address'],
            'dateOfBirth' => $row['dateOfBirth'],
        ]; 
    });

With arrow functions:

$youngPeopleDetails = $people
    ->where($row => $row['age'] <= $maxAge)
    ->orderByAscending($row => $row['firstName'])
    ->thenByAscending($row => $row['lastName'])
    ->take(50)
    ->indexBy($row => $row['phoneNumber'])
    ->select($row => [
        'fullName'    => $row['firstName'] . ' ' . $row['lastName'],
        'address'     => $row['address'],
        'dateOfBirth' => $row['dateOfBirth'],
    ]);

The following examples were given to me by tpunt:

$result = Collection::from([1, 2])
    ->map(function($v) {
        return $v * 2;
    })
    ->reduce(function($tmp, $v) {
        return $tmp + $v;
    }, 0);
 
echo $result; //6
 
// with arrow functions:
$result = Collection::from([1, 2])
    ->map($v => $v * 2)
    ->reduce(($tmp, $v) => $tmp + $v, 0);
 
echo $result; //6

Here is an example with our current closures:

function groupByKey($collection, $key)
{
    $generatorFactory = function () use ($collection, $key) {
        return groupBy(
            filter(
                $collection,
                function ($item) use ($key) {
                    return isCollection($item) && has($item, $key);
                }
            ),
            function($value) use ($key) {
                return get($value, $key);
            }
        );
    };
 
    return new Collection($generatorFactory);
}

And with arrow functions:

function groupByKey($collection, $key)
{
    $generatorFactory =
        () => groupBy(
            filter(
                $collection,
                $item => isCollection($item) && has($item, $key)
            ),
            $value => get($value, $key)
        );
 
    return new Collection($generatorFactory);
}

Future Scope

Multi-Statement Bodies

Some languages permit multi-statement closures with a syntax like:

(parameter_list) => {
    stmt1;
    stmt2;
    //…
}

In this case nothing would be automatically returned. This feature was included in the short closures RFC but there were two primary complaints about it:

  • If you are specifying multiple statements doesn't that work against the purpose of being concise and short?
  • Auditing the implicitly bound variables becomes more difficult as the number of statements increases.

This RFC omitted this feature for these reasons. If arrow functions are accepted and become more common it may make sense to revisit this feature.

rfc/arrow_functions.1474999670.txt.gz · Last modified: 2017/09/22 13:28 (external edit)