PHP RFC: Short Closures

Introduction

Anonymous functions, also known as closures, allow the creation of functions which have no specified name. They are most useful as the value of callback parameters, but they have many other uses.

The current implementation of anonymous functions in PHP is quite verbose compared to other languages. This makes anonymous functions harder to use than they need to be: there is more to type and, more importantly, code that uses them is harder to read (and so to maintain).

A better syntax encourages functional code and partial application (see the examples), which are powerful tools that people writing PHP code should be able to use as easily as they can in other languages.

Proposal

This RFC proposes the introduction of the ~> operator to allow shorthand creation of anonymous functions to reduce the amount of 'boilerplate' needed to use them.

Current code:

function ($x) {
    return $x * 2;
}

would be equivalent to the new syntax:

$x ~> $x * 2

Anonymous functions defined in this way will automatically import, as if they were listed in use (), all of the (compiled) variables used in the Closure body. See the 'Variable binding' section for more details.

Syntax

The syntax used to define a shorthand anonymous function would be:

  • Parameters. When the function has a single parameter the surrounding parentheses (aka round brackets) may be omitted. For functions with multiple parameters the parentheses are required.
  • The new short closure operator ~>
  • The body of the anonymous function. When the body of the function is a single expression the surrounding curly brackets and return keyword may be omitted. When the body of the function is not a single expression, the braces (and a return statement, if a value is to be returned) are required.

I.e. all of the following would be equivalent:

$x ~> $x * 2
$x ~> { return $x * 2;}
($x) ~> $x * 2
($x) ~> { return $x * 2; }

Omitting the parentheses when the function has multiple parameters will result in a parse error:

$x, $y ~> { return $x + $y; }  // Unexpected ','

Using the return keyword when the braces have been omitted will similarly give a parse error:

($x, $y) ~> return $x + $y; // Unexpected T_RETURN

The concrete syntax is as follows (~> is right associative and has the lowest possible precedence):

  ( parameter_list ) ~> expression
| ( parameter_list ) ~> { statements }
/* return by reference */
| &( parameter_list ) ~> expression
| &( parameter_list ) ~> { statements }
/* shorthand form for just one parameter */
| $variable ~> expression
| $variable ~> { statements }

When a bare expression is used as the right-hand operand, its result will be the return value of the Closure.
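
Because ~> is right associative, a chain such as $x ~> $y ~> $x + $y would group as $x ~> ($y ~> $x + $y), that is, a function that returns a function. The following is a minimal sketch of that reading written with the existing closure syntax only, so it runs on current PHP; the names $add and $addTwo are made up for illustration and do not appear in the RFC.

```php
<?php
// Hypothetical short form: $add = $x ~> $y ~> $x + $y;
// Grouped from the right, that corresponds to the nested closure below.
$add = function ($x) {
    return function ($y) use ($x) {
        return $x + $y;
    };
};

$addTwo = $add(2);      // inner closure with $x bound to 2
echo $addTwo(5), "\n";  // 7
```

Note that the inner closure needs an explicit use ($x) here; under the proposal that binding would happen automatically.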

Variable binding

The position of this RFC is that the shorthand syntax exists to make anonymous functions as easy to use as possible. Therefore, rather than requiring individual variables to be bound to the closure through the use ($x) syntax, all variables used in the body of the anonymous function will automatically be bound to the closure from the defining scope.

For example:

$a = 1;
function foo(array $input, $b) {
    $c = rand(0, 4);
 
    return array_map($x ~> ($x * 2) + $b + $c, $input);
}

Variables $b and $c would be bound automatically to the anonymous function, and so be usable inside it. Variable $a is not in the scope of the function foo(), so it is not bound and cannot be used inside the closure. For example, this code will raise a notice:

$a = 1;
function foo(array $input, $b) {
    // Notice: Undefined variable: a in %s on line %d
    return array_map($x ~> ($x * 2) + $b + $a, $input);
}

If a user wants to avoid binding all variables automatically they can use the current syntax to define the anonymous function.
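
For comparison, here is a sketch of how the first example in this section is written with the current syntax, where every captured variable has to be named in use (). The function name fooExplicit is hypothetical, and rand(0, 4) has been replaced by a fixed value so that the result is predictable; both are assumptions made for this illustration only.

```php
<?php
function fooExplicit(array $input, $b) {
    $c = 4; // fixed stand-in for rand(0, 4)

    // $b and $c must be listed explicitly in use ();
    // the short form would bind them automatically.
    return array_map(function ($x) use ($b, $c) {
        return ($x * 2) + $b + $c;
    }, $input);
}

print_r(fooExplicit([1, 2, 3], 10)); // elements 16, 18 and 20
```

Leaving a variable out of the use () list keeps it out of the closure, which is how the current syntax gives control over what is bound.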

Examples

These examples cover some simple operations and show how the shorthand syntax is easier to read than the existing longhand syntax.

Array sort with user function

Sort $array, which contains objects with a property named val, in reverse order of val.

Current syntax:

usort(
    $array,
    function ($a, $b) {
        return -($a->val <=> $b->val);
    }
);

New syntax:

usort($array, ($a, $b) ~> -($a->val <=> $b->val));
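
To see why negating the comparison reverses the order, the current-syntax version can be run against some sample data. This is only a sketch: the objects and their values are invented for the illustration.

```php
<?php
// Sample objects with a val property (data invented for this sketch).
$array = [
    (object) ['val' => 2],
    (object) ['val' => 5],
    (object) ['val' => 1],
];

// <=> returns -1, 0 or 1; negating it flips the comparison,
// so larger values sort first.
usort($array, function ($a, $b) {
    return -($a->val <=> $b->val);
});

$vals = array_map(function ($o) {
    return $o->val;
}, $array);

echo implode(', ', $vals), "\n"; // 5, 2, 1
```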

Extracting data from an array and summing it

Current syntax:

function sumEventScores($events, $scores) {
    $types = array_map(
        function ($event) {
            return $event['type'];
        },
        $events
    );
 
    return array_reduce(
        $types,
        function ($sum, $type) use ($scores) {
            return $sum + $scores[$type];
        }
    );
}

New syntax:

function sumEventScores($events, $scores) {
    $types = array_map($event ~> $event['type'], $events);
    return array_reduce($types, ($sum, $type) ~> $sum + $scores[$type]);
}

The calling code for this function would be:

$events = array(
    array(
        'type' => 'CreateEvent',
        'date' => '2015-05-01T16:19:33+00:00'
    ),
    array(
        'type' => 'PushEvent',
        'date' => '2015-05-01T16:19:54+00:00'
    ),
    //...
);
 
$scores = [
    'PushEvent'          => 5,
    'CreateEvent'        => 4,
    'IssuesEvent'        => 3,
    'CommitCommentEvent' => 2
];
 
sumEventScores($events, $scores);

Partial application

The shorthand syntax makes it easier to write functional code such as a reducer, because shorthand anonymous functions can be chained together easily.

Current syntax:

function reduce(callable $fn) {
    return function($initial) use ($fn) {
        return function ($input) use ($fn, $initial) {
            $accumulator = $initial;
            foreach ($input as $value) {
                $accumulator = $fn($accumulator, $value);
            }
            return $accumulator;
        };
    };
}

New syntax:

function reduce(callable $fn) {
    return $initial ~> $input ~> {
        $accumulator = $initial;
        foreach ($input as $value) {
            $accumulator = $fn($accumulator, $value);
        }
        return $accumulator;
    };
}
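
As a sketch of how such a reducer might be called, the example below repeats the current-syntax definition (since that is what runs on existing PHP) and applies it one argument at a time. The names $sum and $sumFromZero and the sample input are assumptions for the illustration.

```php
<?php
function reduce(callable $fn) {
    return function ($initial) use ($fn) {
        return function ($input) use ($fn, $initial) {
            $accumulator = $initial;
            foreach ($input as $value) {
                $accumulator = $fn($accumulator, $value);
            }
            return $accumulator;
        };
    };
}

// Each call supplies one argument and returns the next function in the chain.
$sum = reduce(function ($carry, $value) {
    return $carry + $value;
});
$sumFromZero = $sum(0);

echo $sumFromZero([1, 2, 3, 4]), "\n"; // 10
```

Supplying the arguments in separate steps is what makes this partial application: $sumFromZero can be reused on any number of inputs without repeating the callback or the initial value.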

Symbol choice

The symbol ~> was chosen as it is a mnemonic device to help programmers understand that the variable is being brought to a function. It is also unambiguous as it has not been used elsewhere in PHP.

Hack currently implements shorthand anonymous functions using the ==> symbol. The position of this RFC is that the ==> symbol is too similar to the => (double arrow) sign and would cause confusion, either because people think it has something to do with key-value pairs, or because a simple typo could produce valid but incorrect code. For example:

This returns an array containing an anonymous function:

return [$x ==> $x * 2];

This returns an array containing a single key-value pair, provided $x is already a defined variable:

return [$x => $x * 2];
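
The second case can be checked on existing PHP. In this sketch (the value 3 and the name $result are arbitrary choices for the illustration), the => version silently builds a one-element array rather than a function, which is the kind of valid but incorrect code a typo would produce:

```php
<?php
$x = 3;

// With "=>" this is an ordinary array literal: key $x, value $x * 2.
$result = [$x => $x * 2];

var_dump(is_array($result));            // bool(true)
var_dump($result instanceof Closure);   // bool(false)
var_dump($result[3]);                   // int(6)
```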

Backward Incompatible Changes

This RFC doesn't affect backwards compatibility.

Proposed PHP Version(s)

The next PHP 7.x release, which is PHP 7.1.

Future Scope

Other uses for the ~> operator

This RFC is solely for using the shorthand anonymous functions as closures. It does not cover any other usage of the shorthand function definition such as:

class Foo {
    private $bar;
 
    getBar ~> $this->bar;
    setBar($bar) ~> $this->bar = $bar;
}

Such uses are outside the scope of this RFC.

Type Hints and Return Types

This RFC does not include type hints or return types.

Type hints are not included because of technical problems in the parser, and the RFC author is not sure whether they should be added at all. Anyone who manages to solve these technical issues should feel free to propose them in a future RFC for further discussion. Since introducing only half of the type system would be inconsistent, this RFC proposes not to include return types either.

As an alternative, the current syntax for defining Closures can still be used where types are needed.

Proposed Voting Choices

This RFC is a language change and as such needs a 2/3 majority.

It will be a simple yes/no vote.

Patch

rfc/short_closures.1430506020.txt.gz · Last modified: 2017/09/22 13:28 (external edit)