
PHP RFC: Generator Return Expressions

Abstract

PHP's generators are useful for both iteration and cooperative multitasking. However, because generator functions cannot specify return values, their usefulness for multitasking in coroutine contexts is artificially limited. This RFC proposes the ability to both specify and access Generator return values, laying the groundwork for future sub-generator returns. The proposal is a prerequisite for the conceptually related Generator Delegation RFC.

Status Quo

Generators as currently implemented trigger a fatal error if a return statement specifies an associated expression:

<?php
function foo() {
    yield 0;
    yield 1;
 
    // Fatal error: Generators cannot return values using "return" in %s on line %d
    return 42;
}

Proposal

Code Examples Using Proposed Behavior

Calling Generator::getReturn() to retrieve the return value after iteration:

<?php
function foo() {
    yield 1;
    yield 2;
    return 42;
}
 
$bar = foo();
foreach ($bar as $element) {
    echo $element, "\n";
}
 
var_dump($bar->getReturn());
 
// 1
// 2
// int(42)

Calling Generator::getReturn() in the absence of a return statement:

<?php
function foo() {
    yield 1;
    yield 2;
    yield 3;
}
 
$bar = foo();
while($bar->valid()) {
    $bar->next();
}
 
assert($bar->getReturn() === null);

Calling Generator::getReturn() while the generator is still valid:

<?php
function foo() {
    yield 1;
    yield 2;
    return 42;
}
 
$bar = foo();
$bar->current();
$bar->next();
 
assert($bar->valid());
 
// Throws an exception because the generator is still valid
$returnValue = $bar->getReturn();
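
The failure mode above can be guarded against by catching the exception and reading the return value only after the generator has finished. The following is a minimal sketch under the proposed semantics; the function name countThenReturn() is hypothetical, and the sketch assumes the exception thrown by getReturn() can be caught as Exception:

```php
<?php
// Hypothetical generator used only for this illustration.
function countThenReturn() {
    yield 1;
    yield 2;
    return 42;
}

$gen = countThenReturn();
$gen->current(); // run to the first yield; the generator is still valid

$threw = false;
try {
    $gen->getReturn(); // too early: the generator has not returned yet
} catch (Exception $e) {
    $threw = true;
}

// Drain the remaining values so the generator reaches its return statement.
while ($gen->valid()) {
    $gen->next();
}

$value = $gen->getReturn();

var_dump($threw); // bool(true)
var_dump($value); // int(42)
```

Checking valid() alone does not prove a return value exists, since a generator that threw is also no longer valid; the try/catch is what protects callers that cannot be sure the generator completed normally.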

Calling Generator::getReturn() after the generator has thrown:

<?php
function foo() {
    throw new Exception;
    yield 1;
    yield 2;
    return 42;
}
 
$bar = foo();
 
set_exception_handler(function($e1) use ($bar) {
    try {
        $bar->getReturn();
    } catch (Exception $e2) {
        // Generator::getReturn() threw; trying to use a return
        // value from a generator that didn't actually complete
        // is a logic error we want to prevent.
    }
});
 
$bar->next();

Use-Case: Coroutine Return Values

Generators are particularly useful for their ability to suspend execution and resume at a later time. This capacity allows applications to cooperatively multitask discrete units of processing work. However, the inability to explicitly return values leaves coroutines in a situation where they're able to process concurrent tasks but have no standard way to access the results of those computations. Consider:

<?php
$gen = function() {
    $foo = yield myAsyncFoo(); // resume here when promised result returns
    $bar = yield myAsyncBar($foo); // resume here when promised result returns
 
    // Relying on the final yield as the "return" value here is ambiguous
    yield $bar + 42;
};

In the above code we can assume the final yield is the “return” value, but this is difficult to read and may not reflect the actual intent of the generator author. Userland code can currently work around this limitation and make such “returns” more explicit by using the key => value yield form:

<?php
$gen = function() {
    $foo = yield myAsyncFoo();
    $bar = yield myAsyncBar($foo);
    yield "return" => $bar + 42;
};

The above example takes advantage of metadata about the yielded value in the form of the yielded key. While this approach can indicate intent and make code more readable, it is non-standard, so concurrency frameworks are forced to invent their own domain-specific conventions for representing asynchronous coroutine execution results.

Generator return expressions as proposed here alleviate this problem as return statements have applicable semantics, known characteristics and low cognitive overhead:

<?php
$gen = function() {
    $foo = yield myAsyncFoo();
    $bar = yield myAsyncBar($foo);
    return $bar + 42; // unambiguous execution "result"
};
?>
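
To illustrate how a framework might consume such a result, here is a minimal sketch of a coroutine driver. The run() helper is hypothetical, and myAsyncFoo() and myAsyncBar() are stubbed as plain synchronous functions purely so the example is runnable; a real scheduler would wait for each yielded promise to resolve before resuming the generator:

```php
<?php
// Hypothetical synchronous stand-ins for the asynchronous calls above.
function myAsyncFoo() { return 1; }
function myAsyncBar($foo) { return $foo + 1; }

// Hypothetical driver: treats every yielded value as already resolved,
// sends it straight back into the coroutine, and finally hands the
// coroutine's return value to the caller.
function run(Generator $gen) {
    while ($gen->valid()) {
        $gen->send($gen->current());
    }
    return $gen->getReturn();
}

$gen = function() {
    $foo = yield myAsyncFoo();
    $bar = yield myAsyncBar($foo);
    return $bar + 42; // unambiguous execution "result"
};

$result = run($gen());
var_dump($result); // int(44)
```

The driver needs no convention such as a special yield key to find the result: the yields carry intermediate work and the return carries the outcome.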

Reference Returns

Generators currently utilize the `&` operator to indicate values will be yielded by-reference:

<?php
function &gen_reference() {
    $value = 3;
 
    while ($value > 0) {
        yield $value;
    }
}
 
foreach (gen_reference() as &$number) {
    echo (--$number).'... ';
}
?>

Because function& already marks yielded values as by-reference, there is no way to differentiate between by-reference yield values and by-reference return values. While it would be possible to use function& to mark returns as by-reference as well, this proposal's position is that no correlation exists between the “reference-ness” of yielded values and return values. Instead of introducing new tokens or syntax to allow by-reference returns, this proposal always uses ZVAL_COPY on return zvals. In short: by-reference returns are not supported in generator return expressions.
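
The distinction can be sketched as follows, assuming the semantics described above. The function name refCounter() is hypothetical; the & on the function affects only the yielded value, while the returned value is handed back as a copy:

```php
<?php
// Hypothetical by-reference generator used only for this illustration.
function &refCounter() {
    $value = 1;
    yield $value;  // yielded by reference because of function&
    return $value; // returned by value (a copy), regardless of function&
}

$gen = refCounter();
foreach ($gen as &$number) {
    $number = 10; // writes through the reference into the generator's $value
}
unset($number); // break the reference left over from the loop

$returned = $gen->getReturn();
var_dump($returned); // int(10)
```

The return value reflects the write made through the yielded reference, but the caller receives it as an ordinary value rather than as a reference into the generator.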

Other Languages

Other popular dynamic languages already support generator return expressions:

Generator Returns in Python

Python 3.3 added the ability to return expressions in sub-generators and have these results returned to the parent generator.

def foo():
    return 1
    yield 2 # never reached
 
def bar():
    x = yield from foo()
    print(x)
 
list(bar()) # iterating bar() runs it to completion; outputs 1

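Generator Returns in JavaScript

JavaScript ES6 generators produce objects with a value property holding the yielded value and a boolean done property indicating when iteration has completed. A generator's return value is delivered as the value of the final object, the one whose done property is true: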

function *foo(x) {
    var y = 2 * (yield (x + 1));
    var z = yield (y / 3);
    return (x + y + z);
}

Backward Incompatible Changes

None

Proposed PHP Version(s)

PHP 7

Unaffected PHP Functionality

Existing generator semantics remain unmodified. Only new functionality is added to allow generators to return expressions and differentiate these expressions from standard yields via the new Generator::getReturn() method.

Future Scope

The proposed behavior lays the groundwork for future sub-generator functionality using yield from to break apart functional units into multiple generators. In such cases a sub-generator's return value is sent to the parent generator once its iteration completes. The return expression capability is needed to implement this behavior in future PHP versions.

A brief example of how yield from might work in the future, building on the return expressions proposed in this RFC:

<?php
function foo() {
    yield 1;
    return yield from bar();
}
 
function bar() {
    yield 2;
    yield 3;
    return 4;
}
 
$baz = foo();
foreach ($baz as $element) {
    echo $element, "\n";
}
echo $baz->getReturn(), "\n";
 
// 1
// 2
// 3
// 4

Vote

Voting begins 2015-03-09 and ends on 2015-03-16.

A 2/3 “Yes” vote is required to implement this proposal.

Allow Generator return expressions in PHP7
Real name                   Yes   No
aharvey (aharvey)           Yes
ashnazg (ashnazg)           Yes
bishop (bishop)             Yes
brandon (brandon)           Yes
bwoebi (bwoebi)             Yes
Damien Tournoud (damz)      Yes
datibbaw (datibbaw)         Yes
daverandom (daverandom)     Yes
diegopires (diegopires)     Yes
dm (dm)                     Yes
dragoonis (dragoonis)       Yes
duodraco (duodraco)         Yes
eliw (eliw)                 Yes
indeyets (indeyets)         Yes
jedibc (jedibc)             Yes
jmikola (jmikola)           Yes
jpauli (jpauli)             Yes
jwage (jwage)               Yes
kinncj (kinncj)             Yes
kk (kk)                     Yes
krakjoe (krakjoe)           Yes
lcobucci (lcobucci)         Yes
leigh (leigh)               Yes
lstrojny (lstrojny)         Yes
mbeccati (mbeccati)         Yes
mike (mike)                 Yes
nikic (nikic)               Yes
pauloelr (pauloelr)         Yes
pollita (pollita)           Yes
rdlowrey (rdlowrey)         Yes
salathe (salathe)           Yes
stelianm (stelianm)         Yes
thekid (thekid)             Yes
Final result:               33    0
This poll has been closed.

Vote closed at 14:50 UTC 2015-03-17.

NOTE: the related Generator Delegation RFC depends on the acceptance of this proposal.

Patches and Tests

https://github.com/php/php-src/pull/1096

The linked patch was written by Nikita Popov and is intended as “final” excepting any changes that may arise during RFC discussion.

Implementation

TBD

References

New in Python 3.3: yield from