PHP's generators are unequivocally useful both for iteration and cooperative multi-tasking. However, the inability of generator functions to specify return values artificially limits their usefulness for multitasking in coroutine contexts. This RFC proposes the ability to both specify and access Generator return values while laying the groundwork for future sub-generator returns. The proposal is a prerequisite for the conceptually related Generator Delegation RFC.
Generators as currently implemented trigger a fatal error if a return statement specifies an associated expression:
<?php function foo() { yield 0; yield 1; // Fatal error: Generators cannot return values using "return" in %s on line %d return 42; }
return expressions.Generator::getReturn() to retrieve returned values.Generator::getReturn() while a generator is still valid will throw. This is consistent with the behavior seen when calling Generator::rewind() after iteration has already started. The logic behind this decision is that we should prevent calling code from mistaking a yet-to-be-computed return value (null) for the actual return value. This proposal's position is that invoking Generator::getReturn() on a still-valid generator (or one that has thrown) is a logic error.
Calling Generator::getReturn() to retrieve the return value after iteration:
<?php function foo() { yield 1; yield 2; return 42; } $bar = foo(); foreach ($bar as $element) { echo $element, "\n"; } var_dump($bar->getReturn()); // 1 // 2 // int(42)
Calling Generator::getReturn() in the absence of a return statement:
<?php function foo() { yield 1; yield 2; yield 3; } $bar = foo(); while($bar->valid()) { $bar->next(); } assert($bar->getReturn() === null);
Calling Generator::getReturn() while the generator is still valid:
<?php function foo() { yield 1; yield 2; return 42; } $bar = foo(); $bar->current(); $bar->next(); assert($bar->valid()); // Throws an exception because the generator is still valid $returnValue = $bar->getReturn();
Calling Generator::getReturn() after the generator has thrown:
<?php function foo() { throw new Exception; yield 1; yield 2; return 42; } $bar = foo(); set_exception_handler(function($e1) use ($bar) { try { $bar->getReturn(); } catch (Exception $e2) { // Generator::getReturn() threw; trying to use a return // value from a generator that didn't actually complete // is a logic error we want to prevent. } }); $bar->next();
Generators are particularly useful for their ability to suspend execution and resume at a later time. This capacity allows applications to cooperatively multitask discrete units of processing work. However, the inability to explicitly return values leaves coroutines in a situation where they're able to process concurrent tasks but have no standard way to access the results of those computations. Consider:
<?php $gen = function { $foo = yield myAsyncFoo(); // resume here when promised result returns $bar = yield myAsyncBar($foo); // resume here when promised result returns // Relying on the final yield as the "return" value here is ambiguous yield $bar + 42; };
In the above code we can assume the final yield is the “return” value but this is difficult to read and further it may not be the actual intent of the generator author. Userland code can currently work around this limitation to make such “returns” more explicit using the key => value yield form:
<?php $gen = function { $foo = yield myAsyncFoo(); $bar = yield myAsyncBar($foo); yield "return" => $bar + 42; };
The above example takes advantage of “meta data” about the yielded value in the form of the yielded key. While this approach can work to indicate intent and make code more readable it suffers from the failing that it is non-standard and concurrency frameworks are forced to fractal out their own domain-specific conventions for representing asynchronous coroutine execution results.
Generator return expressions as proposed here alleviate this problem as return statements have applicable semantics, known characteristics and low cognitive overhead:
<?php $gen = function { $foo = yield myAsyncFoo(); $bar = yield myAsyncBar($foo); return $bar + 42; // unambiguous execution "result" }; ?>
Generators currently utilize the `&` operator to indicate values will be yielded by-reference:
<?php function &gen_reference() { $value = 3; while ($value > 0) { yield $value; } } foreach (gen_reference() as &$number) { echo (--$number).'... '; } ?>
As function& is already in place to modify yield values we have no way to differentiate between by-reference yield values and by-reference return values. While it would be possible to use function& to mark returns as by-reference, this proposal's position is that no correlation exists between the “reference-ness” of yielded values and return values. Instead of introducing new tokens or syntax to allow by-reference returns this proposal always uses ZVAL_COPY on return zvals. In short: by-reference returns are not supported in generator return expressions.
Other popular dynamic languages currently support generator return expressions ...
Generator Returns in Python
Python 3.3 added the ability to return expressions in sub-generators and have these results returned to the parent generator.
def foo(): return 1 yield 2 # never reached def bar(): x = yield from foo() print(x) bar() # outputs 1
Generator Returns in Javascript
Javascript ES6 generators yield objects with a value property enumerating the yielded value and a boolean done property to indicate when iteration has completed:
function *foo(x) { var y = 2 * (yield (x + 1)); var z = yield (y / 3); return (x + y + z); }
None
PHP7
Existing generator semantics remain unmodified. Only new functionality is added to allow generators to return expressions and differentiate these expressions from standard yields via the new Generator::getReturn() method.
The proposed behavior lays the groundwork for future sub-generator functionality using yield from to break apart functional units into multiple generators. In such cases a sub-generator's return value is sent to the parent generator upon completed iteration. return expression capability is needed to implement this behavior in future PHP versions.
A brief example of how yield from might be implemented in the future using the return expressions proposed in the current RFC:
<?php function foo() { yield 1; return yield from bar(); } function bar() { yield 2; yield 3; return 4; } $baz = foo(); foreach ($baz as $element) { echo $element, "\n"; } echo $baz->getReturn(), "\n"; // 1 // 2 // 3 // 4
Voting begins 2015-03-09 and ends on 2015-03-16.
A 2/3 “Yes” vote is required to implement this proposal.
. Vote closed at 14:50 UTC 2015-03-17.
NOTE: the related Generator Delegation RFC depends on the acceptance of this proposal.
https://github.com/php/php-src/pull/1096
The linked patch was written by Nikita Popov and is intended as “final” excepting any changes that may arise during RFC discussion.
TBD