PHP RFC: Generator Delegation
- Version: 0.2.0
- Date: 2015-03-01
- Author: Daniel Lowrey rdlowrey@php.net
- Contributors: Bob Weinand bwoebi@php.net
- Status: Implemented in 7.0
- First Published at: http://wiki.php.net/rfc/generator-delegation
Abstract
This RFC proposes new yield from <expr>
syntax allowing Generator
functions to delegate operations to Traversable
objects and arrays. The proposed syntax allows the factoring of yield
statements into smaller conceptual units in the same way that discrete class methods simplify object-oriented code. The proposal is conceptually related to and requires functionality proposed by the forerunning Generator Return Expressions RFC.
Proposal
The following new syntax is allowed in the body of generator functions:
yield from <expr>
In the above code <expr>
is any expression that evaluates to a Traversable
object or array. This
traversable is advanced until no longer valid, during which time it sends/receives values directly
to/from the delegating generator's caller. If the <expr>
traversable is a Generator
it is
considered a subgenerator whose eventual return value is returned to the delegating generator as the
result of the originating yield from
expression.
Terminology
- A “delegating generator” is a
Generator
in which theyield from <expr>
syntax appears.
- A “subgenerator” is a
Generator
used in the<expr>
portion of theyield from <expr>
syntax.
Prosaically
- Each value yielded by the traversable is passed directly to the delegating generator's caller.
- Each value sent to the delegating generator's
send()
method is passed to the subgenerator'ssend()
method. If the delegate traversable is not a generator any sent values are ignored as non-generator traversables have no capacity to receive such values.
- Exceptions thrown by traversable/subgenerator advancement are propagated up the chain to the delegating generator.
- Upon traversable completion
null
is returned to the delegating generator if the traversable is NOT a generator. If the traversable is a generator (subgenerator) its return value is sent to the delegating generator as the value of theyield from
expression.
Formally
The proposed syntax
$g = function() { return yield from <expr>; };
is equivalent to
$g = function() { $iter = <expr>; $isSubgenerator = $iter instanceof Generator; $received = null; $send = true; while ($iter->valid()) { if ($isSubgenerator) { $next = $send ? $iter->send($received) : $iter->throw($received); } else { $next = $iter->current(); $iter->next(); } try { $received = yield $next; $send = true; } catch (Exception $e) { if ($isSubgenerator) { $received = $e; $send = false; } else { throw $e; } } } return $isSubgenerator ? $iter->getReturn() : null; };
Rationale
A major impetus for generator delegation is refactoring and readability. At its core, this is the same guiding principle employed when returning values from discrete class methods. Imagine a class method from which no return value is possible. In such a scenario we could store the result in an instance property and subsequently retrieve it from the context of the calling code. However, this kind of superfluous state quickly becomes difficult to reason about.
Additionally, we recognize that functional contexts lack the additional stateful context in which
to store results. In the absence of the standard input-output paradigm there's no way to access the
eventual result of a generator's pausable computations. Of course, it is possible to work around this
suboptimal situation with references and closure use
binding but these indirect approaches are
instantly eliminated when Generator functions are allowed to return expressions. Return values minimize
cognitive overhead in such cases by allowing programmers to directly associate an individual operation with
its eventual result.
Generator delegation -- at its heart -- is nothing more than the application of standard factoring practices to allow the decomposition of complex operations into smaller cohesive units.
Use-Case: Factored Generator Computations
In this simple example we demonstrate the use of yield from
to factor out a more complex operation
into multiple discrete generators. Callers of myGeneratorFunction
do not care from whence the
individual yielded values came (nor should they). Instead, they simply iterate over the yielded values
awaiting the generator function's eventual return.
function myGeneratorFunction($foo) { // ... do some stuff with $foo ... $bar = yield from factoredComputation1($foo); // ... do some stuff with $bar ... $baz = yield from factoredComputation2($bar); return $baz; } function factoredComputation1($foo) { yield ...; // pseudo-code (something we factored out) yield ...; // pseudo-code (something we factored out) return 'zanzibar'; } function factoredComputation2($bar) { yield ...; // pseudo-code (something we factored out) yield ...; // pseudo-code (something we factored out) return 42; }
Use-Case: Generators as Lightweight Threads
The defining feature of Generator functions is their support for suspending execution for later resumption. This capability gives applications a mechanism to implement asynchronous and concurrent architectures even in a traditionally single-threaded language like PHP. With simple userland task scheduling systems interleaved generators become lightweight threads of execution for concurrent processing tasks.
In the absence of generator return values, though, applications face an environment where “background” tasks can be offloaded without a standardized way to return the eventual result. This is one reason why this proposal depends on the acceptance of the Generator Return Expressions RFC. The other reason return values are required stems from the previously discussed refactoring principle. Specifically: code using generators for threaded execution can benefit from subgenerators behaving like ordinary functions.
Using the proposed syntax an ordinary function foo
$baz = foo($bar);
can be transformed into a subgenerator delegation of the form
$baz = yield from foo($bar);
where foo
is a pausable generator. In this manner applications can create powerful userland concurrency
abstractions without the cognitive overhead often associated with threaded multitasking. In the
above example generator delegation allows the language to do the heavy lifting while the programmer
need only concern herself with the input, $bar
, and the eventual output, $baz
.
In short: generator delegation allows programmers to reason about the behaviour of the concurrent code
simply by thinking of foo()
as an ordinary function which can be suspended using a yield
statement.
NB: The actual implementation of coroutine task schedulers is outside the scope of this document. This RFC focuses only on the language-level machinery needed to make such tools more feasible in userland. It should be obvious that simply moving code into a generator function will not somehow make it magically concurrent.
Basic Examples
Delegating to another generator (subgenerator)
<?php function g1() { yield 2; yield 3; yield 4; } function g2() { yield 1; yield from g1(); yield 5; } $g = g2(); foreach ($g as $yielded) { var_dump($yielded); } /* int(1) int(2) int(3) int(4) int(5) */
Delegating to an array
<?php function g() { yield 1; yield from [2, 3, 4]; yield 5; } $g = g(); foreach ($g as $yielded) { var_dump($yielded); } /* int(1) int(2) int(3) int(4) int(5) */
Delegating to non-generator traversables
<?php function g() { yield 1; yield from new ArrayIterator([2, 3, 4]); yield 5; } $g = g(); foreach ($g as $yielded) { var_dump($yielded); } /* int(1) int(2) int(3) int(4) int(5) */
The yield from
expression value
<?php function g1() { yield 2; yield 3; return 42; } function g2() { yield 1; $g1result = yield from g1(); yield 4; return $g1result; } $g = g2(); foreach ($g as $yielded) { var_dump($yielded); } var_dump($g->getReturn()); /* int(1) int(2) int(3) int(4) int(42) */
Selected Implementation Details
The delegation implementation builds on the patch submitted as part of the Generator Return Expressions RFC. This implementation adds the following new parsing token:
%token T_YIELD_FROM “yield from (T_YIELD_FROM)”
The primary advantage of this approach is the addition of a readable and semantically meaningful syntax
without reserving a new from
keyword.
Subgenerator Keys
As Generator iteration is always associated with an accompanying key there exists the potential that a given delegation may return the same key multiple times from separate individual generators. This was deemed unproblematic for three primary reasons.
- Multiple occurrences of the same key can easily occur in the existing generator implementation should a function
yield
the same key more than once.
- If a caller derives semantic meaning from yielded keys the burden is placed on generator value producers to yield keys sensible for problem domain in which they exist. The burden here is not on the language to avoid duplicate keys.
- The
Traversable
interface represents neither a hash map nor a traditional contiguously indexed array and its implementations have no special requirement to expose data access to API consumers via index keys. As such, key recurrence exposes no risk for overwriting existing internal generator data.
Shared Subgenerator Behavior
PHP generator functions are implemented as stateful object instances. Although “sharing” valid generator instances does not present any immediately obvious use-cases, such behavior is still supported. Here we note some of the characteristics of shared generator functions.
If a “shared” subgenerator that has previously iterated to completion is passed in a yield from
expression its completed return value is immediately returned to the delegating generator. In code:
function subgenerator() { yield 1; return 42; } function delegator(Generator $shared) { return yield from $shared; } $shared = subgenerator(); while($shared->valid()) { $shared->next(); } $delegator = delegator($shared); foreach ($delegator as $value) { var_dump($value); } var_dump($delegator->getReturn()); /* int(42) // This is our only output because no values are yielded // from the already-completed shared subgenerator */
Manually advancing a shared subgenerator outside the context of the delegating generator will not result in an error. In code:
function subgenerator() { yield 1; yield 2; yield 3; yield 4; return 42; } function delegator(Generator $shared) { return yield from $shared; } $shared = subgenerator(); $shared->next(); $delegator = delegator($shared); var_dump($delegator->current()); $shared->next(); while($delegator->valid()) { var_dump($delegator->current()); $delegator->next(); } var_dump($delegator->getReturn()); /* int(2); int(3); int(4); int(42) */
Error States
There are two scenarios in which yield from
usage can result in an EngineException
:
- Using
yield from <expr>
where <expr> evaluates to a generator which previously terminated with an uncaught exception results in anEngineException
.
- Using
yield from <expr>
where <expr> evaluates to something that is neitherTraversable
nor an array throws anEngineException
.
Rejected Ideas
The original version of this RFC proposed a yield *
syntax. The yield *
syntax was rejected in favor of
yield from
on the basis that *
would break backwards compatibility. Additionally, the yield *
form was considered less readable than the current proposal.
Criticisms
It has been suggested during the discussion phase that a mechanism other than return
be used in
subgenerators to establish the value returned to delegating generators by yield from
expressions.
The following counter-arguments are provided to this criticism:
- Forms other than
return
would undermine the proposal's stated goal of conceptualizing subgenerators as suspendable functions. Implementing a different syntax would inhibit this understanding by fostering the idea of generators as being something other than “real” functions.
return
expressions have applicable semantics, known characteristics and low cognitive overhead.
- Other popular dynamic languages inhabiting a similar space to PHP implement generator delegation value resolution using the
return
syntax proposed here. Sharing common vernacular for similar features lowers cognitive barriers for developers coming to PHP from diverse backgrounds.
Other Languages
Other popular dynamic languages currently support variants of the proposed syntax ...
Python
Python 3.3 generators support the yield from
syntax:
>>> def accumulate(): ... tally = 0 ... while 1: ... next = yield ... if next is None: ... return tally ... tally += next ... >>> def gather_tallies(tallies): ... while 1: ... tally = yield from accumulate() ... tallies.append(tally) ... >>> tallies = [] >>> acc = gather_tallies(tallies) >>> next(acc) # Ensure the accumulator is ready to accept values >>> for i in range(4): ... acc.send(i) ... >>> acc.send(None) # Finish the first tally >>> for i in range(5): ... acc.send(i) ... >>> acc.send(None) # Finish the second tally >>> tallies [6, 10]
JavaScript
Javascript ES6 generators support the yield*
syntax:
function* g4() { yield* [1, 2, 3]; return "foo"; } var result; function* g5() { result = yield* g4(); } var iterator = g5(); console.log(iterator.next()); // { value: 1, done: false } console.log(iterator.next()); // { value: 2, done: false } console.log(iterator.next()); // { value: 3, done: false } console.log(iterator.next()); // { value: undefined, done: true }, // g4() returned { value: "foo", done: true } at this point console.log(result); // "foo"
Backward Incompatible Changes
None
Proposed PHP Version(s)
PHP7
Unaffected PHP Functionality
Existing generator semantics are unaffected.
Vote
A 2/3 “Yes” vote is required to implement this proposal. Voting will continue through March 29, 2015.
.
The success of this vote depends on the success of the accompanying Generator Return Expressions RFC. Should Generator Return Expressions be rejected the voting outcome of this RFC will be rendered moot.
Patches and Tests
The current patch is considered “final” and can be found here:
https://github.com/bwoebi/php-src/commits/coroutineDelegation
The patch was written by Bob Weinand and is based upon the implementation branch written by Nikita Popov for the Generator Return Expressions RFC. Extensive .phpt tests exist in the implementation branch and readers are encouraged both to compile with the proposed implementation and read the test cases to ascertain the full nature of the proposal.
Implementation
TBD
References
Changelog
- v0.2.0 Moved to
yield from
instead ofyield *
+ massive textual additions - v0.1.0 Initial proposal