rfc:generators

Differences

This shows you the differences between two versions of the page.

rfc:generators [2012/07/27 17:54] nikic
rfc:generators [2017/09/22 13:28] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== Request for Comments: Generators ====== ====== Request for Comments: Generators ======
   * Date: 2012-06-05   * Date: 2012-06-05
-  * Author: Niktia Popov <nikic@php.net> +  * Author: Nikita Popov <nikic@php.net> 
-  * Status: In Draft+  * Status: Implemented
  
 ===== Introduction ===== ===== Introduction =====
Line 189: Line 189:
          
     mixed send(mixed $value);     mixed send(mixed $value);
-    void close();+    mixed throw(Exception $exception);
 } }
 </code> </code>
Line 223: Line 223:
 Apart from the above the ''Generator'' methods behave as follows: Apart from the above the ''Generator'' methods behave as follows:
  
-  * ''rewind'': Generators are not rewindable, so this is just a no-op. (More in the "Rewinding a generator" section.)+  * ''rewind'': Throws an exception if the generator is currently after the first yield. (More in the "Rewinding a generator" section.)
   * ''valid'': Returns ''false'' if the generator has been closed, ''true'' otherwise. (More in the "Closing a generator" section.)   * ''valid'': Returns ''false'' if the generator has been closed, ''true'' otherwise. (More in the "Closing a generator" section.)
   * ''current'': Returns whatever was passed to ''yield'' or ''null'' if nothing was passed or the generator is already closed.   * ''current'': Returns whatever was passed to ''yield'' or ''null'' if nothing was passed or the generator is already closed.
Line 229: Line 229:
   * ''next'': Resumes the generator (unless the generator is already closed).   * ''next'': Resumes the generator (unless the generator is already closed).
   * ''send'': Sets the return value of the ''yield'' expression and resumes the generator (unless the generator is already closed). (More in the "Sending values" section.)   * ''send'': Sets the return value of the ''yield'' expression and resumes the generator (unless the generator is already closed). (More in the "Sending values" section.)
-  * ''close'': Closes the generator. (More in the "Closing a generator" section.)+  * ''throw'': Throws an exception at the current suspension point in the generator. (More in the "Throwing into the generator" section.)
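
As a rough illustration of the method contract listed above, the following sketch drives a generator by hand instead of through ''foreach''. The ''letters()'' function is a made-up example, not part of the RFC, and the snippet assumes PHP 5.5 or later.

<code php>
// Hypothetical generator, used only for illustration.
function letters() {
    yield 'a';
    yield 'b';
}

$gen = letters();
$seen = [];

// Iterate manually: valid() stays true until the function body has finished.
while ($gen->valid()) {
    $seen[$gen->key()] = $gen->current();
    $gen->next(); // resume until the next yield (or the end of the function)
}

var_dump($seen);           // keys 0 and 1 mapping to 'a' and 'b'
var_dump($gen->valid());   // bool(false): the generator is closed
var_dump($gen->current()); // NULL: closed generators have no current value
</code>

This is essentially what a ''foreach'' loop over the generator does internally, minus the initial ''rewind()'' call.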
  
-==== Yielding keys ====+==== Yield syntax ====
  
-The languages that currently implement generators don't have support for yielding keys (only values). This though is just a sideeffect +The newly introduced ''yield'' keyword (''T_YIELD'') is used both for sending and receiving values inside the generator. There are three basic forms of the ''yield'' expression:
-as these languages don't support keys in iterators in general+
  
-In PHP on the other hand keys are explicitly part of the iteration process and it thus does not make sense to not add +  * ''%%yield $key => $value%%'': Yields the value ''$value'' with key ''$key''. 
-key-yielding support. The syntax could be analogous to that of ''foreach'' loops and ''array'' declarations: +  * ''yield $value'': Yields the value ''$value'' with an auto-incrementing integer key.
 +  * ''yield'': Yields the value ''null'' with an auto-incrementing integer key. 
 + 
 +The return value of the ''yield'' expression is whatever was sent to the generator using ''send()''. If nothing was sent (e.g. during ''foreach'' iteration) ''null'' is returned. 
 + 
 +To avoid ambiguities the first two ''yield'' expression types have to be surrounded by parentheses when used in expression-context. Some examples of when parentheses are necessary and when they aren't:
  
 <code php> <code php>
 +// these three are statements, so they don't need parentheses
 yield $key => $value; yield $key => $value;
 +yield $value;
 +yield;
 +
 +// these are expressions, so they require parentheses
 +$data = (yield $key => $value);
 +$data = (yield $value);
 +
 +// to avoid strange (yield) syntax the parentheses are not required here
 +$data = yield;
 </code> </code>
-     + 
-The problem with this syntax that it would be ambiguous in array declarations and nested ''yield'' expressions:+If ''yield'' is used inside a language construct that already has native parentheses, then they don't have to be duplicated:
  
 <code php> <code php>
-array( +call(yield $value); 
-    yield $key => $value +// instead of 
-) +call((yield $value));
-// could be +
-array( +
-    (yield $key) => $value +
-) +
-// or +
-array( +
-    (yield $key => $value) +
-)+
  
-yield yield $a => $b; +if (yield $value) { ... } 
-// could be +// instead of 
-yield (yield $a) => $b +if ((yield $value)) { ... } 
 +</code> 
 + 
 +The only exception is the ''array()'' structure. Not requiring parentheses would be ambiguous here: 
 + 
 +<code php> 
 +array(yield $key => $value) 
 +// can be either 
 +array((yield $key) => $value)
 // or // or
-yield (yield $a => $b) +array((yield $key => $value))
 </code> </code>
  
-Even though this is an absolute edge-case, the grammar still shouldn't be ambiguous in this case. It's not clear how this can +Python also has parentheses requirements for expression-use of ''yield''. The only difference is that Python also requires parentheses for a value-less ''yield'' (because the language does not use semicolons). 
-be solved. In the current implementation the ''yield $key => $value'' syntax is simply implemented as a statement instead of + 
-an expression. This obviously is a rather bad solution to the problem and it would be preferable to find some other way to deal +See also the [[#alternative_yield_syntax_considerations|"Alternative yield syntax considerations" section]]. 
-with it.+ 
 +==== Yielding keys ==== 
 + 
 +The languages that currently implement generators don't have support for yielding keys (only values). This though is just a side-effect 
 +as these languages don't support keys in iterators in general.  
 + 
 +In PHP on the other hand keys are explicitly part of the iteration process and it thus does not make sense to not add 
 +key-yielding support. The syntax could be analogous to that of ''foreach'' loops and ''array'' declarations: 
 + 
 +<code php> 
 +yield $key => $value; 
 +</code>
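
To make the interaction between explicit and automatic keys concrete, here is a small sketch. The ''pairs()'' function is invented for this example, and the snippet assumes PHP 5.5 or later.

<code php>
// Hypothetical generator mixing explicit keys with auto-generated ones.
function pairs() {
    yield 'first' => 1; // explicit string key
    yield 2;            // no key given: auto-incrementing integer key
    yield 'third' => 3; // explicit string key
    yield 4;            // next auto-incrementing integer key
}

foreach (pairs() as $key => $value) {
    echo "$key => $value\n";
}
// Output:
// first => 1
// 0 => 2
// third => 3
// 1 => 4
</code>

The automatic keys count up independently of any string keys yielded in between, much like appending to an array with ''[]''.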
  
 Furthermore generators need to generate keys even if no key was explicitly yielded. In this case it seems reasonable to behave Furthermore generators need to generate keys even if no key was explicitly yielded. In this case it seems reasonable to behave
Line 338: Line 363:
  
 Only generators specifying the ''&'' modifier can be iterated by ref. If you try to iterate a non-ref generator by-ref an ''E_ERROR'' is thrown. Only generators specifying the ''&'' modifier can be iterated by ref. If you try to iterate a non-ref generator by-ref an ''E_ERROR'' is thrown.
 +
 +==== Sending values ====
 +
 +Values can be sent into a generator using the ''send()'' method. ''send($value)'' will set ''$value'' as the return value
 +of the current ''yield'' expression and resume the generator. When the generator hits another ''yield'' expression the yielded value will be
 +the return value of ''send()''. This is just a convenience feature to save an additional call to ''current()''.
 +
 +Values are always sent by-value. The reference modifier ''&'' only affects yielded values, not the ones sent back to the coroutine.
 +
 +A simple example of sending values: Two (interchangeable) logging implementations:
 +
 +<code php>
 +function echoLogger() {
 +    while (true) {
 +        echo 'Log: ' . yield . "\n";
 +    }
 +}
 +
 +function fileLogger($fileName) {
 +    $fileHandle = fopen($fileName, 'a');
 +    while (true) {
 +        fwrite($fileHandle, yield . "\n");
 +    }
 +}
 +
 +$logger = echoLogger();
 +// or
 +$logger = fileLogger(__DIR__ . '/log');
 +
 +$logger->send('Foo');
 +$logger->send('Bar');
 +</code>
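
The loggers above only receive values. The following sketch, which is not part of the RFC, shows the other half of the contract: the return value of ''send()'' is whatever the generator yields next. The ''accumulator()'' function is a made-up name, and PHP 5.5 or later is assumed.

<code php>
// Hypothetical generator that keeps a running total of everything sent to it.
function accumulator() {
    $total = 0;
    while (true) {
        // The yield expression hands $total to the caller and evaluates
        // to whatever the caller passes to send().
        $total += (yield $total);
    }
}

$acc = accumulator();
var_dump($acc->current()); // int(0): runs up to the first yield
var_dump($acc->send(5));   // int(5): resumes, adds 5, yields the new total
var_dump($acc->send(10));  // int(15)
</code>

Note the parentheses around ''yield $total'': they are required because the ''yield'' carries a value and is used in expression context.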
 +
 +==== Throwing into the generator ====
 +
 +Exceptions can be thrown into the generator using the ''Generator::throw()'' method. This will throw an exception in the generator's execution
 +context and then resume the generator. It is roughly equivalent to replacing the current ''yield'' expression with a ''throw'' statement and
 +then resuming. If the generator is already closed the exception will be thrown in the caller's context instead (which is equivalent to replacing
 +the ''throw()'' call with a ''throw'' statement). The ''throw()'' method will return the next yielded value (if the exception is caught and no
 +other exception is thrown).
 +
 +An example of the functionality:
 +
 +<code php>
 +function gen() {
 +    echo "Foo\n";
 +    try {
 +        yield;
 +    } catch (Exception $e) {
 +        echo "Exception: {$e->getMessage()}\n";
 +    }
 +    echo "Bar\n";
 +}
 +
 +$gen = gen();
 +$gen->rewind();                     // echoes "Foo"
 +$gen->throw(new Exception('Test')); // echoes "Exception: Test"
 +                                    // and "Bar"
 +</code>
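
Two further details from the description can be sketched as follows: ''throw()'' returns the next yielded value when the generator catches the exception, and throwing into an already closed generator raises the exception in the caller's context. The functions ''resilient()'' and ''once()'' are invented for this illustration; PHP 5.5 or later is assumed.

<code php>
// Hypothetical generator that recovers from exceptions thrown into it.
function resilient() {
    while (true) {
        try {
            yield 'ready';
        } catch (Exception $e) {
            yield 'recovered from ' . $e->getMessage();
        }
    }
}

$gen = resilient();
$gen->current();                              // advance to the first yield
$result = $gen->throw(new Exception('oops')); // caught inside the generator
var_dump($result);                            // string(19) "recovered from oops"

// Hypothetical generator that finishes after a single value.
function once() {
    yield 1;
}

$closed = once();
foreach ($closed as $value) {
    // exhaust the generator so that it is closed afterwards
}

$late = null;
try {
    $closed->throw(new Exception('late'));
} catch (Exception $e) {
    $late = $e->getMessage(); // thrown in the caller's context
}
var_dump($late); // string(4) "late"
</code>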
  
 ==== Rewinding a generator ==== ==== Rewinding a generator ====
Line 357: Line 441:
 Here rewinding would simply result in an empty iterator as the result set is already depleted. Here rewinding would simply result in an empty iterator as the result set is already depleted.
  
-One solution thus could be to allow explicitly marking generators to be rewindable. E.g. one could add a ''rewindable'' function, +For the above reasons generators will not support rewinding. The ''rewind'' method will throw an exception, unless the generator is currently before or at the first yield. This results in the following behavior:
-which makes a generator rewindable:+
  
 <code php> <code php>
-$gen = rewindable(gen()); +$gen = createSomeGenerator();
-</code> +
-     +
-This function is actually already implementable in userland code (see "Cloning a generator" section.)+
  
-==== Cloning a generator ====+// the rewind() call foreach is doing here is okay, because 
 +// the generator is before the first yield 
 +foreach ($gen as $val) { ... }
  
-Generators can be cloned, thus leaving two independent ''Generator'' objects with the same state. This behavior can for example be +// the rewind() call of a second foreach loop on the other hand 
-used to create the aforementioned ''rewindable'' function: +// throws an exception 
- +foreach ($gen as $val) { ... }
-<code php> +
-class RewindableGenerator implements Iterator { +
-    protected $original; +
-    protected $current; +
-     +
-    public function __construct(Generator $generator) { +
-        $this->original = $generator; +
-        $this->current = null; +
-    } +
-     +
-    public function rewind() { +
-        if ($this->current) { $this->current->close(); } +
-        $this->current = clone $this->original; +
-        $this->current->rewind(); +
-    } +
-     +
-    public function valid() { +
-        if (!$this->current) { $this->current = clone $this->original; } +
-        return $this->current->valid(); +
-    } +
-     +
-    public function current() { +
-        if (!$this->current) { $this->current = clone $this->original; } +
-        return $this->current->current(); +
-    } +
-     +
-    public function key() { +
-        if (!$this->current) { $this->current = clone $this->original; } +
-        return $this->current->key(); +
-    } +
-     +
-    public function next() { +
-        if (!$this->current) { $this->current = clone $this->original; } +
-        $this->current->next(); +
-    } +
-     +
-    public function send($value) { +
-        if (!$this->current) { $this->current = clone $this->original; } +
-        return $this->current->send($value); +
-    } +
-     +
-    public function close() { +
-        $this->original->close(); +
-        if ($this->current) { +
-            $this->current->close(); +
-        } +
-    } +
-} +
- +
-function rewindable(Generator $generator) { +
-    return new RewindableGenerator($generator); +
-}+
 </code> </code>
  
-It can be then used as follows:+So basically calling ''rewind'' is only allowed if it wouldn't do anything (because the generator is already at its initial state). After that an exception is thrown, so accidentally reused generators are easy to find.
  
-<code php> +==== Cloning a generator ====
-function xrange($start, $end, $step = 1) { +
-    for ($i = $start; $i <= $end; $i += $step) { +
-        yield $i; +
-    } +
-}+
  
-$range = rewindable(xrange(0, 5)); +Generators cannot be cloned.
-foreach ($range as $i) { +
-    echo $i, "\n"; +
-} +
-foreach ($range as $i) { +
-    echo $i, "\n"; +
-} +
-</code>+
  
-This will correctly output the 0..5 range twice.+Support for cloning was included in the initial version, but removed in PHP 5.5 Beta 3 due to implementational difficulties, unclear semantics and no particularly convincing use cases.
  
 ==== Closing a generator ==== ==== Closing a generator ====
Line 450: Line 468:
 ''valid'' will return ''false'' and both ''current'' and ''key'' will return ''null''. ''valid'' will return ''false'' and both ''current'' and ''key'' will return ''null''.
  
-A generator can be closed in three ways:+A generator can be closed in two ways:
  
-  * Explicitly calling the ''Generator::close'' method. This can be useful if you want to free used memory after you don't need the generator anymore.+  * Reaching a ''return'' statement (or the end of the function) in a generator or throwing an exception from it (without catching it inside the generator).
   * Removing all references to the generator object. In this case the generator will be closed as part of the garbage collection process.   * Removing all references to the generator object. In this case the generator will be closed as part of the garbage collection process.
-  * Reaching a ''return'' statement (or the end of the function) in a generator or throwing an exception from it (without catching it inside the generator).+ 
 +If the generator contains (relevant) ''finally'' blocks those will be run. If the generator is force-closed (i.e. by removing all references) then it is not 
 +allowed to use ''yield'' in the ''finally'' clause (a fatal error will be thrown). In all other cases ''yield'' is allowed in ''finally'' blocks.
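
As a sketch of the first sentence, the following example force-closes a suspended generator by dropping its last reference and checks that the ''finally'' block still runs. The function name ''withCleanup()'' and the ''$log'' array are assumptions made for this illustration; PHP 5.5 or later is assumed.

<code php>
// Hypothetical generator that records its lifecycle in a caller-owned array.
function withCleanup(array &$log) {
    try {
        $log[] = 'start';
        yield 1;
        yield 2;
    } finally {
        // Runs when the generator finishes normally and also when it is
        // destroyed while suspended inside the try block.
        $log[] = 'cleanup';
    }
}

$log = [];
$gen = withCleanup($log);
$gen->current(); // run up to the first yield
unset($gen);     // remove the only reference: the generator is force-closed

var_dump($log);  // contains 'start' followed by 'cleanup'
</code>

The ''finally'' block here deliberately contains no ''yield'', since yielding during a forced close is the case the text above describes as a fatal error.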
  
 The following resources are destructed while closing a generator: The following resources are destructed while closing a generator:
Line 470: Line 490:
 this problem: https://bugs.php.net/bug.php?id=62210. If that bug could be fixed for exceptions, then it would also be fixed for generators. this problem: https://bugs.php.net/bug.php?id=62210. If that bug could be fixed for exceptions, then it would also be fixed for generators.
  
-A generator cannot be closed while it is running (e.g. if ''$generator->closed()'' is called within the generator function). In this case and ''E_WARNING'' +==== Error conditions ====
-is thrown and the call is ignored.+
  
-In Python generators are closed by throwing a ''GeneratorExit'' exception into them. This is *not done* in the PHP implementation.+This is a list of generators-related error conditions:
  
-The exception gives the generator function a chance to clean up any resources it uses:+  * Using ''yield'' outside a function: ''E_COMPILE_ERROR'' 
 +  * Using ''return'' with value inside a generator: ''E_COMPILE_ERROR'' 
 +  * Manual construction of ''Generator'' class: ''E_RECOVERABLE_ERROR'' (analogous to ''Closure'' behavior) 
 +  * Yielding a key that isn't an integer or a string: ''E_ERROR'' (this is just a placeholder until Etienne's arbitrary-keys patch lands) 
 +  * Trying to iterate a non-ref generator by-ref: ''Exception'' 
 +  * Trying to traverse an already closed generator: ''Exception'' 
 +  * Trying to rewind a generator after the first yield: ''Exception'' 
 +  * Yielding a temp/const value by-ref: ''E_NOTICE'' (analogous to ''return'' behavior) 
 +  * Yielding a string offset by-ref: ''E_ERROR'' (analogous to ''return'' behavior) 
 +  * Yielding a by-val function return value by-ref: ''E_NOTICE'' (analogous to ''return'' behavior)
  
-<code php> +This list might not be exhaustive.
-function gen() { +
-    $lock = getLock(); +
-    try { +
-        // do some stuff +
-    } catch (GeneratorExitException $e) { +
-        releaseLock($lock); +
-        throw $e; +
-    } +
-    releaseLock($lock); +
-} +
-</code> +
- +
-As you can see the exception allows the generator to clean up. Also note that the exception has to be rethrown. This way it is ensured that +
-the generator does not simply resume execution, but really does terminate. +
- +
-As PHP does not support ''finally'' blocks which have strict must-run semantics and there are other simple ways to enforce resource cleanup +
-(e.g. destructors) this behavior is not implemented in PHP. +
- +
-===== Patch ===== +
- +
-A working, but not yet complete implementation can be found at https://github.com/nikic/php-src/tree/addGeneratorsSupport.+
  
 ===== Performance ===== ===== Performance =====
Line 516: Line 523:
 The tests were run on a Ubuntu VM, so I'm not exactly sure how representative they are. The tests were run on a Ubuntu VM, so I'm not exactly sure how representative they are.
  
-===== Why not just use callback functions? =====+===== Some points from the discussion ===== 
 + 
 +==== Why not just use callback functions? ====
  
 A question that has come up a few times during discussion: Why not use callback functions, instead of generators? For example the above ''getLinesFromFile'' function could A question that has come up a few times during discussion: Why not use callback functions, instead of generators? For example the above ''getLinesFromFile'' function could
Line 588: Line 597:
 generators solve this problem elegantly, because they maintain state implicitly, in the execution state. generators solve this problem elegantly, because they maintain state implicitly, in the execution state.
  
-===== TODO =====+==== Alternative yield syntax considerations ==== 
 + 
 +Andrew proposed to use a function-like syntax for ''yield'' instead of the keyword notation. The three ''yield'' variants would then look as follows: 
 + 
 +  * ''yield()'' 
 +  * ''yield($value)'' 
 +  * ''%%yield($key => $value)%%'' 
 + 
 +The main advantage of this syntax is that it would avoid the strange parentheses requirements for the ''yield $value'' syntax. 
 + 
 +One of the main issues with the pseudo-function syntax is that it makes the semantics of ''yield'' less clear. Currently the ''yield'' syntax looks very similar to the ''return'' 
 +syntax. Both are very similar in a function, so it is desirable to keep them similar in syntax too. 
 + 
 +Generally PHP uses the ''keyword $expr'' syntax instead of the ''keyword($expr)'' syntax in all places where the statement-use is more common than the expression-use. E.g. 
 +''include $file;'' is usually used as a statement and only very rarely as an expression. ''isset($var)'' on the other hand is normally used as an expression (a statement use 
 +wouldn't make any sense, actually). 
 + 
 +As ''yield'' will be used as a statement in the vast majority of cases the ''yield $expr'' syntax thus seems more appropriate. Furthermore the most common expression-use of 
 +''yield'' is value-less, in which case the parentheses requirements don't apply (i.e. you can write just ''$data = yield;''). 
 + 
 +So the function-like ''yield($value)'' syntax would optimize a very rare use case (namely ''$recv = yield($send);''), at the same time making the common use cases less clear. 
 + 
 +===== Patch ===== 
 + 
 +The current implementation can be found in this branch: https://github.com/nikic/php-src/tree/addGeneratorsSupport. 
 + 
 +I also created a PR so that the diff can be viewed more easily: https://github.com/php/php-src/pull/177 
 + 
 +===== Vote =====
  
-  * Decide on whether to implement the ''Generator::throw'' method +<doodle title="Should generators be merged into master?" auth="nikic" voteType="single" closed="true"> 
-  * Implement ''yield*'' expression and generator return values +   * Yes 
 +   * No 
 +</doodle>
  
 ===== Further resources ===== ===== Further resources =====
rfc/generators.1343411655.txt.gz · Last modified: 2017/09/22 13:28 (external edit)