rfc:comprehensions

Differences

This shows you the differences between two versions of the page.

rfc:comprehensions [2019/03/10 21:22] – created crell
rfc:comprehensions [2019/03/10 21:36] – Fix wiki syntax issues crell
Line 1: Line 1:
 ====== PHP RFC: Generator comprehensions ====== ====== PHP RFC: Generator comprehensions ======
  
-* Version: 0.1 +  * Version: 0.1 
-* Date: 2019-03-10 +  * Date: 2019-03-10 
-* Author: Larry Garfield, larry@garfieldtech.com +  * Author: Larry Garfield, larry@garfieldtech.com 
-* Status: Draft +  * Status: Draft 
-* First published at: http://wiki.php.net/rfc/comprehensions+  * First published at: http://wiki.php.net/rfc/comprehensions
  
 ===== Introduction ===== ===== Introduction =====
Line 31: Line 31:
 </code> </code>
  
-In both cases, '%%'$gen%%'' is now a generator that will produce double the odd values of $list.  However, the first case uses 38 characters (with spaces) vs 94 characters (with spaces), and is easily compacted onto a single line as opposed to 7.+In both cases, ''%%'$gen%%'' is now a generator that will produce double the odd values of $list.  However, the first case uses 38 characters (with spaces) vs 94 characters (with spaces), and is easily compacted onto a single line as opposed to 7.
  
 ===== Proposal ===== ===== Proposal =====
Line 39: Line 39:
 The general form of a comprehension is: The general form of a comprehension is:
  
 +<code>
 '[' ('for' <iterable expression> 'as' $key '=>' $value ('if' <condition>)?)+ (yield <expression>)? ']' '[' ('for' <iterable expression> 'as' $key '=>' $value ('if' <condition>)?)+ (yield <expression>)? ']'
 +</code>
  
 That is, one or more for-if clauses in which the if statement is optional, optionally followed by a ''%%yield%%'' keyword and a single expression.  The entire expression is wrapped in square brackets. That is, one or more for-if clauses in which the if statement is optional, optionally followed by a ''%%yield%%'' keyword and a single expression.  The entire expression is wrapped in square brackets.
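
As a rough illustration of the general form, the sketch below pairs a comprehension (shown only in a comment, since the proposed syntax is not valid in any released PHP version) with a hand-written generator that should behave the same way. The sample data and the variable name ''%%$doubledOdds%%'' are assumptions made for this example.

<code php>
<?php
// Proposed syntax, shown for comparison only:
//   $gen = [for $list as $x if $x % 2 yield $x * 2];

$list = [1, 2, 3, 4, 5];

// Hand-written equivalent: an immediately invoked generator closure.
$gen = (function () use ($list) {
    foreach ($list as $x) {
        // The "if" clause filters; the "yield" clause maps.
        if ($x % 2) {
            yield $x * 2;
        }
    }
})();

// Generators are lazy, so nothing runs until the values are consumed.
$doubledOdds = iterator_to_array($gen, false);
// $doubledOdds is [2, 6, 10]
</code>

Passing ''%%false%%'' as the second argument to ''%%iterator_to_array()%%'' discards the generator's keys, which keeps the result a plain list.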
Line 75: Line 77:
 A comprehension is whitespace insensitive. It may be broken out to multiple lines if it aids readability with no semantic impact. A comprehension is whitespace insensitive. It may be broken out to multiple lines if it aids readability with no semantic impact.
  
-The following examples show a comprehension and the equivalent inline generator.  In each case the semantic behavior of '"%%$result%%'' is identical for both versions, but the comprehension syntax is shorter and easier to comprehend (pun intended).+The following examples show a comprehension and the equivalent inline generator.  In each case the semantic behavior of ''%%$result%%'' is identical for both versions, but the comprehension syntax is shorter and easier to comprehend (pun intended).
  
 <code php> <code php>
Line 123: Line 125:
 ]; ];
  
-// Whitespace is irrelevant, so breaking it out like this is totally fine if it aids readability.+// Whitespace is irrelevant, so breaking it  
 +// out like this is totally fine if it aids readability.
 $result = [for $table as $num => $row if $num %2 ==0  $result = [for $table as $num => $row if $num %2 ==0 
     for $row as $col => $value if $col >= 3     for $row as $col => $value if $col >= 3
Line 157: Line 160:
  
   - In context the for is unambiguously being used in a foreach-style way, thus there is no confusion.   - In context the for is unambiguously being used in a foreach-style way, thus there is no confusion.
-  - The '"%%for%%'' keyword is used by both Python and Javascript, the languages with the most similar existing syntax.  (See below.)+  - The ''%%for%%'' keyword is used by both Python and Javascript, the languages with the most similar existing syntax.  (See below.)
   - The point of comprehensions is a compact yet expressive syntax.  Given the above two points, using ''%%foreach%%'' would add nothing except four additional characters.   - The point of comprehensions is a compact yet expressive syntax.  Given the above two points, using ''%%foreach%%'' would add nothing except four additional characters.
  
Line 199: Line 202:
 The common default "is truth-y" use of ''%%array_filter()%%'' with no callback specified would be easily expressed as: The common default "is truth-y" use of ''%%array_filter()%%'' with no callback specified would be easily expressed as:
  
 +<code php>
 $result = [for $list as $x if $x]; $result = [for $list as $x if $x];
 +</code>
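
For comparison, here is a minimal sketch of the same truth-y filter in current PHP, once with ''%%array_filter()%%'' and once with a hand-written generator. It assumes that a comprehension without a ''%%yield%%'' clause yields each value under its original key; the sample data is invented for the example.

<code php>
<?php
$list = [0, 1, '', 'a', null, 2, false];

// Today: array_filter() with no callback keeps truth-y values
// and preserves their original keys.
$filtered = array_filter($list);

// Hand-written generator corresponding to [for $list as $x if $x].
// Assumption: keys are passed through unchanged.
$gen = (function () use ($list) {
    foreach ($list as $key => $x) {
        if ($x) {
            yield $key => $x;
        }
    }
})();

$fromGenerator = iterator_to_array($gen);
// Both results are [1 => 1, 3 => 'a', 5 => 2]
</code>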
  
 ==== array_map() ==== ==== array_map() ====
Line 225: Line 230:
 $list = array_combine(range('a', 'j'), range(1, 10)); $list = array_combine(range('a', 'j'), range(1, 10));
  
-// array_map() itself cannot produce an array with dynamically defined keys so is omitted.+// array_map() itself cannot produce an array  
 +// with dynamically defined keys, so it is omitted.
  
 $result = (function() use ($list) { $result = (function() use ($list) {
Line 241: Line 247:
 $list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; $list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
  
-// In practice you'd almost always just use a foreach() rather than this monstrosity, but I include it for completeness.+// In practice you'd almost always just use a  
 +// foreach() rather than this monstrosity,  
 +// but I include it for completeness.
 $result = array_filter(array_map(function ($x) { $result = array_filter(array_map(function ($x) {
   return $x * 2;   return $x * 2;
Line 276: Line 284:
 </code> </code>
  
-Because a generator implements Iterator, we can call '"%%current()%%'' on it to return the first/current item that would be produced.  The generator itself can be discarded with no further computation expense.+Because a generator implements Iterator, we can call ''%%current()%%'' on it to return the first/current item that would be produced.  The generator itself can be discarded with no further computation expense.
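
The laziness described above can be sketched with a hand-written generator standing in for the comprehension. The counter ''%%$checked%%'' and the sample data are assumptions added only to make the amount of work visible.

<code php>
<?php
$list = [3, 8, 5, 12, 7];
$checked = 0;

// Stand-in for a comprehension that keeps values greater than 6.
// $checked counts how many items the generator has examined.
$gen = (function () use ($list, &$checked) {
    foreach ($list as $x) {
        $checked++;
        if ($x > 6) {
            yield $x;
        }
    }
})();

// current() runs the generator only as far as its first yield.
$first = $gen->current();
// $first is 8 and $checked is 2: the remaining items were never examined.
</code>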
  
 ==== any() ==== ==== any() ====
Line 384: Line 392:
 $gen = [for $array as $x : int]; $gen = [for $array as $x : int];
 foreach ($gen as $val) { foreach ($gen as $val) {
-  // A TypeError would be thrown on the 3rd value, as it's not an int.+  // A TypeError would be thrown on the 3rd value,  
 +  // as it's not an int.
 } }
 </code> </code>
Line 395: Line 404:
 $run = [for $products as $p yield save($p)]; $run = [for $products as $p yield save($p)];
  
-// iterator_to_array() will result in an array of return values fro save_entity(). Depending on the data set this could be quite large, and must be allocated even if not saved.+// iterator_to_array() will result in an array of return  
 +// values from save_entity(). Depending on the data  
 +// set this could be quite large, and must be allocated  
 +// even if not saved.
 iterator_to_array($run); iterator_to_array($run);
  
-// An empty foreach() will simply discard the return values, but is rather clumsy.+// An empty foreach() will simply discard the return values,  
 +// but is rather clumsy.
 foreach ($run as $val); foreach ($run as $val);
 </code> </code>
  
 It would be preferable to introduce a new function or language construct that can take an arbitrary generator and "run it out", discarding the results.  Such an operator would be a "nice to have" but is not a requirement of this RFC. It would be preferable to introduce a new function or language construct that can take an arbitrary generator and "run it out", discarding the results.  Such an operator would be a "nice to have" but is not a requirement of this RFC.
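
Until such a construct exists, a userland helper can approximate it. The following is only a sketch: the name ''%%run_out()%%'' is hypothetical and is not proposed by this RFC, and the ''%%$save%%'' closure is a stand-in for a real persistence call.

<code php>
<?php
// Hypothetical helper: consumes any iterable and discards every value.
function run_out(iterable $items): void
{
    foreach ($items as $_) {
        // Intentionally empty: each value is produced, then dropped.
    }
}

$saved = [];

// Stand-in for save($p); it records what it was asked to save.
$save = function (string $p) use (&$saved): bool {
    $saved[] = $p;
    return true;
};

$products = ['apple', 'pear', 'plum'];

// Hand-written equivalent of [for $products as $p yield save($p)].
$run = (function () use ($products, $save) {
    foreach ($products as $p) {
        yield $save($p);
    }
})();

// Runs the side effects without building an array of return values.
run_out($run);
// $saved is ['apple', 'pear', 'plum']
</code>

This is the empty ''%%foreach()%%'' from above wrapped in a function, so it avoids the allocation of ''%%iterator_to_array()%%'' but offers nothing a dedicated language construct could not do more cleanly.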
 +
 +===== Implementation =====
 +
 +Sara Golemon has written a proof of concept that demonstrates an approximate implementation:
 +
 +https://github.com/php/php-src/compare/master...sgolemon:list.comp
 +
 +It is currently incomplete as it lacks auto-capture and requires an explicit ''%%use%%'' statement.  Collaborators wishing to finish the implementation and/or assist with a terser syntax are most welcome.
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
Line 411: Line 432:
  
 PHP 7.4 PHP 7.4
- 
  
  
rfc/comprehensions.txt · Last modified: 2019/04/05 01:10 by crell