This shows you the differences between two versions of the page.

rfc:auto-capture-closure [2022/05/27 09:12]
lbarnaud No arrow function changes
rfc:auto-capture-closure [2022/07/02 13:12] (current)
imsop update "Vote" heading
Line 1: Line 1:
-====== PHP RFC: Auto-capturing multi-statement closures ======+====== PHP RFC: Short Closures 2.0 ======
   * Version: 2.0   * Version: 2.0
   * Date: 2022-05-25   * Date: 2022-05-25
Line 5: Line 5:
   * Author: Larry Garfield (larry@garfieldtech.com)   * Author: Larry Garfield (larry@garfieldtech.com)
   * Author: Arnaud Le Blanc (arnaud.lb@gmail.com)   * Author: Arnaud Le Blanc (arnaud.lb@gmail.com)
-  * Status: In Discussion+  * Status: In Voting
   * First Published at: http://wiki.php.net/rfc/auto-capture-closure   * First Published at: http://wiki.php.net/rfc/auto-capture-closure
 ===== Introduction ===== ===== Introduction =====
-Closures (also known as lambdas or anonymous functions) have become increasingly powerful and useful in PHP in recent versions.  In their current form they have two versions, long and short.  Unfortunately, these two syntaxes have different, mutually-incompatible benefits.  This RFC proposes a syntax for closures that combines the benefits of both for those situations where that is warranted.+Anonymous functions in PHP can be verbose, in part due to the need to manually import used variables. This makes code using simple closures hard to read and understand.
-<code php> +[[rfc:arrow_functions_v2|Arrow Functions]] were introduced in PHP 7.4 as an alternative. However, the single-expression limitation can lead to complex one-liners, or makes Arrow Functions unfit in many use-cases that would benefit from a more concise syntax.
-// As of 8.1:+
-$y = 1;+This RFC proposes an extension of the Arrow Function syntax supporting multiple statements:
-$fn1 = fn($x) => $x + $y; // auto-capture + single expression+<code php> 
 +$guests = array_filter($users, fn ($user) {
 +    $guest = $repository->findByUserId($user->id);
 +    return $guest !== null && in_array($guest->id, $guestsIds);
 +});
 +</code>
-$fn2 = function ($x) use ($y): int { // manual-capture + statement list +===== Proposal =====
-   // ...+
-   return $x + $y;+Short Closures extend Arrow Functions by allowing multiple statements enclosed in ''{'' and ''}'' instead of a single expression: 
 +<code php> 
 +fn (parameter_list) { 
 +    statement_list;
 +}
 </code> </code>
-The proposed syntax combines the auto-capture and multi-line capabilities into a single syntax:+The ''statement_list'' is a sequence of statements separated by semicolons. A ''return'' statement must be used to return a value. 
 +The syntax and behavior otherwise match those of Arrow Functions. 
 +==== Auto capture by-value ==== 
 +Like Arrow Functions, Short Closures use auto capture by-value. When a variable used in the Short Closure is defined in the parent scope, it is automatically captured by-value. In the following example, the functions $fn1, $fn2, and $fn3 behave the same:
 <code php> <code php>
-$fn3 = fn ($x): int { // auto-capture + statement list +$y = 1; 
-    // ...+ 
 +$fn1 = fn ($x) => $x + $y;
 +$fn2 = fn ($x) {
 +    return $x + $y;
 +};
 +$fn3 = function ($x) use ($y) {
     return $x + $y;     return $x + $y;
 }; };
 </code> </code>
-===== Proposal =====+==== No explicit capture ====
-==== Background ====+Explicit capture is not included in the new syntax.  It remains available only via the existing long-closure syntax, which only captures explicitly.  Earlier versions of this proposal included mixing auto-capture and explicit capture, but it was determined that was too confusing.
-As of PHP 8.1, the following syntaxes around functions have the following meaning:+==== Syntax ==== 
 +The signature accepts the same syntax as that of Arrow Functions:
 <code php> <code php>
 +fn () { }
 +fn ($a, $b) { }
 +fn ($a, ...$args) { }   // Variadic parameter
 +fn (int $a): string { } // Type hints
 +fn ($a = 42) { }        // Parameter default value
 +fn &($a) { }            // Return by-reference
 +fn (&$a) { }            // Pass by-reference
 +</code>
-// A named, globally available function. +The signature must be followed by ''{'', a statement list, and ''}'':
-// No variables are auto-captured from the environment. +
-// The body is a statement list, with possibly a return statement. +
-function foo($a, $b, $c): int { +
-  return $a * $b * $c; +
-// An anonymous, locally available function. +<code php> 
-// Variables are explicitly captured lexically.  +fn () { return 1; } 
-// The body is a statement list, with possibly a return statement. +fn () { print 1; } 
-$foo = function ($a, $b) use ($c) { +fn () { 
-  return $a * $b * $c; +    $tmp = $a + $b;
-}; +    return $tmp; 
- +}
-// An anonymous, locally available function. +
-// Variables are auto-captured lexically. +
-// The body is a single-expression, whose value is returned. +
-$foo = fn($a, $b): int => $a * $b * $c;+
 </code> </code>
-That is, a function may be named or local/anonymous, auto-capture or not, and a statement list or single expression.  That means there are 8 possible combinations of properties, of which only three are currently supported.+Note that Short Closures with a multi-statement body do not have an implicit return value. A ''return'' statement must be used to return a value.
-The declined [[rfc:short-functions|Short Functions]] RFC sought to add one additional combination: named, no-capture, single-expression.+The syntax choice here is consistent with other language constructs:
-This RFC seeks to add a different combination: anonymous, auto-capture, statement list.+  * ''{ ... }'' denotes a statement list, without implicit return value. 
 +  * Conversely, the ''=>'' token is followed by an expression in all circumstances.  (Arrow Functions, arrays, and ''match()''.) 
 +  * The ''fn'' keyword indicates a function that will auto-capture variables, by-value. 
 +  * The ''function'' keyword indicates a function that has no auto-capture.
-The remaining potential combinations would be:+These rules are easily recognizable and learnable by developers.
-  * named function, auto-capture, statement list - This is of little use in practice as there is nothing to auto-capture, except potentially global variables. +===== Why extend Arrow Functions? =====
-  * named function, auto-capture, expression - Ibid. +
-  * anonymous function, manual-capture, expression - While this form would be possible to add, its use cases are limited.  The existing short-closure syntax is superior in nearly all cases.+
-None of these additional variants are included in this RFC.+Arrow Functions were added as an alternative to Anonymous Functions. The latter can be quite verbose, even when they only perform a simple operation. This is due to a large amount of syntactic boilerplate that is needed to manually import used variables with the ''use'' keyword.
-==== Auto-capture multi-statement closures ====+While Arrow Functions solve this problem to some extent, the one-expression limit can lead to one-liners with non ideal readability, or can make them unfit for some use-cases. There are ample cases where breaking an expression to 2-3 statements is required or would improve the legibility of the code.
-Specifically, this RFC adds the following syntax:+As an example, writing the following code snippet with a single-expression Arrow Function would degrade legibility, but writing it as an Anonymous Function would be cumbersome:
 <code php> <code php>
-// An anonymous, locally available function. +$guests = array_filter($users, fn ($user) { 
-// Variables are auto-captured lexically. +    $guest = $repository->findByUserId($user->id);
-// The body is a statement list, with possibly a return statement; +    return $guest !== null && in_array($guest->id, $guestsIds);
-$c = 1; +});
-$foo = fn($a, $b):int { +
-  $val = $a * $b;+
-  return $val * $c;+
 </code> </code>
-The syntax choice here leads to the following consistent syntactic meanings:+===== Discussion on auto-capture =====
-  * The ''=>'' symbol always means "evaluates to the expression on the right," in all circumstances.  (Named functions, anonymous functions, arrays, and ''match()''.) +Auto capture was first introduced by Arrow Functions.
-  * ''{ ... }'' denotes a statement list, potentially ending in a ''return''+
-  * The ''function'' keyword indicates a function that has no auto-capture+
-  * The ''fn'' keyword indicates a function that will auto-capture variables, by value. +
-  * A function with a name is declared globally at compile time.  A function without a name is declared locally as a closure at runtime.+
-These rules are easily recognizable and learnable by developers.+In the past, there had been reticence about auto-capture that has kept it out of evolutions in closures.  Mostly that has boiled down to a few concerns: Implementation difficulties, performance, and debugability.
-The ''use'' keyword may still be used with auto-capturing closures if desired, to support capturing by reference or to capture variables to use in a variable-variable expression+Implementation difficulties arise from by-reference or by-variable semantics, especially when supporting dynamic means of accessing variables like variable-variables, compact(), or eval(). In this proposal and in Arrow Functions, the implementation difficulties are eliminated by using by-value semantics and requiring dynamically accessed variables to be captured explicitly.
-<code php> +As noted in the benchmarks section, the implementation offered here has effectively no performance impact either way. 
-$c = 1; + 
-$foo = fn($a, $b) use (&$c):int { +In the majority of cases where closures are used in practice, the code involved is short enough that debugging is not hampered by automatic capture.  They are usually only a few lines long, easily small enough to fit into a developer's short term memory while reading it.  What variables are captured is visually self-evident.  
-  $val = $a * $b; + 
-  return $val * $c; +Potentially confusing behavior is further mitigated by PHP's (correct) use of by-value capture, which minimizes the potential for inadvertent confusing changes to values from closures. 
-}; + 
-</code>+Furthermore, as noted, PHP is unusual in requiring explicit capture.  The only other language in widespread use that does so is C++.  Most languages get along fine without that extra step. 
 +For those few cases in which, for whatever reason, the developer is concerned about auto-capture reducing debugability or about accidental capture, the existing explicit-only syntax remains valid and unchanged.
-In practice, we anticipate the ''use'' keyword to be rarely used.+==== Using variables from the parent block ====
-==== Explicit capture ====+Using variables from the parent block is not unusual in PHP. We do it all the time in loops.
-The proposed syntax supports explicit capture with the ''use'' keyword. Auto-capture and explicit capture can coexist in the same function declaration.+In the following example, the loop uses three variables from the parent block. We have learned to recognize that what follows a ''foreach'', ''for'', or ''while'' keyword can do that.
 <code php> <code php>
-$c = 1;+$guests = [];
-fn () use ($a, &$b) { +foreach ($users as $user) { 
-    return $a + $b + $c; // $a is explicitly captured by value +    $guest = $repository->findByUserId($user->id);
-                         // $b is explicitly captured by reference +    if ($guest !== null && in_array($guest->id, $guestsIds)) { 
-                         // $c is auto-captured by value+        $guests[] = $guest; 
 +    }
 } }
 </code> </code>
-This allows auto-capturing multi-statement closures to match long closures in functionality. Without this, it could be necessary to switch back and forth between the auto-capturing syntax and the long closure syntax when capturing by reference is needed.+In the following example, the function uses two variables from the parent block, which should not be more surprising than with a loop once we have learned that what follows a ''fn'' keyword can do that, like we did with ''foreach''.
-We expect that explicitly capturing by value will be rare in practice.+<code php> 
 +$guests = array_filter($users, fn ($user) { 
 +    $guest = $repository->findByUserId($user->id); 
 +    return $guest !== null && in_array($guest->id, $guestsIds); 
 +});
 +</code>
-==== Auto-capture semantics ====+However, the comparison stops here. These two examples do not behave equally with regard to side effects: variable assignments to the ''$guest'' and ''$user'' variables in the loop can be observed after the loop, but the same is not true with the Short Closure.
-The auto-capture semantics presented here are designed to be intuitive and have negligible performance impact.+==== Capture is by-value, no unintended side-effects ====
-Auto-capturing multi-statement closures can access all variables in their declaring scope with the variable access syntax (e.g. ''$var''):+It is important to note that the default capture mode in Anonymous Functions, Arrow Functions, and Short Closures is by-value. This purposefully differs from the semantics commonly found in other programming languages. 
 +A by-value capture means that it is not possible to modify any variables from the outer scope:
 <code php> <code php>
 $a = 1; $a = 1;
-$b = 2; 
 $f = fn () { $f = fn () {
-    print $a + $b;+    $a++;          // Has no effect outside of the function 
 +    $tmp = $a + 1; // Has no effect outside of the function 
 +    return $tmp;
 }; };
-$f(); // prints "3"+print $a; // prints "1" 
 +$f();
 +print $a; // prints "1" (again)
 </code> </code>
-Accessed variables are bound //by value// at the time of the function declaration:+Conversely, the outer scope cannot modify variables in the function: 
 <code php> <code php>
Line 160: Line 186:
 $f();     // prints "1" (again) $f();     // prints "1" (again)
 </code> </code>
 +Because variables are bound by-value, the confusing behaviors often associated with closures do not exist. As an example, the following code snippet demonstrates such a behavior in JavaScript:
 +<code javascript>
 +// JavaScript
 +var fns = [];
 +for (var i = 0; i < 3; i++) {
 +    fns.push(function() {
 +        console.log(i);
 +    });
 +}
 +for (var k in fns) {
 +    var fn = fns[k];
 +    fn(); // Prints "3", "3", "3"
 +}
 +</code>
 +In PHP the behavior is intuitive and less confusing:
 <code php> <code php>
-$a = 1; +// PHP 
-$f = fn () { +$fns = []; 
-    $a++;+for ($i = 0; $i < 3; $i++) { 
 +    $fns[] = fn () { 
 +        print $i; 
 +    }; 
 +}
 +foreach ($fns as $fn) { 
 +    $fn(); // Prints "0", "1", "2" 
 +}
 +</code>
 +In JavaScript the same output can be obtained by declaring ''i'' with the ''let'' keyword. Using the ''var'' keyword, and loops, is largely discouraged. However ''i'' is still captured by-variable (not to be confused with by-value), so the anonymous functions can still modify the value of ''i''. A different behavior can be obtained with the ''const'' keyword. 
 +In PHP, the variable is captured by-value, thus entirely avoiding the confusion. 
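The by-value behavior in loops can already be checked with the existing Arrow Function syntax. The following is a minimal sketch, assuming PHP 7.4 or later; the variable names ''$fns'' and ''$results'' are illustrative only and do not come from the RFC.

<code php>
// Each closure captures the value of $i at the time the closure is created.
$fns = [];
for ($i = 0; $i < 3; $i++) {
    $fns[] = fn () => $i;
}

// Calling the closures after the loop still yields the captured values,
// not the final value of $i.
$results = array_map(fn ($fn) => $fn(), $fns);

var_dump($results === [0, 1, 2]); // bool(true)
</code>

A Short Closure as proposed here would be expected to behave the same way, since it inherits the capture semantics of Arrow Functions.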
 +Of course, functions can have side-effects when accessing mutable values such as objects or resources. The following example demonstrates this: 
 +<code php> 
 +$d = new DateTime(); 
 +$fn1 = fn () { 
 +    $d->modify('+ 1 day'); // Has an effect on the object bound to $d
 }; };
-print $a; // prints "1" +$fn2 = function () use ($d) { 
-$f(); +    $d->modify('+ 1 day'); // Has an effect on the object bound to $d 
-print $a; // prints "1" (again)+}; 
 +$fn3 = function (DateTime $d) { 
 +    $d->modify('+ 1 day'); // Has an effect on the object bound to $d 
 +};
 </code> </code>
-Because variables are bound by value, the potential for "spooky action at a distance" is minimized.  Captured scalar values changed inside a closure will not "leak" to other parts of the code.  Objects captured inside a closure may have changes that propagate, depending on the object, but that is no different than objects used in any other function or object, and developers are used to being aware of that potential.+===== Auto-capture semantics =====
-This is the behavior of long closures with explicit capture and of arrow functions.+The RFC inherits the auto-capture semantics of Arrow Functions. These semantics can be stated as follows: 
-For performance reasons, only the variables that are directly accessed with the variable access syntax in the closure are auto-captured. This excludes dynamic means of accessing variables, such as the variable-variable syntax.  This matches arrow functions.+> Short Closures can access a snapshot of the variable bindings of their declaring scope by accessing variables literally. The snapshot is taken when the function is declared. Assignments to variables do not have an effect on the declaring scope.
-Additionally, variables that are always assigned by the closure before being read are not captured, since this is not needed. This differs from arrow functions (which rarely assign to a value, so that situation does not come up).+This can also be stated as follows:
-We can express these semantics more succinctly like this: Auto-capturing multi-statement closures capture at least all the variables that are directly accessed by the closure.+> Short Closures can read variables of their declaring scope by accessing variables literally. The values of these variables are the ones that were bound to them at function declaration. Assignments to variables do not have an effect on the declaring scope.
-The "at least" part has only marginal effect aside from performance, and is not relevant for most programs.  Whether a variable is captured or not may only be observed through reflection, or through object destructors (because capturing may impact the exact moment at which they are called).+This is implemented by binding the value of the declaring scope variables to local variables in the function. This is referred to as //capture// in this RFC.
-==== Implementation details ====+This RFC leaves unspecified which variables are captured, as long as these semantics are maintained. 
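The snapshot semantics stated above can be observed today with Arrow Functions, from which this RFC inherits them. The following is a minimal sketch, assuming PHP 7.4 or later; the names ''$add'' and ''$bump'' are illustrative only.

<code php>
$y = 1;
$add = fn ($x) => $x + $y; // snapshot taken here: $y is 1

$y = 100;                  // a later change in the declaring scope is not seen by $add
$result = $add(1);

$bump = fn () => $y += 1;  // snapshot taken here: $y is 100
$bumped = $bump();         // the assignment only affects the closure's local copy

var_dump($result); // int(2)
var_dump($bumped); // int(101)
var_dump($y);      // int(100)
</code>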
 +==== Optimization ====
-Auto-capturing all variables directly accessed by closure body will commonly capture too many variables. In the following example, the variable ''$tmp'' would be captured although this is not necessary because it is always assigned before being read (remember that variable assignments do not have an effect outside of the closure).+A naive approach would capture //all// the variables that are accessed literally by the closure. This will commonly capture variables that are not necessary to maintain these semantics. In the following example, the variable ''$tmp'' would be captured although this is not necessary because it is always assigned before being read (remember that variable assignments do not have an effect outside of the closure).
 <code php> <code php>
 $tmp = 5; $tmp = 5;
-fn() {+fn () {
     $tmp = foo();     $tmp = foo();
     bar($tmp);     bar($tmp);
Line 197: Line 266:
 </code> </code>
-A naive capture mechanism would unnecessarily capture ''$tmp'', resulting in wasted memory usage.+This approach would result in a waste of memory or CPU usage.
-Capture analysis, the process of choosing which variables to capture, is based on [[https://en.wikipedia.org/wiki/Live-variable_analysis|live-variable analysis]]. This reuses the Optimizer's existing implementation of live-variable analysis. We use this to conservatively find the variables for which a path exists in the function's code in which the variable may be read before being assigned. These variables are the minimum set we need to capture.+The implementation proposed in this RFC prevents this by attempting to capture the smallest possible set of variables necessary to maintain these semantics. In practice, Short Closures end up capturing the same set of variables that Anonymous Functions with a manually curated capture list would have captured. This was observed on the PHPStan code base by converting all Anonymous Functions to Short Closures, and looking at which variables were automatically captured after that.
-In practice, auto-capturing multi-statement closures end up capturing the same set of variables as a long closure with explicit capture would have captured. This was verified on the PHPStan code base by converting all closures to auto-capturing multi-statement closures, and observing which variables were captured.+These implementation details are irrelevant for most purposes, as they do not have an effect on the behavior of the program, apart from the marginal cases listed in the next subsection. However, the exact behavior can be defined as follows:
-This retains the semantics described in the previous section, so an understanding of these semantics is enough to reason about auto-capturing multi-statement closures.+  * If there is a possibility that a variable may be read by the function before binding it, it is captured 
 +  * When inspecting the code, the following operations are assumed to always bind a variable without reading it: 
 +    * Variable assignments 
 +    * Variable assignments by reference 
 +    * ''global'' 
 +    * ''static'' 
 +    * ''unset()'' 
 +    * This excludes assignments to object properties (they never bind the variable) and assignments to array dimensions (they read the variable) 
 +  * In all other situations in which a variable is used, it is assumed that it is read
-==== Benchmarks ====+This optimization is not applied to Arrow Functions because variable bindings are unusual in these functions.
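To relate these rules to a hand-written ''use'' list, the following sketch uses the existing Anonymous Function syntax, since the capture analysis itself is internal to the proposed implementation. It assumes PHP 8.1 or later for ''ReflectionFunction::getClosureUsedVariables()''; the variables ''$tmp'' and ''$limit'' are illustrative only.

<code php>
$tmp = 5;
$limit = 10;

// $tmp is always assigned before being read inside the function,
// so a manually curated capture list only needs $limit.
$f = function () use ($limit) {
    $tmp = 3;
    return $tmp + $limit;
};

$used = (new ReflectionFunction($f))->getClosureUsedVariables();

var_dump(array_keys($used)); // only "limit" is listed
var_dump($f());              // int(13)
var_dump($tmp);              // int(5), the outer $tmp is untouched
</code>

Under the rules listed above, a Short Closure with the same body would be expected to capture the same single variable.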
-In benchmarks, the implementation in the 1.0 version of this RFC showed a notable CPU and memory increase when using auto-capturing multi-statement closure in some cases.+==== Observable effects of capture ====
-The 2.0 version, proposed here, has only marginal impact compared to PHP 8.1, well within the margin of error for profiling tools. In some cases the profiling run shows the auto-capture version being slightly more performant, which is likely just random test jitter between runs.  We therefore conclude that the performance impact of this approach is effectively zero.+As long as the semantics are maintained, whether a variable is captured or not is largely irrelevant for most purposes, and can be observed only in marginal cases. These cases are listed here.
-The capture analysis approach described above makes auto-capturing multi-statement closures as efficient as long closures with explicit capture.+  * When debugging: Whether a variable is captured or not may be visible in the list of variables in scope in debuggers. Captured variables are local variables in the Closure, initialized to the captured value.\\ \\ 
 +  * Via reflection: Captured variables will be visible in ReflectionFunction.\\ \\ 
 +  * Via dynamic variable access: Means to access variables dynamically, such as the variable-variable syntax or the ''compact()'' function, whose use is largely discouraged in modern PHP, can only see variables that are captured.\\ \\ 
 +  * Via destructors: Capture can extend the lifetime of objects. Optimized capture will prevent this when the variable holding the object is never read before being written by the Closure. An observable effect is that a destructor would be called later if the object was captured. Note that destructor timing is undefined in PHP, especially when reference cycles exist.\\ \\ 
 +  * Via resource usage: Capturing too much could increase memory or CPU usage. The optimized capture used in this RFC prevents this. It ends up capturing the same variables that would have been captured by a manually curated ''use'' list.
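The destructor case can be illustrated with the existing explicit-capture syntax. This is a sketch only, assuming PHP 8.0 or later and the reference-counting behavior of the current engine in the absence of reference cycles; as noted above, destructor timing is not guaranteed. The ''Probe'' class is hypothetical and exists only for this illustration.

<code php>
final class Probe
{
    /** @var string[] Names of destroyed instances, in order of destruction. */
    public static array $log = [];

    public function __construct(private string $name) {}

    public function __destruct()
    {
        self::$log[] = $this->name;
    }
}

$captured = new Probe('captured');
$notCaptured = new Probe('not captured');

// Only $captured is kept alive by the closure.
$f = function () use ($captured) {
    return $captured;
};

unset($captured, $notCaptured);
$afterUnset = Probe::$log;   // only the non-captured object has been destroyed

unset($f);
$afterRelease = Probe::$log; // releasing the closure releases the captured object

var_dump($afterUnset, $afterRelease);
</code>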
-For more benchmark details, see: https://gist.github.com/arnaud-lb/d9adfbf786ce9e37c7157b0f9c8d8e13+==== Implementation details ====
-==== Why add another function mechanism? ====+The capture analysis used in this RFC will only capture the variables that may be read before being assigned by the function. This uses the Optimizer's implementation of [[https://en.wikipedia.org/wiki/Live-variable_analysis|live-variable analysis]].
-Long Closures in PHP can be quite verbose, even when they only perform a simple operation. This is due to a large amount of syntactic boilerplate that is needed in “long closures” to manually import used variables with the ''use'' keyword.+This maintains the semantics described earlier, so an understanding of these semantics is enough to reason about Short Closures.
-While one-line arrow functions solve this problem to some extent, there are ample cases that require a 2-3 statement body.  That is still short enough that the chances of a developer confusing in-function and out-of-function variables is very remote, but the burden of manually closing over 3-4 variables is relatively high.+===== Benchmarks =====
-One example is when you are within a class method with multiple arguments and you want to simply return a closure that uses all the arguments, using the “use” keyword to list all the arguments is entirely redundant and pointless.+In benchmarks, the implementation in the 1.0 version of this RFC showed a notable CPU and memory increase when using auto-capturing multi-statement closures in some cases.
-Then there are often use-cases with ''array_filter()'' and similar functions where the ''use()'' just adds visual noise to what the code actually means.+The 2.0 version, proposed here, has only marginal impact compared to PHP 8.1, well within the margin of error for profiling tools. In some cases the profiling run shows the Short Closure version being slightly more performant, which is likely just random test jitter between runs.  We therefore conclude that the performance impact of this approach is effectively zero.
-The trend in PHP in recent years has been toward more compact but still readable syntax that eliminates redundancy.  Property promotion, arrow functions, the nullsafe operator, and similar recent well-received additions demonstrate this trend.  This RFC seeks to continue that trend to make PHP more pleasant to write while still being just as clear to read.+The capture analysis approach described above makes Short Closures as efficient as Anonymous Functions.
-==== Methods ==== +For more benchmark details, see: https://gist.github.com/arnaud-lb/d9adfbf786ce9e37c7157b0f9c8d8e13
- +
-As methods cannot be anonymous, there are no impacts on methods from this RFC.+
-==== What about long-closures? ====+===== What about Anonymous Functions? =====
-The existing all-explicit multi-line closure syntax remains valid, and there is no intent to deprecate it.+The existing Anonymous Function syntax remains valid, and there is no intent to deprecate it.
-==== Multi-line expressions ====+===== Multi-line expressions =====
 There has been related discussion of multi-line expressions, specifically in the context of ''match()'' arms.  We considered whether multi-line expressions made sense as an alternative approach, but decided against it as that introduces considerably more edge cases both syntactically and in the engine. There has been related discussion of multi-line expressions, specifically in the context of ''match()'' arms.  We considered whether multi-line expressions made sense as an alternative approach, but decided against it as that introduces considerably more edge cases both syntactically and in the engine.
Line 245: Line 324:
 $c = ...; $c = ...;
 $ret = match ($a) { $ret = match ($a) {
-  1, 3, 5 => (fn() {+  1, 3, 5 => (fn () {
     $val = $a * $b;     $val = $a * $b;
     return $val * $c;     return $val * $c;
   })(),   })(),
-  2, 4, 6 => (fn() {+  2, 4, 6 => (fn () {
     $val = $a + $b;     $val = $a + $b;
     return $val + $c;     return $val + $c;
Line 258: Line 337:
 While sub-optimal, it may be sufficient for the few times that a multi-statement ''match()'' arm is needed. While sub-optimal, it may be sufficient for the few times that a multi-statement ''match()'' arm is needed.
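For reference, the same workaround can be written with syntax that already exists, at the cost of an explicit ''use'' list. This is a minimal sketch assuming PHP 8.0 or later (for ''match''); the concrete values of ''$a'', ''$b'', and ''$c'' are illustrative only.

<code php>
$a = 3;
$b = 2;
$c = 4;

// Each arm is an immediately-invoked Anonymous Function with explicit capture.
$ret = match ($a) {
    1, 3, 5 => (function () use ($a, $b, $c) {
        $val = $a * $b;
        return $val * $c;
    })(),
    2, 4, 6 => (function () use ($a, $b, $c) {
        $val = $a + $b;
        return $val + $c;
    })(),
};

var_dump($ret); // int(24)
</code>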
-==== Examples ====+===== Comparison to other languages =====
-Closures are often used to "wrap" some behavior in other behavior.  One example provided by Mark Randall is for a throw-aware buffer.  The following is actual code he wrote:+As far as we are aware, only two languages in widespread use require variables to be explicitly closed over: PHP and C++.  All other major languages capture implicitly, as is proposed here.
-<code php> +Languages commonly capture by-variable (not to be confused with by-value) or by-reference. In practice this can lead to confusing effects, especially in loops. For that reason, PHP defaults to capturing by-value, which avoids this problem. This is discussed above in this RFC, as well as in [[rfc:arrow_functions_v2#binding_behavior|Arrow Functions]].
-$x = function () use ($to, $library, $thread, $author, $title, $library_name, $top_post) { +
-// ... +
-}; +
-From Mark: "That was just to get those variables inside a callback that could be  +===== History =====
-invoked inside a throw-aware buffering helper."+
-Another similar example is for wrapping behavior in a transaction.  Often, that is done by passing a callable to an ''inTransaction()'' method or similar.+The first discussion [[https://externals.io/message/28399|1]] around Anonymous Functions was objected to because of the lack of closures: It would be unusual for anonymous functions to not support closures, which would surprise users and limit the usefulness of the construct. At the same time, objections against closures cited implementation difficulties and performance issues, as well as potential complexity or pitfalls most commonly found in other programming languages.
-<code php> +In the same and subsequent discussions [[https://externals.io/message/34040|2]] [[https://externals.io/message/38290|3]] a solution was proposed to use explicit capture with a new keyword, ''lexical'', close in many aspects to the ''global'' keyword. Alternative syntaxes were later proposed that would allow choosing between by-reference and by-value capture, ultimately leading to the current ''use($x)'' syntax.
-public function savePost($user, $date, $title, $body, $tags) { +
-  return $this->db->inTransaction(function() use ($user, $date, $title, $body, $tags) {+
-    $this->db->query(...); +
-    $this->db->query(...); +
-    return $this->db->lastInsertId(); +
-  }); +
-In this case, the ''use''d variable listing is entirely redundant and pointless, much the same as constructor property promotion eliminated entirely redundant boilerplate.  (Though admittedly, the difference there was greater.)+It is unclear whether this was chosen because of technical concerns or concerns over semantics. Objections focusing on semantics appear to have been based on those most commonly found in other programming languages. These semantics differ significantly from what is proposed here. For instance, objections cite the possibility of a kind of side-effects that would not exist with by-value capture. Discussions do not appear to have occurred in the light of by-value semantics.
-As noted, inline callbacks may also need to capture multiple variables for only a short operation.+The [[rfc:short_closures|Short Closures 1.0]] RFC was declined for three main reasons [[https://externals.io/message/88394#88507|4]]: the syntax, the lack of type declarations, and implicit capture. Objections to implicit closures appear to be based on semantics that do not exist in the current RFC.
-<code php> +The [[rfc:arrow_functions_v2|Arrow Functions 2.0]] RFC was accepted with a large majority. Compared to the Short Closures 1.0 RFC, it addressed the syntax and type hints concerns, limited the body to only one expression, and kept implicit closure by-value.
-/** @var Product[] */ +
-$arr = [ ... ];+
-$wantApproved true; +===== Alternative implementations =====
-$size 'L'+
-$filtered = array_filter($arr, function ($item) use ($wantApproved, $size): bool { +A few people suggested implementing the same functionality via a different syntax, that is, basing it on the long-closure syntax with a ''use(*)'' or ''use(...)'' syntax to indicate "capture everything that makes sense" rather than building on the short-closure syntax which already "captures everything that makes sense."
-  if ($wantApproved) { +
-    return $item->isApproved();+
-  } else if ($size) {+
-    return $item->size() == $size; +
-  } else { +
-    return false; +
-  } +
-}); +
-In this case, again, the explicit ''use'' clause offers no clarity, only visual noise. +The resulting behavior in either case would be identical, making it a largely aesthetic or philosophical distinction. The authors felt that the more compact syntax is preferable, for several reasons:
-==== Comparison to other languages ==== +  - The longer form introduces more visual noise to achieve the same result. 
- +  - PHP developers have been using the ''fn()'' syntax for a number of years now, and should be sufficiently familiar with the concept of auto-capture. 
-As far as we are aware, only two languages in widespread use require variables to be explicitly closed over: PHP and C++.  All other major languages capture implicitly, as is proposed here. +  - With the improved capture logic, many of the arguments for the explicit capture syntax go away. 
 +  - Using the longer ''function'' keyword without a ''use'' statement at all would be a semantic BC break, which is not acceptable. 
 +  - If converting from a single line short-lambda to a 2 line closure, switching to the long-form syntax is more work than just switching ''=>'' for ''{}''.
-Many languages tend to capture by variable. In practice this can lead to surprising effects, especially in loops +For those reasons, the authors went with the ''fn()''-derived syntax shown here.
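The claim that both spellings behave identically can be checked against syntax that already exists in PHP 7.4 and later. The following is only a minimal sketch, not part of the proposal: it compares a long closure with an explicit by-value ''use'' list against a single-expression Arrow Function. The names ''$factor'', ''$long'' and ''$short'' are hypothetical and chosen for illustration.

<code php>
<?php
// Hypothetical value captured by both closures below.
$factor = 2;

// Long closure: $factor is listed explicitly and captured by value.
$long = function (int $x) use ($factor): int {
    return $x * $factor;
};

// Arrow Function (PHP 7.4+): $factor is captured implicitly, also by value.
$short = fn (int $x): int => $x * $factor;

// Reassigning the outer variable afterwards affects neither closure,
// because each one holds its own copy taken at creation time.
$factor = 100;

var_dump($long(3));  // int(6)
var_dump($short(3)); // int(6)
</code>

Both calls return the same result, which suggests that the choice between an explicit ''use'' list and auto-capture is one of notation rather than behavior when capture is by value.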
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
Line 331: Line 387:
 ===== Future Scope ===== ===== Future Scope =====
-The proposal section detailed three additional possible combinations of function functionality that are not included here.  While it is not likely that they have much use, the pattern here clearly lays out what they would be were a future RFC to try and implement them. +These are some possible future extensions, but the authors don't necessarily endorse them.
-Specifically, they would be:+==== Explicit use list on Short Closures ==== 
 +It would be possible to extend the Short Closure syntax to allow an explicit use list:
 <code php> <code php>
-// Global scope +$fn = fn () use ($a, &$b) { 
-fn foo($a, $b): int { +One anticipated use-case is to selectively capture some variables by-reference.
-  $val = $a * $b; +
-  return $val * $c; +
-fn foo($a, $b): int => $a * $b * $c; +There are at least two possible variations of this extension. In one of them, the use list is merged with auto-capture, so that explicit uses and auto-capture can coexist on the same function. In another, the use list disables auto-capture on the function.
-$foo = function($a, $b) use ($c): int => $a * $b * $c; +This RFC initially proposed the first possibility. This is not included in the current version because this appeared to create confusion.
-Those versions are //not// included in this RFC.  +==== Optimize Arrow Functions ====
-===== Proposed Voting Choices =====+This RFC proposes an optimized auto-capture. It would be possible to apply this optimization to Arrow Functions as well, but this would be a breaking change in some rare cases.
-This is a simple Yes/No vote, requiring 2/3 to pass.+This is not included in this RFC because most Arrow Functions would not benefit from this. 
 +===== Vote ===== 
 +This is a simple Yes/No vote, requiring 2/3 to pass.  Vote ends on 15 July 2022. 
 +<doodle title="Add Short Closures as described in PHP 8.2?" auth="jrf" voteType="single" closed="false"> 
 +   * Yes 
 +   * No 
 ===== Patches and Tests ===== ===== Patches and Tests =====
Line 367: Line 432:
 ===== References ===== ===== References =====
-[[rfc:short-functions|PHP RFC: Short Functions]]+  * [[rfc:short-functions|PHP RFC: Short Functions]]   
 +  * [[rfc:arrow_functions_v2|PHP RFC: Arrow Functions]]   
 +  * [[rfc:short_closures|PHP RFC: Short Closures 1.0]]
 ===== Changelog ===== ===== Changelog =====
-2.0: Updated for new patch; reduced discussion of short-function RFC and related topics; expanded discussion of the capture rules and noted benchmarks showing minimal performance impact+2.0: Updated for new patch; reduced discussion of short-function RFC and related topics; expanded discussion of the capture rules and noted benchmarks showing minimal performance impact; renamed to "Short Closures 2.0" 
rfc/auto-capture-closure.1653642770.txt.gz · Last modified: 2022/05/27 09:12 by lbarnaud