rfc:closures
Differences
This shows you the differences between two versions of the page.
rfc:closures [2008/06/26 15:44] – Revised patch, using objects instead of resources, added tests chris_se | rfc:closures [2009/03/17 14:27] – Note $this isn't available any more scottmac | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Request for Comments: Lambda functions and closures ====== | ====== Request for Comments: Lambda functions and closures ====== | ||
- | * Version: 1.1 | + | * Version: 1.2 |
- | * Date: 2008-06-26 | + | * Date: 2008-07-01 |
- | * Author: Christian Seiler < | + | * Author: Christian Seiler < |
- | * Thanks to: Dmitry Stogov < | + | * Status: |
- | * Status: | + | |
This RFC discusses the introduction of compile-time lambda functions and closures in PHP. | This RFC discusses the introduction of compile-time lambda functions and closures in PHP. | ||
Line 11: | Line 10: | ||
At the end of 2007, a patch was proposed that would add lambda functions (but without closures) to PHP. During the discussion on the mailing list, several people suggested that, without support for closures, lambda functions are not useful enough to add to PHP. This proposal describes a viable method of adding lambda functions with closure support to PHP. | At the end of 2007, a patch was proposed that would add lambda functions (but without closures) to PHP. During the discussion on the mailing list, several people suggested that, without support for closures, lambda functions are not useful enough to add to PHP. This proposal describes a viable method of adding lambda functions with closure support to PHP. | ||
+ | |||
+ | The initial posting of this proposal created quite a bit of discussion on the list. This updated proposal, which includes an updated patch, is intended to incorporate the results of that discussion. Dmitry Stogov made many of the changes to Christian Seiler's original patch. | ||
===== Why do we need closures and lambda functions? ===== | ===== Why do we need closures and lambda functions? ===== | ||
Line 18: | Line 19: | ||
==== Lambda Functions ==== | ==== Lambda Functions ==== | ||
- | Lambda functions allow the quick definition of throw-away functions that are not used elsewhere. | + | Lambda functions allow the quick definition of throw-away functions that are not used elsewhere. |
- Define the callback function elsewhere. This distributes code that belongs together throughout the file and decreases readability. | - Define the callback function elsewhere. This distributes code that belongs together throughout the file and decreases readability. | ||
Line 33: | Line 34: | ||
| | ||
} | } | ||
- | </ | + | </ |
- Use the present create_function() in order to create a function at runtime. This approach has several disadvantages: | - Use the present create_function() in order to create a function at runtime. This approach has several disadvantages: | ||
Line 39: | Line 40: | ||
==== Closures ==== | ==== Closures ==== | ||
- | Closures provide a very useful tool in order to make lambda functions even more useful. Just imagine you want to replace ' | + | Closures provide a very useful tool in order to make lambda functions even more useful. Just imagine you want to replace ' |
- Use create_function(). But then you may only pass literal values (strings, integers, floats) into the function, objects at best as clones (if var_export() allows for it) and resources not at all. And you have to worry about escaping everything correctly. Especially when handling user input this can lead to all sorts of security issues. | - Use create_function(). But then you may only pass literal values (strings, integers, floats) into the function, objects at best as clones (if var_export() allows for it) and resources not at all. And you have to worry about escaping everything correctly. Especially when handling user input this can lead to all sorts of security issues. | ||
Line 53: | Line 54: | ||
===== Common misconceptions ===== | ===== Common misconceptions ===== | ||
- | ? | + | - Lambda functions / closures are **not** a way of dynamically extending classes by additional methods at runtime. There are several other possibilities to do this, including the already present __call semantics. | ||
+ | |||
+ | - PHP's notion of scope is quite different from the notion of scope that other languages define. Combine this with variable variables ($$var) and it becomes clear that automatically detecting which variables from the outer scope are referenced inside a closure is impossible. Also, since global variables, for example, are not visible inside functions by default either, automatically making the parent scope available would break with the language concept PHP currently follows. | ||
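The second point can be illustrated with a minimal sketch. The variable and closure names below are hypothetical and chosen only for illustration; the sketch assumes the ''use'' syntax proposed further down in this RFC.

```php
<?php
$greeting = 'Hello';

// Without an explicit import, the parent-scope variable is not defined
// inside the function body, just as in any ordinary PHP function.
$withoutImport = function () {
    return isset($greeting);
};

// With "use", the value of $greeting is copied into the closure
// at the moment the closure is created.
$withImport = function () use ($greeting) {
    return isset($greeting) ? $greeting : null;
};

var_dump($withoutImport()); // bool(false)
var_dump($withImport());    // string(5) "Hello"
```

The explicit list keeps the behaviour in line with how ''global'' already works: nothing from the outer scope is visible unless it is named.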
===== Proposal and Patch ===== | ===== Proposal and Patch ===== | ||
The following proposal and patch implement compile-time lambda functions and closures for PHP while keeping the patch as simple as possible. | The following proposal and patch implement compile-time lambda functions and closures for PHP while keeping the patch as simple as possible. | ||
- | |||
==== Userland perspective ==== | ==== Userland perspective ==== | ||
Line 65: | Line 67: | ||
The patch adds the following syntax as a valid expression: | The patch adds the following syntax as a valid expression: | ||
- | <code php> | + | <code php> |
- | | + | |
- | </ | + | </ |
- | (The & is optional and indicates | + | The & is optional and indicates that the function |
Example usage: | Example usage: | ||
- | <code php> | + | <code php> |
| | ||
- | </ | + | </ |
The variable $lambda then contains a callable resource that may be called through different means: | The variable $lambda then contains a callable resource that may be called through different means: | ||
- | <code php> | + | <code php> |
| | ||
| | ||
| | ||
- | </ | + | </ |
This allows for simple lambda functions, for example: | This allows for simple lambda functions, for example: | ||
- | <code php> | + | <code php> |
| | ||
| | ||
Line 94: | Line 96: | ||
| | ||
} | } | ||
- | </ | + | </ |
+ | |||
+ | You can even put the lambda function inline, for example: | ||
+ | |||
+ | <code php> | ||
+ | function replace_spaces ($text) { | ||
+ | return preg_replace_callback ('/( +) /', | ||
+ | function ($matches) { | ||
+ | return str_replace ($matches[1], | ||
+ | }, $text); | ||
+ | } | ||
+ | </code> | ||
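As a further illustration of an inline lambda function, here is a small sketch that sorts an array by string length with usort(). The data is made up for the example; only the built-in functions usort() and strlen() are relied upon.

```php
<?php
$words = array('pear', 'fig', 'banana');

// The comparison callback is written directly at the point of use,
// so no separately named helper function is needed.
usort($words, function ($a, $b) {
    return strlen($a) - strlen($b);
});

echo implode(',', $words), "\n"; // fig,pear,banana
```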
=== Closure support === | === Closure support === | ||
- | Since there was some discussion on the PHP internals list on about what the best syntax | + | In order to make use of variables defined in the parent scope, this patch proposes |
- | == Closure support via '' | + | <code php> |
+ | function (normal parameters) use ($var1, $var2, & | ||
+ | </code> | ||
- | The patch implements closures by defining an additional keyword ' | + | The variables $var1, $var2 and $refvar defined in the parent scope will be visible inside |
- | <code php> | + | Simple example: |
+ | |||
+ | <code php> | ||
| | ||
- | $map = function ($text) | + | $map = function ($text) |
- | | + | |
if (strpos ($text, $search) > 50) { | if (strpos ($text, $search) > 50) { | ||
| | ||
Line 114: | Line 130: | ||
} | } | ||
}; | }; | ||
- | | + | |
} | } | ||
- | </ | + | </ |
- | The variables $search and $replacement are variables in the scope of the function replace_in_array() and the lexical keyword imports these variables | + | The variables $search and $replacement are variables in the scope of the function replace_in_array() and they are imported |
- | The current patch imports variables as a value and allows to import them as a reference if an & is supplied before the variable name. The referencing behaviour is still subject to discussion. | + | === Closure lifetime === |
- | == Closures | + | Closures |
- | Another patch implements closures by using the ' | + | <code php> |
- | + | | |
- | <code php> | + | return |
- | | + | // or: lexical |
- | $map = function ($text) use ($search, $replacement) { | + | |
- | if (strpos ($text, $search) > 50) { | + | |
- | | + | |
- | } else { | + | |
- | return $text; | + | |
- | } | + | |
}; | }; | ||
- | | ||
} | } | ||
- | </ | + | </ |
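A minimal sketch of a closure outliving the function that created it. The helper make_multiplier() is a hypothetical name, not part of the patch; the sketch only assumes the ''use'' syntax described above.

```php
<?php
// Hypothetical factory: returns a closure that remembers $factor
// after make_multiplier() itself has returned.
function make_multiplier($factor) {
    return function ($value) use ($factor) {
        return $value * $factor;
    };
}

$double = make_multiplier(2);
$triple = make_multiplier(3);

echo $double(21), "\n"; // 42
echo $triple(5), "\n";  // 15
```

Each call to the factory yields an independent closure with its own copy of the imported variable.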
- | The variables $search and $replacement are variables in the scope of the function replace_in_array() and they are imported into the scope of the closure upon creation of the closure. | + | === References vs. Copies === |
- | The current patch imports | + | By default, all imported |
+ | |||
+ | Example: | ||
+ | |||
+ | <code php> | ||
+ | $x = 1; | ||
+ | $lambda1 = function () use ($x) { | ||
+ | $x *= 2; | ||
+ | }; | ||
+ | $lambda2 = function () use (&$x) { | ||
+ | $x *= 3; | ||
+ | }; | ||
+ | $lambda1 (); | ||
+ | var_dump ($x); // gives: 1 | ||
+ | $lambda2 (); | ||
+ | var_dump ($x); // gives: 3 | ||
+ | </code> | ||
+ | |||
+ | Support for references is necessary in order to achieve true closures (as in JavaScript, where a variable originating in the parent scope can be modified by closures), while copying by default fits best with the current semantics of PHP and does not cause headaches in loops (for example, when importing a loop index into a closure). | ||
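The loop case can be sketched as follows. The array names are hypothetical; the sketch compares importing the loop index by value with importing it by reference.

```php
<?php
$byValue = array();
$byReference = array();

for ($i = 0; $i < 3; $i++) {
    // Copy: each closure keeps the value $i had when it was created.
    $byValue[$i] = function () use ($i) { return $i; };
    // Reference: every closure shares the one loop variable.
    $byReference[$i] = function () use (&$i) { return $i; };
}

// Call every stored closure and collect the results.
$call = function ($f) { return $f(); };
$valueResults = array_map($call, $byValue);
$referenceResults = array_map($call, $byReference);

echo implode(',', $valueResults), "\n";     // 0,1,2
echo implode(',', $referenceResults), "\n"; // 3,3,3
```

With references, all closures observe the final value of the loop variable, which is why copying is the less surprising default here.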
=== Interaction with OOP === | === Interaction with OOP === | ||
- | If a closure is defined inside an object, the closure has full access to the current object through | + | $this support has been removed, see [[rfc/closures/ |
- | <code php> | + | If a closure is defined inside an object, the closure has full access to the current object through $this (without the need to import it explicitly) and all private and protected methods of that class. This also applies to nested closures. Example: |
+ | |||
+ | <code php> | ||
class Example { | class Example { | ||
| | ||
Line 161: | Line 191: | ||
| | ||
| | ||
- | // or: lexical $replacement; | ||
| | ||
}; | }; | ||
Line 170: | Line 199: | ||
| | ||
echo $replacer (' | echo $replacer (' | ||
- | $replacer-> | + | $example-> |
echo $replacer (' | echo $replacer (' | ||
- | </ | + | </ |
- | As one can see, defining a closure inside a class method does not change the semantics at all - it simply does not matter if a closure is defined in global scope, within a function or within a class method. The only small difference is that closures defined in class methods may also access the class and the current object via $this. | + | As one can see, defining a closure inside a class method does not change the semantics at all - it simply does not matter if a closure is defined in global scope, within a function or within a class method. The only small difference is that closures defined in class methods may also access the class and the current object via $this. Since $this is saved " |
- | === Closure lifetime === | + | Because not all closures defined in class methods need $this, it is possible to declare a lambda function to be static: |
- | Closures may live longer than the methods that declared them. It is perfectly possible to have something like this: | + | <code php> | ||
+ | class Example { | ||
+ | | ||
+ | $x = 4; | ||
+ | | ||
+ | | ||
+ | }; | ||
+ | | ||
+ | } | ||
+ | } | ||
+ | </code> | ||
- | <code php> | + | In this case, $this is not available inside the closure. This may save a lot of memory if one stores many closures that originated in objects that are no longer needed. | ||
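A minimal sketch of the difference between a regular and a static closure. It assumes a PHP version in which closures created inside methods are bound to $this (PHP 5.4 and later) and in which ReflectionFunction::getClosureThis() is available. The class Holder and its methods are hypothetical names used only for illustration.

```php
<?php
class Holder {
    // Regular closure: keeps a reference to the creating object.
    public function boundClosure() {
        return function () { return 1; };
    }

    // Static closure: no object is stored alongside the function.
    public function staticClosure() {
        return static function () { return 1; };
    }
}

$holder = new Holder();
$bound  = new ReflectionFunction($holder->boundClosure());
$static = new ReflectionFunction($holder->staticClosure());

var_dump($bound->getClosureThis() === $holder); // bool(true)
var_dump($static->getClosureThis() === null);   // bool(true)
```

Because the static closure holds no reference to the object, the object can be freed even while the closure is still stored somewhere.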
- | | + | |
- | return | + | |
- | // or: lexical | + | ==== Additional goody: __invoke ==== | ||
- | return | + | |
- | | + | Since closures implement a new type of variable that may be called dynamically (i.e. objects), the idea came up that generic callables could also be implemented. This patch adds an additional magic method __invoke that may be defined in arbitrary classes. If defined, the object itself is callable and the new special method will be invoked when the object is called. Example: | ||
- | | + | |
- | </ | + | <code php> |
+ | class Example { | ||
+ | public | ||
+ | echo "Hello World!\n"; | ||
+ | } | ||
+ | } | ||
+ | $foo = new Example; | ||
+ | $foo (); | ||
+ | </code> | ||
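Since an object with __invoke is a regular callable, it can be handed to any function that expects a callback. A short sketch, with a hypothetical Multiplier class:

```php
<?php
class Multiplier {
    private $factor;

    public function __construct($factor) {
        $this->factor = $factor;
    }

    // Called whenever the object itself is used like a function.
    public function __invoke($value) {
        return $value * $this->factor;
    }
}

$triple = new Multiplier(3);

var_dump(is_callable($triple)); // bool(true)

$result = array_map($triple, array(1, 2, 3));
echo implode(',', $result), "\n"; // 3,6,9
```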
+ | |||
+ | ==== Interaction with reflection (1) ==== | ||
+ | |||
+ | Since closures are anonymous, they do **not** appear in reflection. | ||
+ | |||
+ | However, a new method was added to the ReflectionMethod and ReflectionFunction classes: getClosure. This method returns a dynamically created closure for the specified function. Example: | ||
+ | |||
+ | <code php> | ||
+ | class Example | ||
+ | | ||
+ | } | ||
+ | |||
+ | $class = new ReflectionClass (' | ||
+ | $method = $class-> | ||
+ | $closure = $method-> | ||
+ | $closure (); | ||
+ | </code> | ||
+ | |||
+ | This example dynamically creates a callable object of the static method " | ||
+ | |||
+ | <code php> | ||
+ | class Example { | ||
+ | public | ||
+ | | ||
+ | } | ||
+ | |||
+ | $class = new ReflectionClass (' | ||
+ | $method = $class-> | ||
+ | |||
+ | $object = new Example; | ||
+ | $closure = $method-> | ||
+ | $closure (); | ||
+ | $object-> | ||
+ | $closure (); | ||
+ | </code> | ||
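Both uses of getClosure can be sketched in one self-contained piece. The Greeter class and its methods are hypothetical; the sketch assumes ReflectionMethod::getClosure() takes the target object for instance methods and needs no argument for static methods.

```php
<?php
class Greeter {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public static function hello() {
        return 'Hello World!';
    }

    public function greet() {
        return 'Hello ' . $this->name . '!';
    }
}

$class = new ReflectionClass('Greeter');

// Static method: no object is required.
$static = $class->getMethod('hello')->getClosure();

// Instance method: the closure is bound to the object passed in.
$bound = $class->getMethod('greet')->getClosure(new Greeter('PHP'));

echo $static(), "\n"; // Hello World!
echo $bound(), "\n";  // Hello PHP!
```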
+ | |||
+ | ==== Interaction with reflection (2) ==== | ||
+ | |||
+ | In addition to the previous patch, reflection support was augmented to support reflecting closure objects and returning the correct function pointer. | ||
+ | |||
+ | <code php> | ||
+ | $closure = function ($a, &$b, $c = null) { }; | ||
+ | $m = new ReflectionMethod ($closure, ' | ||
+ | Reflection:: | ||
+ | </code> | ||
+ | |||
+ | This will yield: | ||
+ | |||
+ | < | ||
+ | Method [ < | ||
+ | |||
+ | - Parameters [3] { | ||
+ | Parameter #0 [ < | ||
+ | Parameter #1 [ < | ||
+ | Parameter #2 [ < | ||
+ | | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | The following will also work (invoke is implied if no method name is specified): | ||
+ | |||
+ | <code php> | ||
+ | $m = new ReflectionMethod ($closure); | ||
+ | $p = new ReflectionParameter ($closure, 0); | ||
+ | $p = new ReflectionParameter ($closure, ' | ||
+ | $p = new ReflectionParameter (array ($closure, ' | ||
+ | </code> | ||
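A small sketch of inspecting a closure's parameters programmatically. It uses ReflectionFunction rather than ReflectionMethod, which is an assumption about what is convenient here and not something the patch prescribes; the variable names are illustrative.

```php
<?php
$closure = function ($a, &$b, $c = null) { };

$function = new ReflectionFunction($closure);

// Collect name, by-reference flag and optional flag for each parameter.
$summary = array();
foreach ($function->getParameters() as $parameter) {
    $summary[] = array(
        $parameter->getName(),
        $parameter->isPassedByReference(),
        $parameter->isOptional(),
    );
}

// A single parameter can also be reflected directly by position.
$first = new ReflectionParameter($closure, 0);

echo $first->getName(), "\n";  // a
echo count($summary), "\n";    // 3
```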
==== Zend internal perspective ==== | ==== Zend internal perspective ==== | ||
Line 193: | Line 307: | ||
The patch basically changes the following in the Zend engine: | The patch basically changes the following in the Zend engine: | ||
- | (Quite a lot here is based on an updated patch provided by Dmitry Stogov.) | + | When the compiler reaches a lambda function, except for details in the grammar, a new function zend_do_begin_lambda_function_declaration is called - which itself calls zend_do_begin_function_declaration with " |
- | + | ||
- | When the compiler reaches a lambda function, except for details in the grammar, a new function zend_do_begin_lambda_function_declaration is called - which itself calls zend_do_begin_function_declaration with " | + | |
- | Lexical variables are done via static variables: For each lexical variable an entry in the static variables hash table is added with the special semantics for the default | + | Lexical variables are done via static variables: For each lexical variable an entry in the static variables hash table is added. The entry is default |
- | An additional internal class " | + | An additional internal class " |
The ZEND_DECLARE_LAMBDA_FUNCTION opcode looks up the function in the function table (it still has its runtime function key the compiler gave it and is thus cacheable by any opcode cache), creates a new object of the Closure type and stores a copy of the op_array inside. It correctly sets the scope of the copied op_array to be the current class scope and makes sure all lexical variables are imported from the parent scope into the copied hash table of the new op_array. It also creates a reference to the current $this object. It returns the newly created object. | The ZEND_DECLARE_LAMBDA_FUNCTION opcode looks up the function in the function table (it still has its runtime function key the compiler gave it and is thus cacheable by any opcode cache), creates a new object of the Closure type and stores a copy of the op_array inside. It correctly sets the scope of the copied op_array to be the current class scope and makes sure all lexical variables are imported from the parent scope into the copied hash table of the new op_array. It also creates a reference to the current $this object. It returns the newly created object. | ||
Line 205: | Line 317: | ||
Some hooks were added to the opcode handlers, zend_call_function and zend_is_callable_ex that allow the ' | Some hooks were added to the opcode handlers, zend_call_function and zend_is_callable_ex that allow the ' | ||
- | Note that since closures that don't use $this may potentially carry on the object far too long. Therefore an additional patch variant is provided in which the compiler remembers for each lambda function whether $this is used. For that, an additional op_array-> | + | In order to make code changes as clean as possible, |
+ | |||
+ | ==== Tests ==== | ||
+ | |||
+ | The patch contains additional phpt tests that make sure closures | ||
==== The patch ==== | ==== The patch ==== | ||
- | There are six variants of the patch: | + | **Note:** The patches were already applied to PHP_5_3 and HEAD (with some minor modifications and fixes). |
- | * [[http:// | + | Current patches: |
- | * [[http:// | + | |
- | * [[http:// | + | |
- | * (The combination lexical + $this always stored is of course also possible) | + | |
- | * [[http:// | + | |
- | * [[http:// | + | |
- | * [[http:// | + | |
- | All patches contain a series of additional tests that ensure the correct functioning of closures. | + | * [[http:// |
+ | * [[http:// | ||
Older patches for completeness: | Older patches for completeness: | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
* [[http:// | * [[http:// | ||
* [[http:// | * [[http:// | ||
Line 230: | Line 347: | ||
==== BC breaks ==== | ==== BC breaks ==== | ||
- | | + | |
- | | + | |
==== Caveats / possible WTFs ==== | ==== Caveats / possible WTFs ==== | ||
Line 237: | Line 354: | ||
=== Trailing '';'' | === Trailing '';'' | ||
- | On writing '' | + | On writing '' |
- | === References | + | === Misinterpretations of the goal of closures |
- | References seem to be a major WTF w.r.t. closures. | + | As the discussion on the mailing list showed, there were quite a few misconceptions on what closures may or may not achieve. One often used suggestion was to use closures |
- | The previous proposals had the notion of always creating references, so the following would be a WTF: | + | ===== Example code ===== |
- | <code php> | + | The example |
- | for ($i = 0; $i < 10; $i++) { | + | |
- | $arr[$i] = function () { lexical $i; return $i; }; | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | This would not have worked as expected since $i is a reference and thus all created closures would reference the same variable. In order to get this right one would have had to do: | + | |
- | + | ||
- | < | + | |
- | for ($i = 0; $i < 10; $i++) { | + | |
- | | + | |
- | | + | |
- | unset ($loopIndex); | + | |
- | } | + | |
- | </code> | + | |
- | + | ||
- | This can be a WTF for people that don't expect lexical to create an actual reference. On the other hand, global and static both DO create references so that behaviour is consistent with current PHP **and** (as pointed out on the mailing list) other languages such as JavaScript also behave the same way, so we really should stay consistent. | + | |
- | + | ||
- | == Current proposal == | + | |
- | + | ||
- | For the sake of discussion, all the current patches allow the programmer to choose whether variables are to be referenced or copied. The current patches default to copying them and allow the use of an & to reference variables. **The only reason this is done** is to demonstrate that both are easily possible and that it is **only** the grammar that needs to be changed in order to change the behaviour, not the code itself. | + | |
- | + | ||
- | The discussion on the mailing list will have to come to a conclusion on which behaviour would be the best for PHP. | + | |
- | + | ||
- | === '' | + | |
- | + | ||
- | The fact that ' | + | |
===== Changelog ===== | ===== Changelog ===== | ||
+ | * 2008-08-11 Christian Seiler: Documented additional reflection improvements (see php-internals) | ||
+ | * 2008-07-15 Christian Seiler: Updated status of this RFC | ||
+ | * 2008-07-01 Christian Seiler: Updated patch yet again | ||
* 2008-06-26 Christian Seiler: Revised patch, using objects instead of resources, added tests | * 2008-06-26 Christian Seiler: Revised patch, using objects instead of resources, added tests | ||
* 2008-06-18 Christian Seiler: OOP clarifications | * 2008-06-18 Christian Seiler: OOP clarifications | ||
Line 281: | Line 375: | ||
* 2008-06-16 Christian Seiler: Small changes | * 2008-06-16 Christian Seiler: Small changes | ||
* 2008-06-16 Christian Seiler: Initial creation | * 2008-06-16 Christian Seiler: Initial creation | ||
- |
rfc/closures.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1