rfc:closures
Differences
This shows you the differences between two versions of the page.
rfc:closures [2008/06/26 15:44] chris_se Revised patch, using objects instead of resources, added tests |
rfc:closures [2017/09/22 13:28] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Request for Comments: Lambda functions and closures ====== | ||
- | * Version: 1.1 | ||
- | * Date: 2008-06-26 | ||
- | * Author: Christian Seiler < | ||
- | * Thanks to: Dmitry Stogov < | ||
- | * Status: Under Discussion | ||
- | |||
- | This RFC discusses the introduction of compile-time lambda functions and closures in PHP. | ||
- | |||
- | ===== Introduction ===== | ||
- | |||
- | At the end of 2007, a patch was proposed that would add lambda functions (but without closures) to PHP. During the discussion on the mailing list, several people suggested that lambda functions without closure support are not useful enough to add to PHP. This proposal describes a viable method of adding lambda functions with closure support to PHP. | ||
- | |||
- | ===== Why do we need closures and lambda functions? ===== | ||
- | |||
- | Closures and lambda functions can make programming much easier in several ways: | ||
- | |||
- | ==== Lambda Functions ==== | ||
- | |||
- | Lambda functions allow the quick definition of throw-away functions that are not used elsewhere. Imagine, for example, a piece of code that needs to call preg_replace_callback(). Currently, there are three ways to achieve this: | ||
- | |||
- | - Define the callback function elsewhere. This distributes code that belongs together throughout the file and decreases readability. | ||
- | |||
- | - Define the callback function in-place (but with a name). In that case one has to use function_exists() to make sure the function is only defined once. Here, the additional if() around the function definition makes the source code difficult to read. Example code: | ||
- | |||
- | <code php> | ||
- | | ||
- | if (!function_exists (' | ||
- | | ||
- | | ||
- | } | ||
- | } | ||
- | | ||
- | } | ||
- | </ | ||
- | |||
- | - Use the present create_function() in order to create a function at runtime. This approach has several disadvantages: | ||
- | |||
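For comparison with the three approaches above, here is a minimal sketch of the same kind of task written with an inline lambda function, using the syntax as it behaves in released PHP versions (5.3 and later). The function name double_numbers() and the pattern are assumptions chosen for illustration; they are not taken from the patch.

```php
<?php
// Hypothetical helper: doubles every integer found in a string.
// The callback is defined in place, so no named helper function,
// function_exists() guard or create_function() string is needed.
function double_numbers($text)
{
    return preg_replace_callback(
        '/\d+/',
        function ($matches) {
            // $matches[0] holds the full text of the current match
            return (string) ($matches[0] * 2);
        },
        $text
    );
}

echo double_numbers('3 apples and 10 pears'), "\n";
```

The callback lives next to the only call that uses it, which is the readability gain argued for above.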
- | ==== Closures ==== | ||
- | |||
- | Closures provide a very useful tool in order to make lambda functions even more useful. Just imagine you want to replace ' | ||
- | |||
- | - Use create_function(). But then you may only pass literal values (strings, integers, floats) into the function, objects at best as clones (if var_export() allows for it) and resources not at all. And you have to worry about escaping everything correctly. Especially when handling user input this can lead to all sorts of security issues. | ||
- | |||
- | - Write a function that uses global variables. This is ugly, non-reentrant and bad style. | ||
- | |||
- | - Create an entire class, instantiate it and pass the member function as a callback. This is perhaps the cleanest solution for this problem with current PHP but just think about it: Creating an entire class for this extremely simple purpose and nothing else seems overkill. | ||
- | |||
- | - Don't use array_map() but simply do it manually (foreach). In this simple case it may not be that much of an issue (because one simply wants to iterate over an array) but there are cases where doing something manually that a function with a callback as parameter does for you is quite tedious. | ||
- | |||
- | Note: str_replace also accepts arrays as a third parameter so this example may be a bit useless. But imagine you want to do a more complex operation than simple search and replace. | ||
- | |||
- | ===== Common misconceptions ===== | ||
- | |||
- | ? | ||
- | |||
- | ===== Proposal and Patch ===== | ||
- | |||
- | The following proposal and patch implement compile-time lambda functions and closures for PHP while keeping the patch as simple as possible. | ||
- | |||
- | ==== Userland perspective ==== | ||
- | |||
- | === Lambda function syntax === | ||
- | |||
- | The patch adds the following syntax as a valid expression: | ||
- | |||
- | < | ||
- | | ||
- | </ | ||
- | |||
- | (The & is optional and indicates - just as with normal functions - that the anonymous function returns a reference instead of a value) | ||
- | |||
- | Example usage: | ||
- | |||
- | < | ||
- | | ||
- | </ | ||
- | |||
- | The variable $lambda then contains a callable object that may be called through different means: | ||
- | |||
- | < | ||
- | | ||
- | | ||
- | | ||
- | </ | ||
- | |||
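As a rough sketch of the calling conventions mentioned above, the following shows a lambda function being invoked directly and through the existing callback functions call_user_func() and call_user_func_array(). The lambda body and the variable names are assumptions made for illustration.

```php
<?php
// A lambda function that takes no arguments and returns a string.
$lambda = function () {
    return 'Hello World!';
};

// 1. Direct invocation through the variable
$direct = $lambda();

// 2. Through call_user_func()
$viaCallUserFunc = call_user_func($lambda);

// 3. Through call_user_func_array() with an empty argument list
$viaCallUserFuncArray = call_user_func_array($lambda, array());

echo $direct, "\n";
```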
- | This allows for simple lambda functions, for example: | ||
- | |||
- | < | ||
- | | ||
- | | ||
- | | ||
- | }; | ||
- | | ||
- | } | ||
- | </ | ||
- | |||
- | === Closure support === | ||
- | |||
- | Since there was some discussion on the PHP internals list about what the best syntax for importing parent-scoped variables into closures would be, several patches are proposed, mainly to demonstrate that the basic code can remain exactly the same while only the grammar of PHP has to change to support a different syntax. | ||
- | |||
- | == Closure support via '' | ||
- | |||
- | The patch implements closures by defining an additional keyword ' | ||
- | |||
- | < | ||
- | | ||
- | $map = function ($text) { | ||
- | | ||
- | if (strpos ($text, $search) > 50) { | ||
- | | ||
- | } else { | ||
- | | ||
- | } | ||
- | }; | ||
- | | ||
- | } | ||
- | </ | ||
- | |||
- | The variables $search and $replacement are variables in the scope of the function replace_in_array() and the lexical keyword imports these variables into the scope of the closure. | ||
- | |||
- | The current patch imports variables by value and allows importing them by reference if an & is supplied before the variable name. The referencing behaviour is still subject to discussion. | ||
- | |||
- | == Closures support via '' | ||
- | |||
- | Another patch implements closures by using the ' | ||
- | |||
- | < | ||
- | | ||
- | $map = function ($text) use ($search, $replacement) { | ||
- | if (strpos ($text, $search) > 50) { | ||
- | | ||
- | } else { | ||
- | | ||
- | } | ||
- | }; | ||
- | | ||
- | } | ||
- | </ | ||
- | |||
- | The variables $search and $replacement are variables in the scope of the function replace_in_array() and they are imported into the scope of the closure upon creation of the closure. | ||
- | |||
- | The current patch imports variables by value and allows importing them by reference if an & is supplied before the variable name. The referencing behaviour is still subject to discussion. | ||
- | |||
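The difference between importing by value and importing by reference can be sketched as follows, assuming the 'use' variant as it behaves in released PHP versions. The variable names are hypothetical.

```php
<?php
$count = 0;

// Imported by value: the closure keeps the value $count had
// at the moment the closure was created.
$byValue = function () use ($count) {
    return $count;
};

// Imported by reference: the closure shares the variable with
// the parent scope and sees later changes.
$byReference = function () use (&$count) {
    return $count;
};

$count = 5;

$seenByValue = $byValue();
$seenByReference = $byReference();

var_dump($seenByValue, $seenByReference);
```

Here $seenByValue is still 0, while $seenByReference is 5.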
- | === Interaction with OOP === | ||
- | |||
- | If a closure is defined inside an object, the closure has full access to the current object through $this (without the need to import it explicitly) and all private and protected methods of that class. This also applies to nested closures. Example: | ||
- | |||
- | < | ||
- | class Example { | ||
- | | ||
- | |||
- | | ||
- | | ||
- | } | ||
- | |||
- | | ||
- | | ||
- | } | ||
- | |||
- | | ||
- | | ||
- | // or: lexical $replacement; | ||
- | | ||
- | }; | ||
- | } | ||
- | } | ||
- | |||
- | | ||
- | | ||
- | echo $replacer (' | ||
- | | ||
- | echo $replacer (' | ||
- | </ | ||
- | |||
- | As one can see, defining a closure inside a class method does not change the semantics at all - it simply does not matter if a closure is defined in global scope, within a function or within a class method. The only small difference is that closures defined in class methods may also access the class and the current object via $this. | ||
- | |||
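A small sketch of this behaviour follows. The class Greeter and its members are hypothetical names, not part of the patch. In released PHP versions, access to $this inside closures works as shown from PHP 5.4 on.

```php
<?php
class Greeter
{
    private $greeting = 'Hello';

    // Private method, reachable from the closure below
    private function decorate($name)
    {
        return $this->greeting . ', ' . $name . '!';
    }

    public function getGreeter()
    {
        // $this is available without an explicit import
        return function ($name) {
            return $this->decorate($name);
        };
    }
}

$greeter = (new Greeter())->getGreeter();
$message = $greeter('World');

echo $message, "\n";
```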
- | === Closure lifetime === | ||
- | |||
- | Closures may live longer than the methods that declared them. It is perfectly possible to have something like this: | ||
- | |||
- | < | ||
- | | ||
- | | ||
- | // or: lexical $x; | ||
- | | ||
- | }; | ||
- | } | ||
- | </ | ||
- | |||
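To illustrate the lifetime rule, here is a minimal sketch (with the hypothetical function name make_adder()) in which the closure is still usable after the function that created it has returned, again assuming the 'use' variant:

```php
<?php
// Returns a closure that remembers $x after make_adder() has returned.
function make_adder($x)
{
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$addFive = make_adder(5);
$addTen = make_adder(10);

$firstResult = $addFive(10);
$secondResult = $addTen(10);

var_dump($firstResult, $secondResult);
```

Each call to make_adder() produces an independent closure with its own copy of $x.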
- | ==== Zend internal perspective ==== | ||
- | |||
- | The patch basically changes the following in the Zend engine: | ||
- | |||
- | (Quite a lot here is based on an updated patch provided by Dmitry Stogov.) | ||
- | |||
- | When the compiler reaches a lambda function, except for details in the grammar, a new function zend_do_begin_lambda_function_declaration is called - which itself calls zend_do_begin_function_declaration with " | ||
- | |||
- | Lexical variables are done via static variables: For each lexical variable an entry in the static variables hash table is added with the special semantics for the default value being either an empty string or a string containing a single 0-char. But instead of a normal string zval, the type is forced to be IS_CONSTANT. An empty constant or a constant containing only a 0-char will never be added by normal PHP so with this it is possible to distinguish them from normal constants as values. The len 0 strings indicate that the variable should be referenced from parent scope and the len 1 strings indicate it should be copied. | ||
- | |||
- | An additional internal class " | ||
- | |||
- | The ZEND_DECLARE_LAMBDA_FUNCTION opcode looks up the function in the function table (it still has its runtime function key the compiler gave it and is thus cacheable by any opcode cache), creates a new object of the Closure type and stores a copy of the op_array inside. It correctly sets the scope of the copied op_array to be the current class scope and makes sure all lexical variables are imported from the parent scope into the copied hash table of the new op_array. It also creates a reference to the current $this object. It returns the newly created object. | ||
- | |||
- | Some hooks were added to the opcode handlers, zend_call_function and zend_is_callable_ex that allow the ' | ||
- | |||
- | Note that closures that don't use $this may keep the object alive far longer than necessary. Therefore an additional patch variant is provided in which the compiler remembers for each lambda function whether $this is used. For that, an additional op_array-> | ||
- | |||
- | ==== The patch ==== | ||
- | |||
- | There are six variants of the patch: | ||
- | |||
- | * [[http:// | ||
- | * [[http:// | ||
- | * [[http:// | ||
- | * (The combination lexical + $this always stored is of course also possible) | ||
- | * [[http:// | ||
- | * [[http:// | ||
- | * [[http:// | ||
- | |||
- | All patches contain a series of additional tests that ensure the correct functioning of closures. | ||
- | |||
- | Older patches for completeness: | ||
- | |||
- | * [[http:// | ||
- | * [[http:// | ||
- | |||
- | **Note** The patch does not contain the diff for '' | ||
- | |||
- | ==== BC breaks ==== | ||
- | |||
- | * (lexical variant) Introduction of a new keyword ' | ||
- | * **No** tests are broken by the patch. | ||
- | |||
- | ==== Caveats / possible WTFs ==== | ||
- | |||
- | === Trailing '';'' | ||
- | |||
- | On writing '' | ||
- | |||
- | === References === | ||
- | |||
- | References seem to be a major WTF w.r.t. closures. The problem is that unlike other languages, PHP's local scope is always that of the current function, an " | ||
- | |||
- | The previous proposals had the notion of always creating references, so the following would be a WTF: | ||
- | |||
- | < | ||
- | for ($i = 0; $i < 10; $i++) { | ||
- | | ||
- | } | ||
- | </ | ||
- | |||
- | This would not have worked as expected since $i is a reference and thus all created closures would reference the same variable. In order to get this right one would have had to do: | ||
- | |||
- | < | ||
- | for ($i = 0; $i < 10; $i++) { | ||
- | | ||
- | | ||
- | unset ($loopIndex); | ||
- | } | ||
- | </ | ||
- | |||
- | This can be a WTF for people that don't expect lexical to create an actual reference. On the other hand, global and static both DO create references so that behaviour is consistent with current PHP **and** (as pointed out on the mailing list) other languages such as JavaScript also behave the same way, so we really should stay consistent. | ||
- | |||
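The loop problem described above can be reproduced with a short sketch that creates closures both ways inside the same loop. It assumes the 'use' variant with the optional & as it behaves in released PHP versions. The array names are hypothetical.

```php
<?php
$byReference = array();
$byValue = array();

for ($i = 0; $i < 3; $i++) {
    // All of these closures share the single loop variable $i
    $byReference[] = function () use (&$i) {
        return $i;
    };
    // Each of these closures gets its own copy of the current $i
    $byValue[] = function () use ($i) {
        return $i;
    };
}

// Call every stored closure once, after the loop has finished
$call = function ($closure) {
    return $closure();
};
$refResults = array_map($call, $byReference);
$valResults = array_map($call, $byValue);

var_dump($refResults, $valResults);
```

The by-reference closures all report the final value of $i (3), while the by-value closures report 0, 1 and 2.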
- | == Current proposal == | ||
- | |||
- | For the sake of discussion, all the current patches allow the programmer to choose whether variables are to be referenced or copied. The current patches default to copying them and allow the use of an & to reference variables. **The only reason this is done** is to demonstrate that both are easily possible and that it is **only** the grammar that needs to be changed in order to change the behaviour, not the code itself. | ||
- | |||
- | The discussion on the mailing list will have to come to a conclusion on which behaviour would be the best for PHP. | ||
- | |||
- | === '' | ||
- | |||
- | The fact that ' | ||
- | |||
- | ===== Changelog ===== | ||
- | |||
- | * 2008-06-26 Christian Seiler: Revised patch, using objects instead of resources, added tests | ||
- | * 2008-06-18 Christian Seiler: OOP clarifications | ||
- | * 2008-06-17 Christian Seiler: Updated patch | ||
- | * 2008-06-17 Christian Seiler: Clarified interaction with OOP | ||
- | * 2008-06-16 Christian Seiler: Small changes | ||
- | * 2008-06-16 Christian Seiler: Initial creation | ||
rfc/closures.txt · Last modified: 2017/09/22 13:28 (external edit)