rfc:abstract_syntax_tree

Differences

This shows you the differences between two versions of the page.

rfc:abstract_syntax_tree [2014/08/18 16:34] – Start vote nikic
rfc:abstract_syntax_tree [2017/09/22 13:28] (current) – external edit 127.0.0.1
Line 2: Line 2:
   * Date: 2014-07-28
   * Author: Nikita Popov <nikic@php.net>
-  * Status: In Voting
+  * Status: Implemented (in PHP 7)
-  * Targeting: PHP 7
   * Discussion: http://markmail.org/message/br4ixewsnqitrx3n
  
Line 96: Line 95:
  
 ==== Changes to list() ====
 +
 +> **Note**: The behavior of ''list($a, $b) = $a'' described below no longer applies. After this RFC was accepted, ''list()'' assignments that contain the same variable on the left- and right-hand side were special-cased to ensure that the right-hand side is always evaluated first. This means that ''list($a, $b) = $a'' continues to work as expected.
  
 ''list()'' currently assigns variables right-to-left, the AST implementation will assign them left-to-right instead:
Line 148: Line 149:
  
 ==== Auto-vivification order for by-reference assignments ====
 +
 +> **Note**: The auto-vivification order for reference assignments has been restored to the old behavior in PHP 7.1. The reason is that the new order led to memory safety issues that were hard to avoid.
  
 While by-reference assignments are (CVs notwithstanding) evaluated left-to-right, auto-vivification currently occurs right-to-left. In the AST implementation this will happen left-to-right instead:
Line 179: Line 182:
  
 Doing calls like ''%%$obj->__clone()%%'' is now allowed. This was the only magic method that had a compile-time check preventing some calls to it, which doesn't make sense. If we allow all other magic methods to be called, there's no reason to forbid this one.
- 
-===== Additional possibilities (not implemented) ===== 
- 
-The generated AST can be exposed to userland via an extension, for use by static analysers. This should be relatively easy to implement and we might even want to provide this as a bundled extension (like ext/tokenizer). 
- 
-More interestingly, we could allow extensions to hook into the compilation process (the current AST implementation does not provide hooks, but they can be added if we want them). This would allow extensions to implement some types of "language features". 
- 
-As an example, this is roughly what an implementation of the [[rfc:ifsetor|ifsetor RFC]] //could// look like using an AST hook: 
- 
-<code c> 
-/* Works by rewriting ifsetor($foo, 'bar') to isset($foo) ? $foo : 'bar' */ 
-void ext_ifsetor_hook(zend_ast **ast_ptr TSRMLS_DC) { 
-    zend_ast *ast = *ast_ptr; 
-     
-    if (ast->kind == ZEND_AST_CALL && ast->child[0]->kind == ZEND_AST_ZVAL) { 
-        zend_string *name = zval_get_string(zend_ast_get_zval(ast->child[0])); 
-        zend_ast_list *args = zend_ast_get_list(ast->child[1]); 
-         
-        if (zend_str_equals_literal_ci(name, "ifsetor") 
-            && args->children == 2 && !zend_args_contain_unpack(args) 
-        ) { 
-            if (!zend_is_variable(args->child[0])) { 
-                zend_error_noreturn(E_COMPILE_ERROR, "First argument of ifsetor " 
-                    "must be a variable"); 
-            } 
-             
-            /* Note: One would need a function for adding refs to args->child[0] here, 
-             * as it is used two times - as written here it won't work correctly. */ 
-            *ast_ptr = zend_ast_create(ZEND_AST_CONDITIONAL, 
-                zend_ast_create(ZEND_AST_ISSET, args->child[0]), 
-                args->child[0], 
-                args->child[1] 
-            ); 
-        } 
-         
-        STR_RELEASE(name); 
-    } 
-} 
-</code> 
- 
-I don't know how useful this is and how many things can be implemented in this way, but I think it's worth considering. 
- 
-An additional possibility is to drop the keywords for ''isset'' and ''empty'' and just compile them as special function calls (using similar checks as the code above). Maybe other keywords can be dropped as well. 
- 
-===== Patch ===== 
- 
-The AST implementation can be found at https://github.com/nikic/php-src/tree/ast. Some quick links to the most important files: 
- 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_language_parser.y|zend_language_parser.y]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.h|zend_ast.h]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.c|zend_ast.c]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_compile.c|zend_compile.c]]'' (The code relevant to the AST begins somewhere around line 3200, everything before that is largely untouched.) 
- 
-*The branch already includes the Uniform Variable Syntax RFC*, as it was a necessary prerequisite for the implementation. 
- 
-The implementation has everything ported, but probably still has some bugs and needs some cleanup :) 
- 
-===== Vote ===== 
- 
-This RFC requires a 2/3 majority for acceptance. The vote started on 2014-08-18 and ends on 2014-08-25. 
- 
-<doodle title="Use AST implementation in PHP 7?" auth="nikic" voteType="single" closed="false"> 
-   * Yes 
-   * No 
-</doodle> 
  
 ===== Implementation =====
Line 551: Line 489:
  
 The ''zend_begin_loop'' and ''zend_end_loop'' functions store information for break/continue and try/catch.
 +
 +===== Additional possibilities (not implemented) =====
 +
 +The generated AST can be exposed to userland via an extension, for use by static analysers. This should be relatively easy to implement and we might even want to provide this as a bundled extension (like ext/tokenizer).
 +
 +More interestingly, we could allow extensions to hook into the compilation process (the current AST implementation does not provide hooks, but they can be added if we want them). This would allow extensions to implement some types of "language features".
 +
 +As an example, this is roughly what an implementation of the [[rfc:ifsetor|ifsetor RFC]] //could// look like using an AST hook:
 +
 +<code c>
 +/* Works by rewriting ifsetor($foo, 'bar') to isset($foo) ? $foo : 'bar' */
 +void ext_ifsetor_hook(zend_ast **ast_ptr TSRMLS_DC) {
 +    zend_ast *ast = *ast_ptr;
 +    
 +    if (ast->kind == ZEND_AST_CALL && ast->child[0]->kind == ZEND_AST_ZVAL) {
 +        zend_string *name = zval_get_string(zend_ast_get_zval(ast->child[0]));
 +        zend_ast_list *args = zend_ast_get_list(ast->child[1]);
 +        
 +        if (zend_str_equals_literal_ci(name, "ifsetor")
 +            && args->children == 2 && !zend_args_contain_unpack(args)
 +        ) {
 +            if (!zend_is_variable(args->child[0])) {
 +                zend_error_noreturn(E_COMPILE_ERROR, "First argument of ifsetor "
 +                    "must be a variable");
 +            }
 +            
 +            /* Note: One would need a function for adding refs to args->child[0] here,
 +             * as it is used two times - as written here it won't work correctly. */
 +            *ast_ptr = zend_ast_create(ZEND_AST_CONDITIONAL,
 +                zend_ast_create(ZEND_AST_ISSET, args->child[0]),
 +                args->child[0],
 +                args->child[1]
 +            );
 +        }
 +        
 +        STR_RELEASE(name);
 +    }
 +}
 +</code>
 +
 +I don't know how useful this is and how many things can be implemented in this way, but I think it's worth considering.
 +
 +An additional possibility is to drop the keywords for ''isset'' and ''empty'' and just compile them as special function calls (using similar checks as the code above). Maybe other keywords can be dropped as well.
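
The comment in the hook sketch above points out that ''args->child[0]'' ends up in two places in the rewritten tree, which would need extra handling. The following is a minimal, self-contained sketch of one way to deal with that: the shared subtree is deep-copied before the second use. It uses a toy AST, not the Zend API; ''node'', ''node_new'', ''node_copy'', ''node_free'', ''rewrite_ifsetor'' and ''render'' are all hypothetical names made up for this illustration, and a real implementation might well prefer reference counting over copying.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy node kinds for illustration only; these are not Zend's zend_ast kinds. */
typedef enum { N_VAR, N_STR, N_CALL, N_ISSET, N_COND } node_kind;

typedef struct node {
    node_kind kind;
    char *name;            /* variable, string value or function name; else NULL */
    struct node *child[3]; /* unused slots are NULL */
} node;

node *node_new(node_kind kind, const char *name, node *a, node *b, node *c) {
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->kind = kind;
    if (name) {
        n->name = malloc(strlen(name) + 1);
        assert(n->name != NULL);
        strcpy(n->name, name);
    }
    n->child[0] = a;
    n->child[1] = b;
    n->child[2] = c;
    return n;
}

/* Deep copy, so that one subtree can safely appear twice in the result. */
node *node_copy(const node *n) {
    if (!n) return NULL;
    return node_new(n->kind, n->name, node_copy(n->child[0]),
                    node_copy(n->child[1]), node_copy(n->child[2]));
}

void node_free(node *n) {
    if (!n) return;
    for (int i = 0; i < 3; i++) node_free(n->child[i]);
    free(n->name);
    free(n);
}

/* Rewrites ifsetor($var, default) into isset($var) ? $var : default.
 * Returns 1 if *np was rewritten, 0 if it was left untouched. */
int rewrite_ifsetor(node **np) {
    node *call = *np;
    if (call->kind != N_CALL || !call->name || strcmp(call->name, "ifsetor") != 0) return 0;
    if (!call->child[0] || !call->child[1] || call->child[2]) return 0;
    if (call->child[0]->kind != N_VAR) return 0;

    node *var = call->child[0], *def = call->child[1];
    /* The variable is used twice, so isset() gets its own copy. */
    node *test = node_new(N_ISSET, NULL, node_copy(var), NULL, NULL);
    *np = node_new(N_COND, NULL, test, var, def);

    /* Detach the reused children before freeing the old call node. */
    call->child[0] = call->child[1] = NULL;
    node_free(call);
    return 1;
}

static void append(char *buf, size_t cap, const char *s) {
    size_t len = strlen(buf);
    snprintf(buf + len, cap - len, "%s", s);
}

/* Appends a PHP-like rendering of the tree to buf (must be NUL-terminated). */
void render(const node *n, char *buf, size_t cap) {
    switch (n->kind) {
    case N_VAR:
        append(buf, cap, "$"); append(buf, cap, n->name); break;
    case N_STR:
        append(buf, cap, "'"); append(buf, cap, n->name); append(buf, cap, "'"); break;
    case N_ISSET:
        append(buf, cap, "isset("); render(n->child[0], buf, cap); append(buf, cap, ")"); break;
    case N_COND:
        render(n->child[0], buf, cap); append(buf, cap, " ? ");
        render(n->child[1], buf, cap); append(buf, cap, " : ");
        render(n->child[2], buf, cap); break;
    case N_CALL:
        append(buf, cap, n->name); append(buf, cap, "(");
        for (int i = 0; i < 3 && n->child[i]; i++) {
            if (i > 0) append(buf, cap, ", ");
            render(n->child[i], buf, cap);
        }
        append(buf, cap, ")"); break;
    }
}
```

Building a call node for ''ifsetor($foo, 'bar')'' with ''node_new'', passing its address to ''rewrite_ifsetor'' and rendering the result into an empty buffer should yield ''isset($foo) ? $foo : 'bar'''. Because the ''isset'' operand is a separate copy, a single ''node_free'' on the new root releases every node exactly once.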
 +
 +===== Patch =====
 +
 +The AST implementation can be found at https://github.com/nikic/php-src/tree/ast. Some quick links to the most important files:
 +
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_language_parser.y|zend_language_parser.y]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.h|zend_ast.h]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.c|zend_ast.c]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_compile.c|zend_compile.c]]'' (The code relevant to the AST begins somewhere around line 3200, everything before that is largely untouched.)
 +
 +//The branch already includes the Uniform Variable Syntax RFC//, as it was a necessary prerequisite for the implementation.
 +
 +The implementation has everything ported, but probably still has some bugs and needs some cleanup :)
 +
 +===== Vote =====
 +The vote started on 2014-08-18 and ended on 2014-08-25. The necessary 2/3 majority was reached, so the RFC was accepted.
 +
 +<doodle title="Use AST implementation in PHP 7?" auth="nikic" voteType="single" closed="true">
 +   * Yes
 +   * No
 +</doodle>
rfc/abstract_syntax_tree.1408379640.txt.gz · Last modified: 2017/09/22 13:28 (external edit)