rfc:abstract_syntax_tree

Differences

This shows you the differences between two versions of the page.

rfc:abstract_syntax_tree [2014/08/18 10:10] nikic   rfc:abstract_syntax_tree [2017/09/22 13:28] (current) – external edit 127.0.0.1
Line 2: Line 2:
   * Date: 2014-07-28   * Date: 2014-07-28
   * Author: Nikita Popov <nikic@php.net>   * Author: Nikita Popov <nikic@php.net>
-  * Status: Draft +  * Status: Implemented (in PHP 7) 
-  * Targeting: PHP.next +  * Discussion: http://markmail.org/message/br4ixewsnqitrx3n
  
 ===== Introduction ===== ===== Introduction =====
Line 95: Line 95:
  
 ==== Changes to list() ==== ==== Changes to list() ====
 +
 +> **Note**: The behavior of ''list($a, $b) = $a'' described below no longer applies. After this RFC was accepted, ''list()'' assignments that contain the same variable on both the left- and right-hand side were special-cased so that the right-hand side is always evaluated first. This means that ''list($a, $b) = $a'' continues to work as expected.
  
 ''list()'' currently assigns variables right-to-left; the AST implementation will assign them left-to-right instead: ''list()'' currently assigns variables right-to-left; the AST implementation will assign them left-to-right instead:
Line 147: Line 149:
  
 ==== Auto-vivification order for by-reference assignments ==== ==== Auto-vivification order for by-reference assignments ====
 +
 +> **Note**: The auto-vivification order for reference assignments was restored to the old behavior in PHP 7.1. The reason is that the new order led to memory safety issues that were hard to avoid.
  
 While by-reference assignments are (CVs notwithstanding) evaluated left-to-right, auto-vivification currently occurs right-to-left. In the AST implementation this will happen left-to-right instead: While by-reference assignments are (CVs notwithstanding) evaluated left-to-right, auto-vivification currently occurs right-to-left. In the AST implementation this will happen left-to-right instead:
Line 178: Line 182:
  
 Doing calls like ''%%$obj->__clone()%%'' is now allowed. This was the only magic method that had a compile-time check preventing some calls to it, which doesn't make sense. If we allow all other magic methods to be called, there's no reason to forbid this one. Doing calls like ''%%$obj->__clone()%%'' is now allowed. This was the only magic method that had a compile-time check preventing some calls to it, which doesn't make sense. If we allow all other magic methods to be called, there's no reason to forbid this one.
- 
-===== Additional possibilities (not implemented) ===== 
- 
-The generated AST can be exposed to userland via an extension, for use by static analysers. This should be relatively easy to implement and we might even want to provide this as a bundled extension (like ext/tokenizer). 
- 
-More interestingly, we could allow extensions to hook into the compilation process (the current AST implementation does not provide hooks, but they can be added if we want them). This would allow extensions to implement some types of "language features". 
- 
-As an example, this is roughly what an implementation of the [[rfc:ifsetor|ifsetor RFC]] //could// look like using an AST hook: 
- 
-<code c> 
-/* Works by rewriting ifsetor($foo, 'bar') to isset($foo) ? $foo : 'bar' */ 
-void ext_ifsetor_hook(zend_ast **ast_ptr TSRMLS_DC) { 
-    zend_ast *ast = *ast_ptr; 
-     
-    if (ast->kind == ZEND_AST_CALL && ast->child[0]->kind == ZEND_AST_ZVAL) { 
-        zend_string *name = zval_get_string(zend_ast_get_zval(ast->child[0])); 
-        zend_ast_list *args = zend_ast_get_list(ast->child[1]); 
-         
-        if (zend_str_equals_literal_ci(name, "ifsetor") 
-            && args->children == 2 && !zend_args_contain_unpack(args) 
-        ) { 
-            if (!zend_is_variable(args->child[0])) { 
-                zend_error_noreturn(E_COMPILE_ERROR, "First argument of ifsetor " 
-                    "must be a variable"); 
-            } 
-             
-            /* Note: One would need a function for adding refs to args->child[0] here, 
-             * as it is used two times - as written here it won't work correctly. */ 
-            *ast_ptr = zend_ast_create(ZEND_AST_CONDITIONAL, 
-                zend_ast_create(ZEND_AST_ISSET, args->child[0]), 
-                args->child[0], 
-                args->child[1] 
-            ); 
-        } 
-         
-        STR_RELEASE(name); 
-    } 
-} 
-</code> 
- 
-I don't know how useful this is and how many things can be implemented in this way, but I think it's worth considering. 
- 
-An additional possibility is to drop the keywords for ''isset'' and ''empty'' and just compile them as special function calls (using similar checks as the code above). Maybe other keywords can be dropped as well. 
- 
-===== Patch ===== 
- 
-The AST implementation can be found at https://github.com/nikic/php-src/tree/ast. Some quick links to the most important files: 
- 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_language_parser.y|zend_language_parser.y]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.h|zend_ast.h]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.c|zend_ast.c]]'' 
-  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_compile.c|zend_compile.c]]'' (The code relevant to the AST begins somewhere around line 3200, everything before that is largely untouched.) 
- 
-The branch is based on phpng and already includes the Uniform Variable Syntax RFC, as it was a necessary prerequisite for the implementation. 
- 
-The implementation has everything ported, but probably still has some bugs and needs some cleanup :) 
  
 ===== Implementation ===== ===== Implementation =====
Line 541: Line 489:
  
 The ''zend_begin_loop'' and ''zend_end_loop'' functions store information for break/continue and try/catch. The ''zend_begin_loop'' and ''zend_end_loop'' functions store information for break/continue and try/catch.
 +
 +===== Additional possibilities (not implemented) =====
 +
 +The generated AST can be exposed to userland via an extension, for use by static analysers. This should be relatively easy to implement and we might even want to provide this as a bundled extension (like ext/tokenizer).
 +
 +More interestingly, we could allow extensions to hook into the compilation process (the current AST implementation does not provide hooks, but they can be added if we want them). This would allow extensions to implement some types of "language features".
 +
 +As an example, this is roughly what an implementation of the [[rfc:ifsetor|ifsetor RFC]] //could// look like using an AST hook:
 +
 +<code c>
 +/* Works by rewriting ifsetor($foo, 'bar') to isset($foo) ? $foo : 'bar' */
 +void ext_ifsetor_hook(zend_ast **ast_ptr TSRMLS_DC) {
 +    zend_ast *ast = *ast_ptr;
 +    
 +    if (ast->kind == ZEND_AST_CALL && ast->child[0]->kind == ZEND_AST_ZVAL) {
 +        zend_string *name = zval_get_string(zend_ast_get_zval(ast->child[0]));
 +        zend_ast_list *args = zend_ast_get_list(ast->child[1]);
 +        
 +        if (zend_str_equals_literal_ci(name, "ifsetor")
 +            && args->children == 2 && !zend_args_contain_unpack(args)
 +        ) {
 +            if (!zend_is_variable(args->child[0])) {
 +                zend_error_noreturn(E_COMPILE_ERROR, "First argument of ifsetor "
 +                    "must be a variable");
 +            }
 +            
 +            /* Note: One would need a function for adding refs to args->child[0] here,
 +             * as it is used two times - as written here it won't work correctly. */
 +            *ast_ptr = zend_ast_create(ZEND_AST_CONDITIONAL,
 +                zend_ast_create(ZEND_AST_ISSET, args->child[0]),
 +                args->child[0],
 +                args->child[1]
 +            );
 +        }
 +        
 +        STR_RELEASE(name);
 +    }
 +}
 +</code>
 +
 +I don't know how useful this is and how many things can be implemented in this way, but I think it's worth considering.
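
The comment in the hook above points out that ''args->child[0]'' ends up under two parents, so some form of reference counting (or copying) is needed. The following is a minimal, self-contained sketch of that idea on a toy AST. It does not use the Zend API: every ''toy_*'' name, the ''refcount'' field and the node layout are assumptions made for illustration only, and the callee name is compared with plain ''strcmp'' rather than case-insensitively.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy node kinds (hypothetical; these are NOT the Zend AST kinds). */
typedef enum { TOY_VAR, TOY_STR, TOY_CALL, TOY_ISSET, TOY_COND } toy_kind;

typedef struct toy_ast {
    toy_kind kind;
    int refcount;             /* number of owners (parents) of this node */
    const char *name;         /* variable name, string value or callee name */
    int children;
    struct toy_ast *child[3];
} toy_ast;

static int toy_live_nodes = 0; /* allocation counter, used to detect leaks */

static toy_ast *toy_create(toy_kind kind, const char *name, int children,
                           toy_ast *c0, toy_ast *c1, toy_ast *c2) {
    toy_ast *n = calloc(1, sizeof(*n));
    if (!n) abort();
    n->kind = kind; n->refcount = 1; n->name = name; n->children = children;
    n->child[0] = c0; n->child[1] = c1; n->child[2] = c2;
    toy_live_nodes++;
    return n;
}

/* Register one more owner for a node that is about to be shared. */
static toy_ast *toy_addref(toy_ast *n) { n->refcount++; return n; }

/* Drop one owner; free the node and release its children at zero. */
static void toy_release(toy_ast *n) {
    if (!n) return;
    assert(n->refcount > 0);
    if (--n->refcount > 0) return;
    for (int i = 0; i < n->children; i++) toy_release(n->child[i]);
    free(n);
    toy_live_nodes--;
}

/* Rewrites ifsetor($var, default) to isset($var) ? $var : default.
 * Returns 1 if the node was rewritten, 0 otherwise. */
static int toy_ifsetor_hook(toy_ast **ast_ptr) {
    toy_ast *ast = *ast_ptr;
    if (ast->kind != TOY_CALL || strcmp(ast->name, "ifsetor") != 0
        || ast->children != 2 || ast->child[0]->kind != TOY_VAR) {
        return 0;
    }
    toy_ast *var = ast->child[0], *def = ast->child[1];
    /* var gains two new owners (isset and the conditional), def gains one. */
    toy_ast *isset = toy_create(TOY_ISSET, NULL, 1, toy_addref(var), NULL, NULL);
    *ast_ptr = toy_create(TOY_COND, NULL, 3, isset, toy_addref(var), toy_addref(def));
    toy_release(ast); /* the call node gives up its own references */
    return 1;
}

static void toy_append(char *out, size_t size, const char *s) {
    size_t len = strlen(out);
    if (len < size) snprintf(out + len, size - len, "%s", s);
}

/* Appends a PHP-like rendering of the tree to the NUL-terminated buffer. */
static void toy_dump(const toy_ast *n, char *out, size_t size) {
    switch (n->kind) {
    case TOY_VAR: toy_append(out, size, "$"); toy_append(out, size, n->name); break;
    case TOY_STR:
        toy_append(out, size, "'"); toy_append(out, size, n->name);
        toy_append(out, size, "'");
        break;
    case TOY_ISSET:
        toy_append(out, size, "isset("); toy_dump(n->child[0], out, size);
        toy_append(out, size, ")");
        break;
    case TOY_COND:
        toy_dump(n->child[0], out, size); toy_append(out, size, " ? ");
        toy_dump(n->child[1], out, size); toy_append(out, size, " : ");
        toy_dump(n->child[2], out, size);
        break;
    case TOY_CALL:
        toy_append(out, size, n->name); toy_append(out, size, "(");
        for (int i = 0; i < n->children; i++) {
            if (i > 0) toy_append(out, size, ", ");
            toy_dump(n->child[i], out, size);
        }
        toy_append(out, size, ")");
        break;
    }
}

/* Builds ifsetor($foo, 'bar'), rewrites it, renders it into out and frees
 * the tree. Returns the number of nodes still allocated (0 means no leak). */
static int toy_demo(char *out, size_t size) {
    toy_ast *ast = toy_create(TOY_CALL, "ifsetor", 2,
        toy_create(TOY_VAR, "foo", 0, NULL, NULL, NULL),
        toy_create(TOY_STR, "bar", 0, NULL, NULL, NULL), NULL);
    toy_ifsetor_hook(&ast);
    toy_dump(ast, out, size); /* isset($foo) ? $foo : 'bar' */
    toy_release(ast);
    return toy_live_nodes;
}
```

Calling ''toy_demo(buf, sizeof buf)'' with an empty, NUL-terminated ''buf'' fills it with ''isset($foo) ? $foo : 'bar''' and returns 0, meaning every node was freed exactly once. Without the ''toy_addref'' calls, releasing the rewritten tree would free the shared variable node twice, which is the problem the comment in the original hook alludes to.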
 +
 +An additional possibility is to drop the keywords for ''isset'' and ''empty'' and just compile them as special function calls (using similar checks as the code above). Maybe other keywords can be dropped as well.
 +
 +===== Patch =====
 +
 +The AST implementation can be found at https://github.com/nikic/php-src/tree/ast. Some quick links to the most important files:
 +
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_language_parser.y|zend_language_parser.y]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.h|zend_ast.h]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_ast.c|zend_ast.c]]''
 +  * ''[[https://github.com/nikic/php-src/blob/ast/Zend/zend_compile.c|zend_compile.c]]'' (The code relevant to the AST begins somewhere around line 3200, everything before that is largely untouched.)
 +
 +//The branch already includes the Uniform Variable Syntax RFC//, as it was a necessary prerequisite for the implementation.
 +
 +The implementation has everything ported, but probably still has some bugs and needs some cleanup :)
 +
 +===== Vote =====
 +The vote started on 2014-08-18 and ended on 2014-08-25. The necessary 2/3 majority was reached; as such, the RFC is accepted.
 +
 +<doodle title="Use AST implementation in PHP 7?" auth="nikic" voteType="single" closed="true">
 +   * Yes
 +   * No
 +</doodle>