rfc:parser-extension-api
Differences
This shows you the differences between two versions of the page.
rfc:parser-extension-api [2015/02/23 06:36] – lisachenko | rfc:parser-extension-api [2025/04/03 13:08] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Parser Extension API ====== | ====== PHP RFC: Parser Extension API ====== | ||
- | * Version: 0.1 | + | * Version: 0.2 |
* Date: 2015-02-17 | * Date: 2015-02-17 | ||
- | * Author: Nikita Popov nikic@php.net, | + | * Author: Nikita Popov <nikic@php.net>, Alexander Lisachenko |
- | * Status: | + | * Status: |
* Related to: https:// | * Related to: https:// | ||
Line 9: | Line 9: | ||
===== Introduction ===== | ===== Introduction ===== | ||
This RFC proposes introducing a userland parser extension API that provides access to the low-level Abstract Syntax Tree (AST) parser. | This RFC proposes introducing a userland parser extension API that provides access to the low-level Abstract Syntax Tree (AST) parser. | ||
- | |||
- | The RFC consists of two parts: | ||
- | * a parsing API that provides an AST for a given string of source code | ||
- | * an extension API that allows registering custom PHP hooks from userland to modify the Abstract Syntax Tree before it is transformed into concrete opcodes. | ||
===== Parsing API proposal ===== | ===== Parsing API proposal ===== | ||
Line 45: | Line 41: | ||
/** | /** | ||
* Returns the text representation of current node | * Returns the text representation of current node | ||
+ | * Recursively applied to all children | ||
* | * | ||
* @return string | * @return string | ||
Line 68: | Line 65: | ||
The `flags` property contains node-specific flags. It is always defined, but for most nodes it is always zero. The node's isUsingFlags() method can be used to determine whether a node has a meaningful flags value. | The `flags` property contains node-specific flags. It is always defined, but for most nodes it is always zero. The node's isUsingFlags() method can be used to determine whether a node has a meaningful flags value. | ||
+ | The `value` property contains a value only for zval AST nodes. | ||
The `lineNumber` property specifies the starting line number of the node. | The `lineNumber` property specifies the starting line number of the node. | ||
The `children` property contains an array of child-nodes. | The `children` property contains an array of child-nodes. | ||
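
The properties described above can be pictured with a small stand-in class. This is only a sketch under stated assumptions: `SketchNode` and `countNodes()` are hypothetical names, not part of the proposed API or of the php-ast extension, and the string kind names are made up for illustration.

```php
<?php
// Hypothetical stand-in for the proposed Node class (illustration only).
class SketchNode
{
    public $kind;             // node kind; plain strings here, by assumption
    public $flags = 0;        // always defined, zero for most nodes
    public $value = null;     // only meaningful for zval nodes
    public $lineNumber;       // starting line of the node
    public $children = [];    // array of child nodes

    public function __construct($kind, $lineNumber, $value = null, array $children = [])
    {
        $this->kind = $kind;
        $this->lineNumber = $lineNumber;
        $this->value = $value;
        $this->children = $children;
    }
}

// Recursively counts the nodes of a tree by walking the `children` arrays.
function countNodes(SketchNode $node)
{
    $count = 1;
    foreach ($node->children as $child) {
        $count += countNodes($child);
    }
    return $count;
}

// Hand-built tree for the expression `$a + 1` on line 1.
$tree = new SketchNode('BINARY_OP', 1, null, [
    new SketchNode('VAR', 1, null, [new SketchNode('ZVAL', 1, 'a')]),
    new SketchNode('ZVAL', 1, 1),
]);
```

The same recursive walk over `children` is the basis for any tool that inspects the tree, whatever the final node representation turns out to be.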
Line 80: | Line 78: | ||
{ | { | ||
/** | /** | ||
- | * Parses given code and returns an AST for it | + | * Parses |
+ | * | ||
+ | * @param string $phpCode Source code to analyse | ||
* | * | ||
* @return Node | * @return Node | ||
Line 88: | Line 88: | ||
</ | </ | ||
- | The static Engine:: | + | The static Engine:: |
Here is an example of getting an AST for simple code: | Here is an example of getting an AST for simple code: | ||
Line 117: | Line 117: | ||
This information about AST can be used later for custom Parser Extensions, QA static analysis tools, source code rewriting tools and much more. | This information about AST can be used later for custom Parser Extensions, QA static analysis tools, source code rewriting tools and much more. | ||
- | I want to note that this part was originally implemented and described by Nikita Popov as an experimental php-ast extension https:// | + | I want to note that this part was originally implemented and described by Nikita Popov as an experimental php-ast extension https:// |
- | ==== Parser Extension API ==== | ||
- | |||
- | Second part of this RFC proposes to add an API for building userland parser extensions. We could allow userland extensions to hook into the compilation process. This would allow extensions to implement some types of language features, for example, Design-by-Contract verifying, Aspect-Oriented programming, | ||
- | |||
- | Userland parser extension is described by the extension interface with single `process` method that accepts only one argument - top-level AST node and can modify it. | ||
- | |||
- | <code php> | ||
- | <?php | ||
- | namespace Php\Parser; | ||
- | |||
- | interface ExtensionInterface { | ||
- | |||
- | /** | ||
- | * Receives a top-level node of AST and can transform it | ||
- | */ | ||
- | public static function process(Node $node); | ||
- | } | ||
- | </ | ||
- | |||
- | Each extension can be registered or unregistered in the parser engine class by calling appropriate methods: | ||
- | |||
- | <code php> | ||
- | <?php | ||
- | namespace Php\Parser; | ||
- | |||
- | class Engine { | ||
- | |||
- | /** | ||
- | * @var array|string[] List of parser extension classes | ||
- | */ | ||
- | private static $extensions; | ||
- | | ||
- | /** | ||
- | * Register an extension class in the parser | ||
- | * @param string $extensionClassName Name of the extension class | ||
- | */ | ||
- | public static function registerExtension($extensionClassName) | ||
- | | ||
- | /** | ||
- | * Unregister an extension class from the parser | ||
- | * @param string $extensionClassName Name of the extension class | ||
- | */ | ||
- | public static function unregisterExtension($extensionClassName) | ||
- | | ||
- | /** | ||
- | * Returns a list of currently registered extensions | ||
- | * @return string[]|array List of registered extensions | ||
- | */ | ||
- | public static function getRegisteredExtensions() | ||
- | } | ||
- | </ | ||
- | |||
- | Here is an example with pseudo-code of possible usage for implementing Design-By-Contract paradigm: | ||
- | |||
- | <code php> | ||
- | <?php | ||
- | |||
- | use Php\Parser\Node; | ||
- | use Php\Parser\Engine as ParserEngine; | ||
- | use Php\Parser\ExtensionInterface; | ||
- | |||
- | class DbcParserExtension implements ExtensionInterface | ||
- | { | ||
- | public static function process(Node $node) | ||
- | { | ||
- | // prepare an AST to insert, this can be annotation or anything else | ||
- | $astToInsert = ParserEngine:: | ||
- | | ||
- | // node visitor, that will traverse the AST for specific nodes | ||
- | $methodNodeVisitor = new NodeVisitor($node, | ||
- | $methodNodeVisitor-> | ||
- | // Insert our AST code before original method statements | ||
- | $node-> | ||
- | }); | ||
- | } | ||
- | } | ||
- | |||
- | // Registration of extension | ||
- | ParserEngine:: | ||
- | |||
- | // Now every include/ | ||
- | include ' | ||
- | |||
- | // We can also parse code directly with the parser; the hook will be called too: | ||
- | ParserEngine:: | ||
- | </ | ||
- | |||
- | ==== General flow of compiling the source code and limitations ==== | ||
- | |||
- | The current flow (as of PHP 7) of running PHP code can be represented as follows: | ||
- | |||
- | '' | ||
- | |||
- | The first step is lexing (or tokenization) of the source code into separate tokens. After that, the parser generates an Abstract Syntax Tree from the token stream and the PHP grammar. This AST is used to produce concrete opcodes for each node. More details are available at [[rfc:: | ||
- | |||
- | After implementation of this RFC, general flow will be changed in the following way: | ||
- | |||
- | '' | ||
- | |||
- | Note that Parser Extension hooks run before opcodes are generated, so with an opcode cache they will typically be called only once per file. This is a limitation of parser extensions: they receive the AST for a file only once, so dynamic AST transformations are not possible, because cached opcodes are fetched directly by file name: | ||
- | |||
- | '' | ||
- | |||
- | |||
- | ==== Impact on performance ==== | ||
- | |||
- | Without registered parser extensions there is no impact on runtime performance, | ||
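
The registration flow described in the Parser Extension API section can be sketched in plain userland PHP. Everything here is an assumption made for illustration: `SketchEngine`, `SketchExtensionInterface`, `runHooks()` and the `stdClass` "nodes" are hypothetical stand-ins, and the sketch only mimics the idea that registered hooks see the top-level node once, before opcodes would be generated.

```php
<?php
// Illustration only: hypothetical stand-ins, not the proposed classes.
interface SketchExtensionInterface
{
    // Receives the top-level node and may modify it in place.
    public static function process(stdClass $node);
}

class SketchEngine
{
    /** @var string[] registered extension class names, keyed by name */
    private static $extensions = [];

    public static function registerExtension($extensionClassName)
    {
        if (!is_subclass_of($extensionClassName, 'SketchExtensionInterface')) {
            throw new InvalidArgumentException("$extensionClassName is not an extension");
        }
        self::$extensions[$extensionClassName] = $extensionClassName;
    }

    public static function unregisterExtension($extensionClassName)
    {
        unset(self::$extensions[$extensionClassName]);
    }

    public static function getRegisteredExtensions()
    {
        return array_values(self::$extensions);
    }

    // Stand-in for the step between parsing and opcode generation:
    // each registered hook is called once, in registration order.
    public static function runHooks(stdClass $root)
    {
        foreach (self::$extensions as $className) {
            $className::process($root);
        }
        return $root;
    }
}

class AddMarkerExtension implements SketchExtensionInterface
{
    public static function process(stdClass $node)
    {
        // Prepend a marker child, mimicking "insert code before the original statements".
        array_unshift($node->children, (object) ['kind' => 'MARKER', 'children' => []]);
    }
}

SketchEngine::registerExtension('AddMarkerExtension');
$root = (object) ['kind' => 'STMT_LIST', 'children' => []];
SketchEngine::runHooks($root);
```

Keying the registry by class name makes registering the same extension twice a no-op, which is one possible design choice and not something the RFC text specifies.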
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
Line 241: | Line 134: | ||
==== To Opcache ==== | ==== To Opcache ==== | ||
- | This RFC doesn' | + | This RFC doesn' |
==== New Constants ==== | ==== New Constants ==== | ||
Line 247: | Line 140: | ||
==== php.ini Defaults ==== | ==== php.ini Defaults ==== | ||
- | Currently no | + | No |
===== Open Issues ===== | ===== Open Issues ===== | ||
- | - Need to clarify/ | + | - Should each node type be represented as personal |
- | - Inclusion of `NodeVisitor` | + | - Where should metadata be stored (flags, names of node kinds, relations between node types)? This information will be needed later for validation of the AST |
- | - Should voting include 2/3 majority or simple 50%+1 vote? | + | |
===== Future Scope ===== | ===== Future Scope ===== | ||
- | This implementation of RFC can be used later for building | + | This implementation of RFC can be used later for building |
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
- | Not decided yet. | + | Target version: |
+ | - 7.0 | ||
+ | - 7.x | ||
+ | - Do not include this API into core | ||
+ | |||
+ | Implementation paradigm: | ||
+ | - Object-oriented: | ||
+ | - Functional: ast_xxx() functions | ||
+ | Namespace: | ||
+ | - None (top-level) | ||
+ | - Php\Parser\ | ||
+ | - Ast\ | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
Line 270: | Line 173: | ||
- PHP RFC: Abstract syntax tree https:// | - PHP RFC: Abstract syntax tree https:// | ||
- PHP-AST Extension https:// | - PHP-AST Extension https:// | ||
+ | - Compiler hook for altering the AST pre-compilation https:// | ||
- Userland PHP Parser https:// | - Userland PHP Parser https:// | ||
- Userland PHP-Token-Reflection https:// | - Userland PHP-Token-Reflection https:// | ||
+ | - Abstract Syntax Trees API in Python Language https:// | ||
===== Rejected Features ===== | ===== Rejected Features ===== | ||
- | None | + | - Userland parser extensions - need more time to clarify details, possible targets are 7.x or 8.0 |
rfc/parser-extension-api.1424673377.txt.gz · Last modified: 2025/04/03 13:08 (external edit)