rfc:context_sensitive_lexer
Differences
This shows you the differences between two versions of the page.
rfc:context_sensitive_lexer [2015/02/20 07:09] – marcio | rfc:context_sensitive_lexer [2017/09/22 13:28] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Context Sensitive Lexer ====== | ====== PHP RFC: Context Sensitive Lexer ====== | ||
- | * Version: 0.3 | + | * Version: 0.4.1 |
* Date: 2015-02-15 | * Date: 2015-02-15 | ||
* Author: Márcio Almada | * Author: Márcio Almada | ||
- | * Status: | + | * Status: |
* First Published at: http:// | * First Published at: http:// | ||
Line 18: | Line 18: | ||
class Collection { | class Collection { | ||
public function forEach(callable $callback) { /* */ } | public function forEach(callable $callback) { /* */ } | ||
+ | public function list() { /* */ } | ||
} | } | ||
- | class List { | ||
- | public function append(List $list) { /* */ } | ||
- | } | ||
</ | </ | ||
- | Notice that it's currently **not** possible to have the '' | + | Notice that it's currently **not** possible to have the '' |
PHP Parse error: Syntax error, unexpected T_FOREACH, expecting T_STRING on line 2 | PHP Parse error: Syntax error, unexpected T_FOREACH, expecting T_STRING on line 2 | ||
- | PHP Parse error: Syntax error, unexpected T_LIST, expecting T_STRING on line 5 | + | PHP Parse error: Syntax error, unexpected T_LIST, expecting T_STRING on line 3 |
===== Proposal ===== | ===== Proposal ===== | ||
This RFC revisits the topic of [[https:// | This RFC revisits the topic of [[https:// | ||
- | presenting a minimal and maintainable [[https:// | + | presenting a minimal and maintainable [[https:// |
- | restricted to OO scope only, consistently comprehending: | + | |
- | * Namespace, class, trait and interface names | ||
* Properties, constants and methods defined on classes, interfaces and traits | * Properties, constants and methods defined on classes, interfaces and traits | ||
* Access of properties, constants and methods from objects and classes | * Access of properties, constants and methods from objects and classes | ||
Line 43: | Line 39: | ||
- Reduce the surface of BC breaks whenever new keywords are introduced | - Reduce the surface of BC breaks whenever new keywords are introduced | ||
- | - Avoid restricting userland APIs. Dispensing the need for hacks like unecessary | + | - Avoid restricting userland APIs. Dispensing the need for hacks like unnecessary |
This is a list of currently **globally** reserved words that will become **semi-reserved** if the proposed change is approved: | This is a list of currently **globally** reserved words that will become **semi-reserved** if the proposed change is approved: | ||
Line 51: | Line 47: | ||
namespace | namespace | ||
print echo require | print echo require | ||
- | function | + | function |
==== Limitations ==== | ==== Limitations ==== | ||
- | On purporse, it's still forbidden to define a **namespace**, | + | On purpose, it's still forbidden to define a **class |
- | + | ||
- | * '' | + | |
- | * '' | + | |
- | * '' | + | |
- | * '' | + | |
- | * '' | + | |
- | * '' | + | |
<code php> | <code php> | ||
- | namespace|class|interface|trait Namespace | + | class Foo { |
- | namespace|class|interface|trait Self {} // Fatal error | + | const class = ' |
- | namespace|class|interface|trait Static {} // Fatal error | + | } |
- | namespace|class|interface|trait Parent {} // Fatal error | + | |
- | namespace|class|interface|trait Array {} // Fatal error | + | |
- | namespace|class|interface|trait Callable {} // Fatal error | + | |
- | // Fatal error: Cannot | + | // Fatal error: Cannot |
</ | </ | ||
- | On purporse, it's still forbidden | + | In practice, it means that we would drop from **64** |
+ | |||
+ | '' | ||
<code php> | <code php> | ||
class Foo { | class Foo { | ||
- | | + | |
} | } | ||
- | // Fatal error: Cannot redefine | + | (new Foo)->list; |
</ | </ | ||
Line 154: | Line 142: | ||
} | } | ||
</ | </ | ||
+ | |||
+ | ===== Impact On Other RFCs ===== | ||
+ | |||
+ | Some RFCs are proposing to reserve new keywords in order to add features or reserve typehints names: | ||
+ | |||
+ | * https:// | ||
+ | * https:// | ||
+ | * https:// | ||
+ | |||
+ | With the approval of the current RFC, the surface of BC breaks would be much smaller in such cases. | ||
+ | |||
+ | One notable example is the **in** operator RFC. Without the context sensitive lexer proposed here, the new operator would create a BC break in the **Doctrine** library and in many other SQL writers and ORMs: | ||
+ | |||
+ | https:// | ||
===== Implementation Details ===== | ===== Implementation Details ===== | ||
+ | |||
+ | ==== Patch 1 - Discarded ==== | ||
The lexer now keeps track of the context needed to have unreserved words on OO scope and makes use of a minimal amount of RE2C lookahead capabilities when disambiguation becomes inevitable. | The lexer now keeps track of the context needed to have unreserved words on OO scope and makes use of a minimal amount of RE2C lookahead capabilities when disambiguation becomes inevitable. | ||
- | For instance, the lexing rules to disambiguate '':: | + | For instance, the lexing rules to disambiguate '':: |
<code c++> | <code c++> | ||
Line 172: | Line 176: | ||
</ | </ | ||
- | One additional compile time check was created: | + | A few additional compile time checks were created: | ||
<code c> | <code c> | ||
- | if (zend_string_equals_literal_ci(name, " | + | if(ZEND_NOT_RESERVED != zend_check_reserved_method_name(decl->name)) { |
- | zend_error_noreturn(E_COMPILE_ERROR, | + | zend_error_noreturn(E_COMPILE_ERROR, |
- | ce->name-> | + | |
} | } | ||
</ | </ | ||
- | Others were just adapted because, surprisingly, | + | ==== Patch 2 ==== |
- | adjustments | + | |
+ | A new patch has been added during the voting phase. It's a different approach that proved to have many advantages over the first patch and therefore it is intended to supersede it. | ||
+ | |||
+ | The new patch just requires the maintenance | ||
+ | |||
+ | - It offers no regression or forward compatibility risks and is highly predictable | ||
+ | - It has a very small footprint when compared to the previous attempt involving a pure lexical approach | ||
+ | - Requires no compile time checks | ||
+ | - Is highly configurable, | ||
+ | |||
+ | In order to send information to the lexer about the context change, we just have to use '' | ||
<code c> | <code c> | ||
// before | // before | ||
- | if(ZEND_FETCH_CLASS_DEFAULT != zend_get_class_fetch_type(name)) { | + | method_modifiers function returns_ref T_STRING '(' |
- | zend_error_noreturn(E_COMPILE_ERROR, | + | |
- | } | + | |
// after | // after | ||
- | if(ZEND_FETCH_CLASS_DEFAULT != zend_check_reserved_name(name)) { | + | method_modifiers function returns_ref identifier '(' |
- | zend_error_noreturn(E_COMPILE_ERROR, | + | |
- | } | + | |
</ | </ | ||
- | Current proposed patch: | + | ===== Future Work And Maintenance ===== |
- | * Doesn' | + | * All php-src tests are passing |
- | * Keeps ext tokenizer functional | + | |
- | * Introduces no maintenance issues | + | |
- | * Has no performance impact | + | |
- | * Introduces | + | |
- | => Many experiments with parsing were done before the current proposed | + | => The first patch has been discarded during discussion on voting phase. It was considered |
- | => No performance loss was noticed but maybe the patch requires a better benchmark. | + | |
- | + | ||
- | ===== Impact on performance ===== | + | |
- | + | ||
- | No loss noticed. | + | |
- | + | ||
- | -- Add benchmark here if asked on discussion phase. -- | + | |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
Line 216: | Line 216: | ||
This is proposed for the next PHP x, which at the time of this writing would be PHP 7. | This is proposed for the next PHP x, which at the time of this writing would be PHP 7. | ||
- | ===== Open Issues | + | ===== Votes ===== |
- | The patch may still contain small bugs related to the topics below, but this can be addressed during discussion phase: | + | This voting requires a 2/3 majority. |
+ | considered good enough, independently of the voting results. The RFC author encourages voting for the feature. | ||
- | * I still have to add more tests involving traits and trait conflict resolution syntax | + | <doodle title=" |
- | * I still have to add more tests involving '' | + | * Yes |
+ | | ||
+ | </ | ||
- | The patch is 98% implemented | + | Voting started on 2015-02-28 |
- | finishing these small details without impact | + | |
===== Patch ===== | ===== Patch ===== | ||
- | | + | ==== Patch 1 - Discarded ==== |
- | - Pull request with all the tests and regenerated ext tokenizer | + | |
+ | - Pull request with all the tests and regenerated ext tokenizer | ||
+ | |||
+ | ==== Patch 2 ==== | ||
+ | |||
+ | - Pull request with all the tests is at [[https:// | ||
+ | |||
+ | ==== Later Changes ==== | ||
+ | |||
+ | **Patch 2** was merged and, later, method modifiers were allowed as class member names. This was a limitation of the older implementation candidate, Patch 1, and there was no reason to keep it. The **Limitations** section was updated accordingly. Only the keyword **class** is still reserved, and only for class constants. | ||
===== References ===== | ===== References ===== | ||
Line 237: | Line 248: | ||
===== Rejected Features ===== | ===== Rejected Features ===== | ||
- | None so far. | + | * Prior to voting, the support for '' |
+ | |||
+ | => The RFC author will try to solve the wider problem on PHP 7.1 | ||
===== Changelog ===== | ===== Changelog ===== | ||
Line 243: | Line 256: | ||
* 0.2: Additional support to namespaces, classes, interfaces and trait names | * 0.2: Additional support to namespaces, classes, interfaces and trait names | ||
* 0.3: Oops. Add forgotten support for typehints | * 0.3: Oops. Add forgotten support for typehints | ||
+ | * 0.4: Reverts to 0.1 feature set because class name support created undesired situations regarding the future addition of a short lambda syntax and could possibly block other language changes. | ||
+ | * 0.4.1: A new compatible implementation has been introduced | ||
===== Acknowledgements ===== | ===== Acknowledgements ===== |
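
The behaviour described in the Proposal and Limitations sections can be sketched as follows. This is a minimal illustration, assuming PHP 7.0 or later (where the semi-reserved word support was merged). The ''Collection'' class name comes from the RFC example; the method bodies, the ''new'' constant and the ''items'' property are hypothetical additions made only for this sketch.

```php
<?php
// Minimal sketch; assumes PHP 7.0+, where semi-reserved words are
// accepted as method, property and class constant names.
class Collection {
    // Hypothetical constant: the reserved word "new" used as a constant name.
    const new = 'created';

    // Hypothetical backing storage for the illustration.
    private $items;

    public function __construct(array $items) {
        $this->items = $items;
    }

    // "foreach" used as a method name.
    public function forEach(callable $callback) {
        foreach ($this->items as $item) {
            $callback($item);
        }
    }

    // "list" used as a method name.
    public function list() {
        return $this->items;
    }
}

$collection = new Collection([1, 2, 3]);

// Calling a method whose name is a semi-reserved word.
$sum = 0;
$collection->forEach(function ($item) use (&$sum) {
    $sum += $item;
});

// Member access with semi-reserved names.
$listed = $collection->list();
$constant = Collection::new;

// "class" stays reserved for class constants: Collection::class
// resolves to the class name rather than to a user-defined constant.
$className = Collection::class;
```

Outside of class member positions the same words keep their usual meaning, which is why the ''foreach'' loop inside ''forEach()'' still works as a language construct.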
rfc/context_sensitive_lexer.1424416176.txt.gz · Last modified: 2017/09/22 13:28 (external edit)
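
One way to observe the difference between a purely lexical pass and a context-aware one is through ext/tokenizer. The sketch below assumes PHP 7.0 or later with the tokenizer extension enabled, and relies on the ''TOKEN_PARSE'' flag of ''token_get_all()''. The helper ''tokenNameFor()'' is a hypothetical name introduced for this illustration, and the sketch shows the merged behaviour in PHP rather than the patch itself.

```php
<?php
// Minimal sketch; assumes PHP 7.0+ with ext/tokenizer enabled.

// Hypothetical helper: returns the token name reported for the first
// token whose text matches $word (case-insensitively), or null.
function tokenNameFor(string $source, string $word, int $flags = 0) {
    foreach (token_get_all($source, $flags) as $token) {
        // Multi-character tokens are arrays of [id, text, line];
        // single-character tokens such as "{" are plain strings.
        if (is_array($token) && strcasecmp($token[1], $word) === 0) {
            return token_name($token[0]);
        }
    }
    return null;
}

$source = '<?php class Collection { public function list() {} }';

// Without flags the tokenizer has no parsing context, so "list"
// is reported as the keyword token.
$lexical = tokenNameFor($source, 'list');

// With TOKEN_PARSE the source is also parsed, and reserved words
// used in member-name position are reported as plain identifiers.
$contextual = tokenNameFor($source, 'list', TOKEN_PARSE);

echo $lexical, ' vs ', $contextual, PHP_EOL;
```

Comparing the two results makes it visible that the same word is classified differently depending on whether parsing context is available.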