PHP RFC: Treat namespaced names as single token

Introduction

PHP currently treats namespaced names like Foo\Bar as a sequence of identifier and namespace separator tokens. This RFC proposes to treat namespaced names as a single token, and as such allow reserved keywords to appear inside them. At the same time, it proposes to lift reserved keyword restrictions for class, function and constant declarations.

The motivation is to reduce the backwards compatibility impact of new reserved keyword additions in future versions of PHP. To give a specific example, PHP 7.4 added the fn keyword as part of arrow function support. This broke my iter library, because it was using fn as part of a namespace name. However, this breakage was entirely unnecessary! Here is a typical usage example:

// In the library:
namespace iter\fn;
 
function operator($operator, $operand = null) { ... }
 
// In the using code:
use iter\fn;
 
iter\map(fn\operator('*', 2), $nums);

As you can see, both references to fn are part of a namespaced name: iter\fn and fn\operator. Under this proposal, these are considered perfectly legal names, and the backwards compatibility break would not have occurred.

To provide another example, what would happen if enum became a reserved keyword in a future PHP version? There are a number of userland implementations that use the Enum class name and would break. After this proposal, they would still break, but in a way that can be salvaged and migrated more easily:

// This line would remain legal
use MyCLabs\Enum\Enum;
 
// This line would still result in a parse error
class Action extends Enum { }

This means that code using the enum library will very likely break, but there are easy ways to avoid it, possibly by automated means:

class Action extends \MyCLabs\Enum\Enum { }
// or
use MyCLabs\Enum\Enum as Enum_;
class Action extends Enum_ { }

This makes it possible to keep code working until it is migrated.

Proposal

Changes to namespaced names

PHP distinguishes four kinds of namespaced names:

  • Unqualified names like Foo, which coincide with identifiers.
  • Qualified names like Foo\Bar.
  • Fully qualified names like \Foo.
  • Namespace-relative names like namespace\Foo.

Each of these kinds will be represented by a distinct token:

Foo;
// Before: T_STRING
// After:  T_STRING
// Rule:   {LABEL}
 
Foo\Bar;
// Before: T_STRING T_NS_SEPARATOR T_STRING
// After:  T_NAME_QUALIFIED
// Rule:   {LABEL}("\\"{LABEL})+
 
\Foo;
// Before: T_NS_SEPARATOR T_STRING
// After:  T_NAME_FULLY_QUALIFIED
// Rule:   ("\\"{LABEL})+
 
namespace\Foo;
// Before: T_NAMESPACE T_NS_SEPARATOR T_STRING
// After:  T_NAME_RELATIVE
// Rule:   "namespace"("\\"{LABEL})+

Individual namespace segments may contain reserved keywords:

// This is interpreted as T_LIST (i.e., as a reserved keyword):
List
// All of the following are interpreted as legal namespaced names:
\List
FooBar\List
namespace\List

Whitespace is not permitted around the namespace separators of a namespaced name. If it occurs, the namespace separator will be parsed as T_NS_SEPARATOR, which will subsequently lead to a parse error. It is not possible to allow whitespace, because namespaced names commonly occur next to keywords:

class Foo implements \Bar {}

If we permitted whitespace, implements \Bar would end up being interpreted as a namespaced name. It should be noted that while this change has the potential to break some code, it also prevents programming mistakes I have seen in the wild:

// This would have previously been interpreted as a namespace-relative name,
// which is an obscure PHP feature that few people know about. Now it will
// result in a parse error.
namespace \Foo;
 
// This would have previously been interpreted as $foo = Foo\call($bar),
// now it will result in a parse error.
$foo = Foo // <- Missing semicolon
\call($bar);
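
The whitespace rule can be tried out without executing any of the code in question. The sketch below assumes a PHP build that implements this proposal and uses token_get_all() with the TOKEN_PARSE flag, which also runs the parser and throws ParseError on invalid syntax. parses() is a hypothetical helper introduced only for illustration.

```php
<?php
// Sketch only: assumes a PHP build that implements this proposal.
// parses() is a hypothetical helper; TOKEN_PARSE makes token_get_all() run
// the parser as well, which throws ParseError on invalid syntax.
function parses(string $code): bool
{
    try {
        token_get_all($code, TOKEN_PARSE);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

// A keyword followed by a fully qualified name, no whitespace inside the name.
var_dump(parses('<?php class Foo implements \Bar {}'));
// Whitespace between "namespace" and the separator.
var_dump(parses('<?php namespace \Foo;'));
// A name split across two lines by a missing semicolon.
var_dump(parses("<?php \$foo = Foo\n\\call(\$bar);"));
// Whitespace between the leading separator and the class name.
var_dump(parses('<?php new \ RuntimeException;'));
```

If the proposal behaves as described, only the first snippet should parse. The remaining three correspond to the whitespace cases this RFC turns into parse errors.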

Changes to reserved keywords limitations

In addition to the namespaced name changes, reserved keywords may now be used in a number of additional places:

class KEYWORD {}
interface KEYWORD {}
trait KEYWORD {}
function KEYWORD() {}
const KEYWORD = 0;
use Foo as KEYWORD;

It should be emphasized that it will not be possible to refer to symbols that use keyword names by directly using the keyword. The keyword always needs to be part of a namespaced name that renders its usage unambiguous:

class List {}
 
new List;  // Parse error.
new \List; // Ok!

This also extends to function-like language constructs:

function print($arg) {}
 
print($arg); // This is the normal "print" language construct.
\print($arg); // This calls the print() function defined above.

To avoid confusion in this particular case, it would be possible to prevent defining function symbols that clash with function-like language constructs.

Backward Incompatible Changes

Existing code using whitespace (or comments) between namespace separators of namespaced names will now produce a parse error. An analysis of the top 2000 Composer packages has found five occurrences of this issue:

sylius/sylius/src/Sylius/Bundle/ApiBundle/ApiPlatform/Metadata/Property/Factory/ExtractorPropertyMetadataFactory.php:109
    \ RuntimeException
api-platform/core/src/Metadata/Extractor/AbstractExtractor.php:121
    \ RuntimeException
mck89/peast/lib/Peast/Syntax/Node/JSX/JSXFragment.php:13
    Peast \Syntax\Node\Expression
mck89/peast/lib/Peast/Syntax/Node/JSX/JSXOpeningElement.php:13
    Peast \Syntax\Node\Expression
mck89/peast/lib/Peast/Syntax/Node/JSX/JSXElement.php:13
    Peast \Syntax\Node\Expression

As such, the practical impact is very limited, and any issues are trivial to fix. On the other hand, this change will reduce the backwards-compatibility impact from any future keyword additions.

Vote

Yes/No.

rfc/namespaced_names_as_token.1592240711.txt.gz · Last modified: 2020/06/15 17:05 by nikic