rfc:shorter_attribute_syntax

This is an old revision of the document!


PHP RFC: Shorter Attribute Syntax

Introduction

We propose to use @@Attr or #[Attr] as the attribute syntax in PHP 8.0 instead of <<Attr>>.

Background

Early last month, the Attributes v2 RFC was accepted for PHP 8.0. During voting there was a syntax choice between <<Attr>> and @:Attr, and the former was preferred by an approximately 3/4 majority.

However, this syntax has several noteworthy shortcomings:

1. Verbosity

In this example from the RFC, there are more characters used for the << and >> tokens than for the attribute name itself:

<<Jit>>
function foo() {}

Especially once more than one attribute needs to be added, the syntax can get quite noisy and difficult to read at a glance. To alleviate this, the Attribute Amendments RFC proposes to allow grouping multiple attributes between one pair of << and >> tokens:

<<Attr1("foo"),
  Attr2("bar")>>
public function test() {}

Unfortunately, this results in another problem: adding a new attribute at the start or end of the list, or removing one of the attributes, will require modifying multiple lines, which adds noise to diffs. Notably, the proposal does allow trailing commas, so the code could be written like this instead:

<<
  Attr1("foo"),
  Attr2("bar"),
>>
public function test() {}

But now two extra lines are needed for the << and >> tokens, and multiple lines will likely still be modified when moving from one attribute to two attributes, or vice versa.

2. Lack of nested attributes

Nested annotations are quite common in apps using Doctrine. Here's an example from the documentation:

@JoinTable(
    name="User_Group",
    joinColumns={@JoinColumn(name="User_id", referencedColumnName="id")},
    inverseJoinColumns={@JoinColumn(name="Group_id", referencedColumnName="id")}
)

While nested attributes with the <<>> syntax should technically be possible, there was general agreement among the implementers that this approach crosses a line of unacceptable ugliness. For example:

<<JoinTable(
    "User_Group",
    <<JoinColumn("User_id", "id")>>,
    <<JoinColumn("Group_id", "id")>>,
)>>
private $groups;

A somewhat more readable approach would be to allow new Attr() expressions for nested attributes:

<<JoinTable(
    "User_Group",
    new JoinColumn("User_id", "id"),
    new JoinColumn("Group_id", "id"),
)>>
private $groups;

The downside of this is that it can lead to confusion about which expressions are supported (e.g. new Foo() would be allowed but Foo::create() wouldn't work). Furthermore, it turned out to be very difficult to implement this, and the feature was ultimately given up on since it would require a lot of changes to const expressions.

3. Confusion with generics

Although there isn't technically a conflict between the syntax for attributes and generics, once generics are supported in PHP it may be harder to tell at a glance where generics are being used as opposed to attributes.

4. Confusion with shift operators and const arguments

Reusing shift tokens can make it difficult to tell if a line contains multiple attributes, or a shift operator in a constant expression:

<<Bar(2 * (3 + 3)>>Baz, (4 + 5) * 2)>>
function foo() {}

Of course, bit shifts will rarely be used in this context. However, for developers who are used to working with shift operators, reusing the same syntax to delineate attributes may still result in less readable code due to this association. Furthermore, if the grouped attribute proposal is accepted, there can be similar confusion between comma separated attributes and attribute arguments:

<<Attr1(2 * 3 + 3), Bar(4 + 5 * 2)>>
<<Attr2(2 * (3 + 3), Baz, (4 + 5) * 2)>>
function foo() {}

5. Dissimilarity to other languages

Most other languages in the C family use either [Attr] or @Attr for their attribute syntax, requiring just one or two characters, rather than four (see comparison further below).

Proposal

Use @@Attr instead of <<Attr>> for the attribute syntax in PHP 8.0 (using a new T_ATTRIBUTE token).

This solves each of the above issues. It requires half as many characters, reducing verbosity in line with most other languages that have attributes:

@@Jit
function foo() {}

The @@ syntax doesn't have any conflicts with nested attributes, so it will be straightforward to add support for them in the future if desired (without needing any special cases or changes to const expressions):

@@JoinTable(
    "User_Group",
    @@JoinColumn("User_id", "id"),
    @@JoinColumn("Group_id", "id"),
)
private $groups;

The shorter syntax can also improve code readability by making it easier to tell at a glance where attributes are used as opposed to generics or shift operators. Lastly, it avoids the problem of needing to modify multiple lines when adding or removing a single attribute:

@@Attr1("foo")
@@Attr2("bar") // this line can be added or removed independent of other lines
public function test() {}

A small side benefit of the @@ syntax is the ability to easily grep for attributes. Besides being used as a shift operator, << also occurs in heredocs/nowdocs which would add noise to simple attribute searches.

Additional examples

So far the RFC has shown attributes on functions, methods, and class properties. For completeness, here's how the proposed syntax looks with classes (or interfaces or traits), class constants, parameters, anonymous classes, and closures:

@@ExampleAttribute
class Foo
{
    @@ExampleAttribute
    public const FOO = 'foo';
 
    public function foo(@@ExampleAttribute Type $bar) {}
}
 
$object = new @@ExampleAttribute class () {};
 
$f1 = @@ExampleAttribute function () {};
 
$f2 = @@ExampleAttribute fn() => 1;

Alternative #[] syntax

An alternative to using @@Attr would be to borrow the #[attr] syntax from Rust. This would have the benefit of reusing the same syntax as another language, and it is also potentially forwards compatible for single-line attributes. E.g. the following code could work with both PHP 7 and PHP 8 (the attribute is treated as a comment on PHP 7):

#[Attribute]
final class Covers {}

So with this syntax, in theory it would be possible for a library to support both native PHP 8 attributes and PHP 7 docblock annotations with the same code. E.g. PHPUnit could introduce a Covers attribute class which would also work in PHP 7 to encapsulate information from an @covers docblock annotation.

However, this benefit would be lost as soon as a library wants to depend on any other PHP 8 features, and it will become irrelevant anyway once most users upgrade to PHP 8. With the @@ syntax, libraries can still support both native attributes and docblock annotations, they just would need to use a different class (or a parent class) to encapsulate the docblock arguments on PHP 7.x.

Downsides

  • Larger BC break than @@ (see Backward Incompatible Changes section below).
  • Slightly more verbose than @@, which works against one of the goals of this RFC.
  • Arguably more difficult to type than @@ on common qwerty keyboard layouts.
  • Syntax may be confusing for some, since it looks more like a comment than the existing docblock annotations developers are familiar with.

Additional examples with #[]

Note: JavaScript syntax highlighting is used so # doesn't appear as a comment.

#[Jit]
function foo() {}
 
class Foo
{
    #[ExampleAttribute]
    public const FOO = 'foo';
 
    // with potential nesting in future
    #[JoinTable(
        "User_Group",
        #JoinColumn("User_id", "id"),
        #JoinColumn("Group_id", "id"),
    )]
    private $groups;
 
    #[ExampleAttribute]
    public function foo(#[ExampleAttribute] Type $bar) {}
}
 
$object = new #[ExampleAttribute] class () {};
 
$f1 = #[ExampleAttribute] function () {};
 
$f2 = #[ExampleAttribute] fn() => 1;

Discussion

Why was the "@:" syntax rejected?

One argument against it was that it is more prone to accidental typos like @;. Another reason that some people disliked it is that it's non-symmetrical, and thus doesn't fit well with existing PHP tokens. The @@ syntax avoids both of these issues.

Will the "@" character make attributes hard to read?

It has been suggested that the @ character could run into other wide characters like M, making attributes starting with that letter harder to read. However, in practice this hasn't been an issue for the many other languages that use the @Attr syntax. This concern is largely dependent on font choice and syntax highlighting.

Will the lack of a closing >> delineator make inline attributes less readable?

There was a concern that it could be harder to tell where an inline attribute ends without a closing >> token. However, for inline class and function attributes, the class/function keyword already provides a clear delineator.

For parameter attributes, this ends up being a bit subjective. Some see >> as clearly marking the end of an attribute, while others find that the >> looks like a shift operator at first glance which makes the syntax harder to read:

function foo(
    <<MyAttr([1, 2])>> Type $myParam,
) {}
 
// vs.
 
function foo(
    @@MyAttr([1, 2]) Type $myParam,
) {}
 
// vs.
 
function foo(
    #[MyAttr([1, 2])] Type $myParam,
) {}

Why not use a keyword instead?

It has been suggested that a keyword be used instead of a symbol. E.g.

attribute Foo();
function myFunc() {}

However, the objective of this proposal is to arrive at a syntax that is less verbose and aligns better with the attribute syntax used in other common languages. Using a keyword doesn't meet either of these goals.

Isn't the syntax choice just something subjective we'll get used to?

To some extent this may be true. However, in this case we believe there are also objective shortcomings with using <<>> for attributes, which we have the opportunity to solve with a shorter syntax.

Comparison to Other Languages

Most other languages with attributes use a variant of [Attr] or @Attr for the syntax. Hack is the only language using <<Attr>>, but apparently they are migrating away from this to @Attr now that compatibility with PHP is no longer a goal.

  • C#: [Attr] 1
  • C++: [[attr]] 2
  • Hack: <<Attr>> 3 (but migrating to @Attr) 4
  • Java: @Attr 5
  • Kotlin: @Attr 6
  • Python: @attr 7
  • Rust: #[attr] 8
  • Swift: @attr 9
  • TypeScript/JS: @Attr 10

Backward Incompatible Changes

In theory there is a small BC break for the @@ syntax, since multiple error suppression operators can currently be added with no additional effect (e.g. @@@@@really_suppress_me()). However, this isn't useful for anything and is very unlikely to be used anywhere.

The alternate #[] syntax presents a larger backwards compatibility break, since it would no longer be possible to begin a hash style comment with a left bracket:

#[x] code like this would break
$val = ['new value']; #['old value'];

While duplicate error suppression operators aren't useful, there is a use for comments starting with a left bracket (e.g. making checkboxes or commenting out an array). There is definitely code in the wild like this that would break. 11

Unaffected Functionality

Attributes can still be applied to all the same places outlined in the Attributes v2 RFC. Non-syntactical attribute functionality also remains unchanged (e.g. the reflection API).

Finally, this proposal does not conflict with the Attribute Amendments RFC, with the exception that if the @@ syntax is accepted, it will supersede the syntax for grouped attributes.

Community Poll

On June 10 there was a poll on Reddit to see which syntax the community prefers. 12

@@ was the most popular, with 436 votes. <<>> came in second place, with 240 votes. #[] came in third place, with 159 votes.

Vote

Are you okay with re-voting on the attribute syntax for PHP 8.0, including the already accepted <<>> option? Yes/No

Secondary vote: ranked-choice vote between @@, #[], and <<>> syntax alternatives.

References

Changelog

  • 2020-06-09 - Added #[Attr] syntax alternative with ranked-choice vote.
  • 2020-06-16 - Summarized community poll and moved alternative syntax proposal before discussion section.
rfc/shorter_attribute_syntax.1592342267.txt.gz · Last modified: 2020/06/16 21:17 by theodorejb