PHP RFC: Attributes v2

Large credit for this RFC goes to Dmitry Stogov whose previous work on attributes is the foundation for this RFC and patch.

Introduction

This RFC proposes to add a form of structured, syntactic metadata to declarations of classes, properties, functions, methods and constants.

Similar concepts exist in other languages: Annotations in Java; Attributes in C#, C++, Rust and Hack; and Decorators in Python and JavaScript.

PHP offers only an unstructured form of such metadata: doc-comments. These are plain strings, so various PHP sub-communities invented an @-based pseudo-language inside them to carry structured information.

Attributes would provide structured meta-information as a native language feature that is parsed, validated and can be accessed through APIs.

This RFC primarily shows that attributes provide benefits to PHP core and extensions, even though userland developers have also expressed a desire for this feature.

Attributes can help move the language forward. They provide a generic way to selectively modify functionality that has to be kept in its original form for backwards compatibility reasons, and to experiment with such changes in extensions before moving them to core.

In addition, the widespread use of userland doc-comment parsing shows that this feature is in high demand in the community. As first-class citizens in PHP, attributes could:

  • reduce confusion about code in comments for beginners
  • allow language level validation at compile time
  • reduce bugs caused by typos and the difference between "/*" and "/**"
  • help developers migrate configuration (XML, YAML, JSON, INI) into the code, thereby reducing complexity

Proposal

Attribute Syntax

Attributes are specially formatted text enclosed in "<<" and ">>". Attributes may be applied to functions, classes, interfaces, traits, methods, properties and class constants. In the same way as doc-comments, attributes must be placed before the corresponding definition.

<<...>>
<<...>>
function foo() {}

See discussion about alternative syntaxes below.

Each declaration of function, class, method, property or class constant may have one or more attributes. Each attribute may have no value associated with it, a single value, or multiple values.

See the EBNF grammar:

<attribute> ::= "<<" <class_name_reference> [ "(" <argument> { "," <argument> } ")" ] ">>".
<class_name_reference>      ::= STRING.
<argument>  ::= <php-constant> | <php-expression>.

Here is an example:

<<WithoutArgument>>
<<SingleArgument(0)>>
<<FewArguments('Hello', 'World')>>
function foo() {}

The same attribute name must not be used more than once on the same declaration; a compile error is thrown when this is detected.

Attributes have to be declared after the doc-block; otherwise they are not attached to the following declaration.

As the "<<" and ">>" characters are used in an expression prefix position here, they do not conflict with a potential generics proposal, where "<T>" is the syntax commonly used in other languages.

Attribute Names Resolve to Classes

The name of an attribute is resolved against the currently active namespace import scope during compilation. The resolved class names are then autoloaded to make sure they exist.

This is done to ensure that developers don't introduce subtle bugs by mistyping attribute names, which would cause the desired behavior of the attribute to be silently ignored.

use My\Attributes\SingleArgument;
use My\Attributes\Another;
 
<<SingleArgument("Hello")>>
<<Another\SingleArgument("World")>>
<<\My\Attributes\FewArguments("foo", "bar")>>
function foo() {}
 
// sample attribute class:
namespace My\Attributes;
 
class SingleArgument implements \PhpAttribute {
    public $argumentValue;
 
    public function __construct($argumentValue) {
         $this->argumentValue = $argumentValue;
    }
}
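
How the three attribute names above resolve can be previewed with the existing ::class constant. This is only an analogy, assuming attribute names follow the usual class-name resolution rules against the active imports; unlike the proposal, ::class does not autoload the resolved class.

```php
<?php
use My\Attributes\SingleArgument;
use My\Attributes\Another;

// ::class is resolved at compile time and does not trigger autoloading,
// so this runs even though none of these classes are defined here.
$resolved = [
    SingleArgument::class,              // imported name
    Another\SingleArgument::class,      // qualified name, relative to the "Another" import
    \My\Attributes\FewArguments::class, // fully qualified name
];

print_r($resolved);
// My\Attributes\SingleArgument
// My\Attributes\Another\SingleArgument
// My\Attributes\FewArguments
```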

Consistent with PHP expressions in general, there is no validation that the provided attribute arguments fulfill the contract of the attribute class constructor. That validation happens only when attributes are accessed as objects through the Reflection API (below).

Thanks to class name resolving, IDEs or static analysis tools can perform this validation for the developer.

Attribute classes must implement the marker interface PhpAttribute and an error is thrown if they don't. This is again done to reduce the potential for developers to make mistakes when using attributes.

Constant Expressions in Attribute Arguments

Attribute arguments are evaluated as constant AST expressions. This means that a subset of PHP expressions is allowed as arguments:

use My\Attributes\Addition;
use My\Attributes\ClassConst;
 
<<Addition(1+1)>>
<<ClassConst(PDO::class)>>

The primary reason for allowing constant AST is the ability to reference (class) constants. Referencing constants is desirable because it avoids duplicating information in attributes that already exists as a constant. Another benefit is the potential for tools and IDEs to statically validate attributes.

The constant AST is resolved to a value when accessing attributes with the Reflection API.
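
Class constants already behave this way today, which gives a feel for the deferred evaluation described above. The sketch below is an analogy using existing PHP features, not the attribute implementation; the Limits class and the BASE_LIMIT constant are hypothetical names.

```php
<?php
class Limits
{
    // BASE_LIMIT is not defined yet when this class is compiled,
    // so the expression is stored unevaluated as constant AST.
    const MAX = BASE_LIMIT * 2;

    // Class name references are allowed in constant expressions as well.
    const DRIVER = PDO::class;
}

define('BASE_LIMIT', 21);

// The expression is resolved on first access, after BASE_LIMIT exists.
var_dump(Limits::MAX);    // int(42)
var_dump(Limits::DRIVER); // string(3) "PDO"
```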

Reflection

The following Reflection classes are extended with a getAttributes() method that returns an array of ReflectionAttribute instances.

function ReflectionFunction::getAttributes(string $name = null): ReflectionAttribute[];
function ReflectionClass::getAttributes(string $name = null): ReflectionAttribute[];
function ReflectionProperty::getAttributes(string $name = null): ReflectionAttribute[];
function ReflectionClassConstant::getAttributes(string $name = null): ReflectionAttribute[];

The name argument can be used to retrieve only the attribute(s) of the given attribute name or subclasses of the given name.

$attributes = $reflectionFunction->getAttributes(\My\Attributes\SingleArgument::class);
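
The "name or subclass" matching can be modelled in userland with is_a(). This is a rough sketch of the described behavior, not the engine code; filterAttributeNames() and the sample classes are hypothetical.

```php
<?php
abstract class Annotation {}
class Entity extends Annotation {}
class Table extends Annotation {}
class Jit {}

/**
 * Hypothetical userland model of the filter: keep the names that are
 * $filter itself or a subclass of it.
 */
function filterAttributeNames(array $attributeNames, string $filter): array
{
    return array_values(array_filter(
        $attributeNames,
        function (string $name) use ($filter): bool {
            // The third argument lets is_a() accept class names instead of objects.
            return is_a($name, $filter, true);
        }
    ));
}

$declared = [Entity::class, Table::class, Jit::class];

print_r(filterAttributeNames($declared, Annotation::class)); // Entity, Table
print_r(filterAttributeNames($declared, Jit::class));        // Jit
```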

The API of the new ReflectionAttribute looks like this:

class ReflectionAttribute
{
    public function getName(): string
    public function getArguments(): array
    public function getAsObject(): object
}

This proposed API aims to be a compromise between previously conflicting opinions: one faction preferred access to plain arrays only, building their solutions on top in userland, while the other preferred that PHP itself convert attributes to object instances.

With this approach we can also guarantee that a call to getAttributes() doesn't throw errors when attributes are invalid according to their userland implementations. Throwing at that point could cause problems when conflicting libraries have attributes on the same declaration.
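
The split between raw arguments and object instances can be illustrated with a userland stand-in. AttributeSketch below is a hypothetical model of the proposed ReflectionAttribute, written only to show where errors would surface; the sample attribute class omits the PhpAttribute interface because that interface does not exist yet.

```php
<?php
// Hypothetical userland stand-in for the proposed ReflectionAttribute.
final class AttributeSketch
{
    private $name;
    private $arguments;

    public function __construct(string $name, array $arguments)
    {
        $this->name = $name;
        $this->arguments = $arguments;
    }

    public function getName(): string { return $this->name; }

    // Returns the raw arguments; the attribute class is never touched.
    public function getArguments(): array { return $this->arguments; }

    // Instantiates the attribute class; constructor errors surface here.
    public function getAsObject(): object
    {
        return new $this->name(...$this->arguments);
    }
}

class SingleArgument
{
    public $argumentValue;

    public function __construct($argumentValue)
    {
        $this->argumentValue = $argumentValue;
    }
}

// Models <<SingleArgument>> written without its required argument.
$invalid = new AttributeSketch(SingleArgument::class, []);

var_dump($invalid->getArguments()); // an empty array, no error

try {
    $invalid->getAsObject();
} catch (ArgumentCountError $e) {
    echo "getAsObject() failed: ", get_class($e), "\n";
}
```

A library that only needs the arguments never pays for, or fails on, another library's constructor contract.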

Use Cases

Use Cases for PHP Extensions

One major use case for attributes will be PHP core and extensions.

HashTables with declared Attributes are available on every zend_class_entry, op_array, zend_property_info and zend_class_constant.

PHP Core or extensions will want to check if certain declarations have an attribute or not.

One such example is the check for “@jit” already in Opcache JIT that can be changed to:

static int zend_needs_manual_jit(const zend_op_array *op_array)
{
    /* attributes is NULL when the function declares none */
    return op_array->attributes &&
        zend_hash_str_exists(op_array->attributes, "Opcache\\Jit", sizeof("Opcache\\Jit")-1);
}

In userland the code for this functionality becomes:

use Opcache\Jit;
 
<<Jit>>
function foo() {}

Other potential core and extensions use cases/ideas

Previous RFCs (see References below) focused on benefits for userland. First and foremost, however, attributes offer significant benefits for core and extensions.

Below is a list of ideas. Please note that these are not part of this RFC.

Structured Deprecation of functions/methods. Almost all languages with attributes have this built-in as well. One benefit of having this in PHP is that it could allow deprecating classes, properties or constants, where trigger_error cannot be used by developers at the moment.

use Php\Attributes\Deprecated;
 
<<Deprecated("Use bar() instead")>>
function foo() {}

Modification of selected “legacy” behavior of PHP in reference to Reclassify Engine Warnings RFC and Support Rewinding Generators. Rust has a similar set of attributes. This could be used to augment the “Editions” proposal with a gradual path to more consistency.

// an idea, not part of the RFC
use Php\Attributes\Deny;
use Php\Attributes\Allow;
 
<<Deny("undeclared_variables")>>
function foo() {
}
<<Allow("rewind_generator")>>
function bar() {
    yield 1;
}
<<Deny("dynamic_properties")>>
class Foo {
}

Some (limited) form of macros similar to Rust could be useful to include polyfill functions only in lower versions of PHP. This helps to not break Opcache with conditional declarations of functions and classes:

// an idea, not part of the RFC
use Php\Attributes\ConditionalDeclare;
use Php\Attributes\IgnoreRedeclaration;
 
<<ConditionalDeclare(PHP_VERSION_ID < 70000)>> // gets removed from AST when >= 7.0
<<IgnoreRedeclaration>> // throws no error when already declared, removes the redeclared thing
function intdiv(int $numerator, int $divisor) {
}

Enabling more aggressive Opcache or JIT optimizations for individual functions:

// an idea, not part of the RFC
use Opcache\OptimizationLevel;
use Opcache\Inline;
 
<<OptimizationLevel(3)>>
<<Inline>>
function foo() {}

A ZEND_API to provide the arguments of a single attribute or a list of all attributes will be part of the final patch so that extension authors can utilize attributes with as little effort as possible.

This API is a draft for now:

/* Retrieve attribute arguments by attribute name */
HashTable *zend_attribute_get(HashTable *attributes, char *name, size_t name_len);
 
/* Retrieve all attribute arguments indexed by attribute name */
zval *zend_attribute_all(HashTable *attributes, char *name, size_t name_len);

Userland Use-Case: Declaring Event Listener Hooks on Objects

In userland attributes provide the benefit of putting declaration and additional configuration directly close to each other.

This is an example of refactoring Symfony EventSubscribers to use attributes instead. The EventSubscriberInterface requires users to declare which event is handled by which method on the class in the getSubscribedEvents() method.

This can be changed to just look for attributes on methods to declare which event they listen to.

// current code without attributes
class RequestSubscriber implements EventSubscriberInterface
{
    public static function getSubscribedEvents(): array
    {
        return [RequestEvent::class => 'onKernelRequest'];
    }
 
    public function onKernelRequest(RequestEvent $event)
    {
    }
}
 
// refactor to:
class Listener implements \PhpAttribute
{
    public $event;
 
    public function __construct(string $event)
    {
        $this->event = $event;
    }
}
 
class RequestSubscriber
{
    <<Listener(RequestEvent::class)>>
    public function onKernelRequest(RequestEvent $event)
    {
    }
}
 
// and the potential EventDispatcher to register listeners based on attributes:
 
class EventDispatcher
{
    private $listeners = [];
 
    public function addSubscriber(object $subscriber)
    {
        $reflection = new ReflectionObject($subscriber);
 
        foreach ($reflection->getMethods() as $method) {
            $attributes = $method->getAttributes(Listener::class);
 
            foreach ($attributes as $listenerAttribute) {
                /** @var $listener Listener */
                $listener = $listenerAttribute->getAsObject();
 
                $this->listeners[$listener->event][] = [$subscriber, $method->getName()];
            }
        }
    }
 
    public function dispatch($event, ...$args)
    {
        // fall back to an empty list when nothing listens to this event
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener(...$args);
        }
    }
}
 
$dispatcher = new EventDispatcher();
$dispatcher->addSubscriber(new RequestSubscriber());
$dispatcher->dispatch(RequestEvent::class, $payload);

Userland Use-Case: Migrating Doctrine Annotations from Docblocks to Attributes

One of the major cases to consider for any attributes/annotations RFC is the potential migration of the widespread Doctrine Annotations library towards a possible attributes syntax.

PHP core's support for attributes should provide a foundation that lets userland migrate from docblocks to attributes.

The primary behavior in this RFC that attempts this balancing act is the requirement for namespaced attribute names.

Doctrine or any userland library can utilize the name filter with a parent class to fetch only attributes they are interested in:

namespace Doctrine\Annotations;
 
abstract class Annotation implements \PhpAttribute {}
 
class AnnotationReader
{
    function getClassAnnotations(ReflectionClass $reflection) {
        return array_map(
            fn($attribute) => $attribute->getAsObject(),
            $reflection->getAttributes(Annotation::class)
        );
    }
}

This way Doctrine (or any other userland annotation/attributes library) can also enforce stricter rules for use of the attributes by adding their own logic on top without PHP attributes getting in the way.

A more complex Doctrine ORM use-case could look like this:

namespace App\Entity {
    use Doctrine\ORM\Mapping as ORM;
 
    <<ORM\Entity(["repositoryClass" => UserRepository::class])>>
    <<ORM\Table("users")>>
    class User
    {
        <<ORM\Id>>
        <<ORM\Column>>
        <<ORM\GeneratedValue>>
        public int $id;
 
        <<ORM\Column(["unique" => true])>>
        public string $email;
 
        <<ORM\ManyToOne>>
        public ?Address $address;
    }
}
 
namespace Doctrine\ORM\Mapping {
    use Doctrine\Annotations\Annotation;
 
    class Entity extends Annotation {
        public $repositoryClass;
        public $readOnly;
 
        public function __construct(array $options)
        {
            $this->repositoryClass = $options['repositoryClass'] ?? null;
            $this->readOnly = $options['readOnly'] ?? false;
        }
    }
 
    class Table extends Annotation {
        public $name;
        public $schema;
 
        public function __construct($name, array $options = [])
        {
            $this->name = $name;
            $this->schema = $options['schema'] ?? null;
        }
    }
}
 
$reader = new \Doctrine\Annotations\AnnotationReader();
var_dump($reader->getClassAnnotations(new ReflectionClass(\App\Entity\User::class)));
 
array(2) {
  ["Doctrine\ORM\Mapping\Entity"]=>
  object(Doctrine\ORM\Mapping\Entity)#2 (1) {
    ["repositoryClass"]=>
    string(25) "App\Entity\UserRepository"
  },
  ["Doctrine\ORM\Mapping\Table"]=>
  object(Doctrine\ORM\Mapping\Table)#2 (1) {
    ["name"]=>
    string(5) "users"
  }
}

Migration tools such as Rector can help with userland migrating to attributes.

Criticism and Alternative Approaches

Why not use @ or [] syntax like other languages?

The "<<" and ">>" syntax is used because it is one of the few syntaxes that can still be used at this position in the code and looks fairly natural. We could use other symbols that are not yet used as prefix operators, but realistically only "%" is a contender that doesn't look completely weird. Other candidates included "|", "=" or "/".

Specifically "[]" or "@" are not possible because they conflict with the short array syntax and the error suppression operator, respectively. Note that even something as involved as the following syntax is already valid PHP code right now:

[[@SingleArgument("Hello")]]

It would require looking ahead past a potentially unlimited number of tokens to find out whether it is an array declaration or an attribute. We would end up with a context-sensitive parser, which would be an unacceptable outcome.
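
That the line above already parses today is easy to check. In this small sketch SingleArgument() is a hypothetical function defined only for the demonstration:

```php
<?php
function SingleArgument(string $value): string
{
    return strtoupper($value);
}

// Today this line is an expression statement: a nested array literal whose
// only element is an error-suppressed call to the function above.
[[@SingleArgument("Hello")]];

$value = [[@SingleArgument("Hello")]];
var_dump($value[0][0]); // string(5) "HELLO"
```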

Why not extend Doc Comments?

Attributes are significantly better than docblock comments, which warrants introducing them as a new language construct, for several reasons:

  • Namespacing prevents conflicts between different libraries using the same doc comment tag
  • Checking for attribute existence is an O(1) hash key test compared to unpredictable strstr performance or even parsing the docblock.
  • Autoloading attribute classes ensures the attributes are correctly typed, reducing a major source of bugs in reliance on docblocks at runtime.
  • There is visible demand for something like annotations, based on its common use in so many different tools and communities. However, this will always be a confusing thing for newcomers to see in comments. In addition, the difference between /* and /** is still a very subtle source of bugs.
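
The difference between the two kinds of existence check in the list above can be sketched in userland. The plain array below merely stands in for the engine HashTable keyed by resolved class name, and the "@jitter" tag and My\Retry\Jitter class are made-up examples:

```php
<?php
// Doc-comment approach: a substring search over free-form text.
$docComment = "/**\n * Adds jitter to retry delays.\n * @jitter 50ms\n */";
$looksLikeJit = strpos($docComment, '@jit') !== false;

// Attribute approach, modelled as an array keyed by resolved class name.
$attributes = ['My\\Retry\\Jitter' => ['50ms']];
$hasJit = isset($attributes['Opcache\\Jit']);

var_dump($looksLikeJit); // bool(true): "@jitter" also contains "@jit"
var_dump($hasJit);       // bool(false): exact key lookup, no false positive
```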

While it might be possible to make PHP parse existing doc-comments and keep the information as structured attributes, we would need to invoke an additional parser for each doc-comment; doc-comments may not conform to the context grammar, and we would have to decide what to do with grammar errors; finally, this would be another language inside PHP. This solution would be much more complex than introducing attributes and is not desired.

With attributes as proposed by this RFC, we re-use the existing syntax for expressions and constant expressions. The patch to the core for this functionality is small.

Why not map attributes to simple arrays instead for simplicity?

The previous section already reiterated why resolving attribute names to classes is important. Validating that attributes are correct is one of the primary benefits over the previous approach with doc-comments, where such validation is not possible.

Why not a stricter solution like Doctrine Annotations?

This RFC proposes only base PHP attribute functionality. A general solution for PHP and the wider community must take different use-cases into account, and a full Doctrine-like system is not necessary for many of them, especially the PHP internal use-cases.

One important note: Even if PHP had a very strict attributes support, this would always need to be evaluated at runtime only during access to ReflectionAttribute, because constant ASTs are evaluated at runtime:

  • for constants, when a class is instantiated into an object
  • for attributes, when ReflectionAttribute::getAsObject is called

That means a stricter approach in this RFC doesn't lead to earlier compile time validation.

At that point in the runtime, libraries such as Doctrine can hook on top of attributes and implement the stricter behaviors themselves.

Naming (attributes or annotations)

The name “Attributes” for this feature makes sense to avoid confusion with annotations that are already used. With this distinction, Doctrine Annotations can be implemented with either docblocks (PHP 7) or attributes (PHP 8+).

Backward Incompatible Changes

None

Proposed PHP Version(s)

8.0

RFC Impact

To Core

The requirement to store attributes on every parsing token, AST node, zend_class_entry, zend_class_constant, zend_op_array and zend_property_info adds one additional pointer to each structure, even to those that don't use attributes.

To SAPIs

None

To Existing Extensions

None

Opcache JIT will move to use Opcache\Jit instead of @jit and Opcache\Nojit instead of @nojit attributes, but this is currently an unreleased feature.

To Opcache

Opcache modifications are part of the proposed patch, but they might not be working 100% after the internal changes between the original 7.1 patch and 8.0.

New Constants

None

php.ini Defaults

None

Open Issues

  • Should we allow constant expressions or are basic type declarations (bool, int, string, array, float) already enough for arguments of attributes?
  • Is the naming of ReflectionAttribute methods ok?
  • Should attribute classes be required to implement an interface PhpAttribute? Or the built-in attribute <<\PhpAttribute>>? Or no requirement?
  • Should attributes already provide a way to specify which declarations they target and throw errors on wrong placement? Example: <<Target(["class", "function", "property"])>>
  • When to autoload the attributes and validate the constructor? Immediately at compile time (possible?) or only when Reflection::getAttributes() is called?
  • Should there be an alternative “short” syntax to declare attributes in one enclosing? <<SingleArgument("foo"), MultiArgument("bar", "baz")>>

Future Scope

  • Attributes provide the opportunity to augment existing functionality with new behavior without breaking backwards compatibility. One example is the introduction of a "<<Rewindable>>" attribute that could be used to signal that a generator function creates a rewindable iterator.
  • Other languages such as Go have simple but powerful serialization from XML/JSON to objects and back. The combination of typed properties and attributes puts this in reach for core or a PHP extension to implement.
  • In the absence of generics, attributes could provide an extension-based type or assertion system to make runtime assertions on complex/array types that could be disabled in production (similar to asserts and how Hack works)

Proposed Voting Choices

  • Accept PHP Attributes v2 into core? 2/3 majority + 1

Patches and Tests

References

Changelog

0.2:

  • Added new, hopefully more simple userland example
  • Changed Php\Attribute to PhpAttribute because the PHP namespace has not been reserved for PHP and is currently “empty”
  • Clarify necessary order of docblocks, attributes and function declarations
  • Clarify no conflict with potential generics syntax
rfc/attributes_v2.txt · Last modified: 2020/03/11 17:49 by beberlei