PHP RFC: Attributes v2
- Version: 0.1
- Date: 2020-03-09
- Author: Benjamin Eberlei, beberlei@php.net
- Status: Under Discussion
- First Published at: http://wiki.php.net/rfc/attributes_v2
Large credit for this RFC goes to Dmitry Stogov whose previous work on attributes is the foundation for this RFC and patch.
Introduction
This RFC proposes to add a form of structured, syntactic metadata to declarations of classes, properties, functions, methods and constants.
Similar concepts exist in other languages: Annotations in Java; Attributes in C#, C++, Rust and Hack; Decorators in Python and JavaScript.
PHP only offers an unstructured form of such metadata: doc-comments. These are just strings, so various PHP sub-communities invented an @-based pseudo-language inside them to carry structured information.
Attributes would provide structured meta-information as a native language feature that is parsed, validated and can be accessed through APIs.
I primarily want to show that attributes provide benefits to the PHP core and extensions, even though userland developers have also expressed a desire for this feature.
Attributes can help move the language forward: they provide a generic way to selectively modify functionality that otherwise has to be kept in its original form for backwards compatibility reasons, and to experiment with these changes in extensions before moving them to core.
In addition, the widespread use of userland doc-comment parsing shows that this feature is in high demand by the community. As a first-class citizen in PHP, attributes could:
- reduce confusion about code in comments for beginners
- allow language level validation at compile time
- reduce bugs caused by typos and the difference between "/*" and "/**"
- help developers migrate configuration (XML, YAML, JSON, INI) into the code, thereby reducing complexity
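The "/*" versus "/**" pitfall from the list above can be reproduced with the existing Reflection API. The following is a minimal sketch: `withDocblock()` and `withPlainComment()` are hypothetical functions that carry the same `@jit` tag, and only the comment opener differs.

```php
<?php
// Hypothetical functions for illustration; only the comment opener differs.

/** @jit */
function withDocblock() {}

/* @jit */
function withPlainComment() {}

// getDocComment() returns the comment text for "/**" and false for "/*".
$docblock = (new ReflectionFunction('withDocblock'))->getDocComment();
$plain    = (new ReflectionFunction('withPlainComment'))->getDocComment();

var_dump($docblock); // the docblock text
var_dump($plain);    // false: the tag is silently lost
```

A tool that reads `@jit` from doc-comments would see the tag on the first function only, without any error for the second.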
Proposal
Attribute Syntax
Attributes are specially formatted text enclosed in "<<" and ">>". Attributes may be applied to functions, classes, interfaces, traits, methods, properties and class constants. In the same way as doc-comments, attributes must be placed before the corresponding definition.
    <<...>>
    <<...>>
    function foo() {}
See discussion about alternative syntaxes below.
Each declaration of function, class, method, property or class constant may have one or more attributes. Each attribute may have no value associated with it, a single value, or multiple values.
See the EBNF grammar:
    <attribute> ::= "<<" <class_name_reference> [ "(" <argument> { "," <argument> } ")" ] ">>".
    <class_name_reference> ::= STRING.
    <argument> ::= <php-constant> | <php-expression>.
Here is an example:
    <<WithoutArgument>>
    <<SingleArgument(0)>>
    <<FewArguments('Hello', 'World')>>
    function foo() {}
It is not allowed to use the same attribute name more than once on the same declaration, and a compile error is thrown when this is detected.
Attribute Names Resolve to Classes
The name of an attribute is resolved against the currently active namespace import scope during compilation. The resolved class names are then autoloaded to make sure they exist.
This is done to ensure that developers don't introduce subtle bugs by mistyping attribute names, which would cause the desired behavior of the attribute to be silently ignored.
    use My\Attributes\SingleArgument;
    use My\Attributes\Another;

    <<SingleArgument("Hello")>>
    <<Another\SingleArgument("World")>>
    <<\My\Attributes\FewArguments("foo", "bar")>>
    function foo() {}

    // sample attribute class:
    namespace My\Attributes;

    class SingleArgument implements \Php\Attribute {
        public $argumentValue;

        public function __construct($argumentValue) {
            $this->argumentValue = $argumentValue;
        }
    }
Consistent with PHP expressions in general, no validation is performed at declaration time on whether the provided attribute arguments fulfill the contract of the attribute class constructor. This happens only when accessing attributes as objects in the Reflection API (below).
Thanks to class name resolving, IDEs or static analysis tools can perform this validation for the developer.
Attribute classes must implement the marker interface Php\Attribute and an error is thrown if they don't. This is again done to reduce the potential for developers to make mistakes when using attributes.
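The two checks described in this section (the name must resolve to an existing class, and that class must carry the marker interface) can be sketched in userland PHP. This is an illustration under assumptions, not the engine's implementation: `AttributeMarker` stands in for the proposed `Php\Attribute`, and `assertIsAttributeClass()` is a hypothetical helper.

```php
<?php
// Stand-in for the proposed Php\Attribute marker interface (assumption:
// the real interface would be provided by the engine, not by userland).
interface AttributeMarker {}

final class Email implements AttributeMarker {}
final class NotAnAttribute {}

/**
 * Hypothetical helper: rejects names that do not resolve to a class
 * implementing the marker interface.
 */
function assertIsAttributeClass(string $name): void
{
    if (!class_exists($name)) {
        throw new LogicException("Attribute class '$name' not found");
    }
    if (!is_subclass_of($name, AttributeMarker::class)) {
        throw new LogicException("'$name' does not implement the marker interface");
    }
}

assertIsAttributeClass(Email::class); // passes silently

try {
    assertIsAttributeClass(NotAnAttribute::class);
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}
```

A mistyped name such as `Emial` would fail the first check instead of being silently ignored, which is the failure mode the section aims to rule out.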
Constant Expressions in Attribute Arguments
Attribute arguments are evaluated as constant AST expressions. This means that a subset of PHP expressions is allowed as arguments:
    use My\Attributes\Addition;
    use My\Attributes\ClassConst;

    <<Addition(1+1)>>
    <<ClassConst(PDO::class)>>
    function foo() {}
The primary reason constant AST is allowed is the ability to reference (class) constants. Referencing constants is desirable because it avoids duplicating information in attributes that already exists as a constant. Another benefit is the potential for tools and IDEs to statically verify attributes.
The constant AST is resolved to a value when accessing attributes with the Reflection API.
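Class constants in current PHP already behave this way, which makes them a convenient analogy for the deferred resolution described here. The sketch below uses no attribute syntax; `Limits` and `MAX_ITEMS` are hypothetical names chosen for illustration.

```php
<?php
// Existing PHP behaviour used as an analogy; no attribute syntax involved.
class Limits
{
    // MAX_ITEMS is not defined yet when this class is compiled,
    // so the expression is stored unevaluated.
    const DOUBLE_MAX = MAX_ITEMS * 2;
}

define('MAX_ITEMS', 10);

// The expression is evaluated on first access, after define() has run.
$value = Limits::DOUBLE_MAX;
var_dump($value); // int(20)
```

Attribute arguments would follow the same pattern, with access through the Reflection API taking the place of the constant fetch.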
Reflection
The following Reflection classes are extended with a getAttributes() method that returns an array of ReflectionAttribute instances.
    function ReflectionFunction::getAttributes(string $name = null): ReflectionAttribute[];
    function ReflectionClass::getAttributes(string $name = null): ReflectionAttribute[];
    function ReflectionProperty::getAttributes(string $name = null): ReflectionAttribute[];
    function ReflectionClassConstant::getAttributes(string $name = null): ReflectionAttribute[];
The name argument can be used to retrieve only the attribute(s) of the given attribute name or subclasses of the given name.
$attributes = $reflectionFunction->getAttributes(\My\Attributes\SingleArgument::class);
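The "given name or subclasses of the given name" rule can be sketched in userland with `is_a()`. This only illustrates the filter semantics on a list of class names; `filterAttributeNames()` and the attribute classes are hypothetical, and the engine's implementation may differ.

```php
<?php
// Hypothetical attribute classes for illustration.
abstract class ValidationAttribute {}
final class NotEmpty extends ValidationAttribute {}
final class MaxLength extends ValidationAttribute {}
final class Jit {}

/**
 * Sketch of the filter semantics: keep names equal to $filter
 * or naming a subclass of it. Not the engine's implementation.
 *
 * @param string[] $attributeNames resolved attribute class names
 * @return string[]
 */
function filterAttributeNames(array $attributeNames, ?string $filter = null): array
{
    if ($filter === null) {
        return $attributeNames;
    }

    return array_values(array_filter(
        $attributeNames,
        function (string $name) use ($filter): bool {
            // Third argument true lets is_a() work on class-name strings.
            return is_a($name, $filter, true);
        }
    ));
}

$declared = [NotEmpty::class, Jit::class, MaxLength::class];

print_r(filterAttributeNames($declared, ValidationAttribute::class)); // NotEmpty, MaxLength
print_r(filterAttributeNames($declared, Jit::class));                 // Jit
```

Filtering by a shared parent class is what lets a library fetch only its own attributes and ignore everything else on the declaration.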
The API of the new ReflectionAttribute looks like this:
    class ReflectionAttribute {
        public function getName(): string
        public function getArguments(): array
        public function getAsObject(): object
    }
This proposed API aims to be a compromise between previously conflicting opinions: one faction preferred access to plain arrays only, building its solutions on top in userland, while the other preferred PHP to convert attributes to object instances itself.
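To make the compromise concrete, here is a userland stand-in that mimics the three proposed methods. It is a sketch under assumptions: `AttributeStandIn` and `MaxLength` are hypothetical, and the real ReflectionAttribute would be implemented in the engine. The point is that array access never touches the attribute class, while object access is where the constructor contract is first checked.

```php
<?php
// Hypothetical attribute class for illustration.
final class MaxLength
{
    public $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }
}

/**
 * Userland stand-in that mimics the three proposed methods.
 * A sketch under assumptions, not the engine's ReflectionAttribute.
 */
final class AttributeStandIn
{
    private $name;
    private $arguments;

    public function __construct(string $name, array $arguments)
    {
        $this->name = $name;
        $this->arguments = $arguments;
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getArguments(): array
    {
        return $this->arguments;
    }

    public function getAsObject(): object
    {
        // The constructor contract is only checked here.
        return new $this->name(...$this->arguments);
    }
}

$ok = new AttributeStandIn(MaxLength::class, [255]);
var_dump($ok->getArguments());       // array access, no class instantiated
var_dump($ok->getAsObject()->value); // object access

$broken = new AttributeStandIn(MaxLength::class, []); // no error yet
try {
    $broken->getAsObject();
} catch (ArgumentCountError $e) {
    echo "constructor contract violated on object access\n";
}
```

Consumers that only need names and raw arguments pay no instantiation cost, while those that want typed objects opt in per attribute.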
Use Cases for PHP Extensions
One major use case for attributes will be PHP core and extensions.
HashTables with declared Attributes are available on every zend_class_entry, op_array, zend_property_info and zend_class_constant.
PHP Core or extensions will want to check if certain declarations have an attribute or not.
One such example is the check for “@jit” already in Opcache JIT that can be changed to:
    static int zend_needs_manual_jit(const zend_op_array *op_array)
    {
        /* O(1) lookup of the resolved attribute class name */
        return op_array->attributes &&
            zend_hash_str_exists(op_array->attributes, "Opcache\\Jit", sizeof("Opcache\\Jit")-1);
    }
In userland the code for this functionality becomes:
    use Opcache\Jit;

    <<Jit>>
    function foo() {}
Other potential core and extensions use cases/ideas
Previous RFCs (See References below) focussed on benefits for userland. But foremost there are significant benefits to be had for core and extensions if we had attributes.
Below is a list of ideas. Please note that these are not part of this RFC.
Structured deprecation of functions/methods. Almost all languages with attributes have this built in as well. One benefit of having this in PHP would be that it could allow deprecating classes, properties or constants, where trigger_error cannot be used by developers at the moment.
    use Php\Attributes\Deprecated;

    <<Deprecated("Use bar() instead")>>
    function foo() {}
Modification of selected “legacy” behavior of PHP, in reference to the Reclassify Engine Warnings RFC and Support Rewinding Generators. Rust has a similar set of attributes. This could also be used to augment the “Editions” proposal with a gradual path to more consistency.
    // an idea, not part of the RFC
    use Php\Attributes\Deny;
    use Php\Attributes\Allow;

    <<Deny("undeclared_variables")>>
    function foo() {
    }

    <<Allow("rewind_generator")>>
    function bar() {
        yield 1;
    }

    <<Deny("dynamic_properties")>>
    class Foo {
    }
Some (limited) form of macros similar to Rust could be useful to include polyfill functions only in lower versions of PHP. This helps to not break Opcache with conditional declarations of functions and classes:
    // an idea, not part of the RFC
    use Php\Attributes\ConditionalDeclare;
    use Php\Attributes\IgnoreRedeclaration;

    <<ConditionalDeclare(PHP_VERSION_ID < 70000)>> // gets removed from AST when >= 7.0
    <<IgnoreRedeclaration>> // throws no error when already declared, removes the redeclared thing
    function intdiv(int $numerator, int $divisor) {
    }
Enabling more aggressive Opcache or JIT optimizations for individual functions:
    // an idea, not part of the RFC
    use Opcache\OptimizationLevel;
    use Opcache\Inline;

    <<OptimizationLevel(3)>>
    <<Inline>>
    function foo() {}
A ZEND_API to provide the arguments of a single attribute or a list of all attributes will be part of the final patch so that extension authors can utilize attributes with as little effort as possible.
This API is a draft for now:
    /* Retrieve attribute arguments by attribute name */
    HashTable *zend_attribute_get(HashTable *attributes, char *name, size_t name_len);

    /* Retrieve all attribute arguments indexed by attribute name */
    zval *zend_attribute_all(HashTable *attributes, char *name, size_t name_len);
Userland Use-Case: Declaring Validation Rules on Objects
In userland, attributes provide the benefit of keeping a declaration and its additional configuration directly next to each other.
As an example, I show the potential of a validation library using attributes:
    use Validator\Attributes\Email;
    use Validator\Attributes\NotEmpty;
    use Validator\Attributes\Maxlength;

    class User {
        <<NotEmpty>>
        <<Maxlength(255)>>
        public $username;

        <<Email>>
        public $email;
    }

    class Validator {
        public function validateObject(object $object) {
            $reflection = new ReflectionObject($object);

            foreach ($reflection->getProperties() as $property) {
                $this->validateProperty($object, $property);
            }
        }

        private function validateProperty($object, ReflectionProperty $property) {
            $value = $property->getValue($object);
            // fetch only attributes that extend the library's base attribute class
            $validationAttributes = $property->getAttributes(\Validator\Attributes\Attribute::class);

            foreach ($validationAttributes as $validationAttribute) {
                switch ($validationAttribute->getName()) {
                    case Email::class:
                        // validate $value as email
                        break;
                    case NotEmpty::class:
                        if (empty($value)) {
                            throw new ValidationException();
                        }
                        break;
                    case Maxlength::class:
                        $maxlength = $validationAttribute->getAsObject();
                        if (strlen($value) > $maxlength->value) {
                            throw new ValidationException();
                        }
                        break;
                }
            }
        }
    }

    $user = new User();
    $user->username = $request->get('username');
    $user->email = $request->get('email');

    (new Validator())->validateObject($user);
Userland Use-Case: Migrating Doctrine Annotations from Docblocks to Attributes
One of the major cases to consider for any attributes/annotations RFC is the potential migration of the widespread Doctrine Annotations library towards a possible attributes syntax.
PHP core's support for attributes should provide a foundation that lets userland migrate from docblocks to attributes.
The primary behavior in this RFC that attempts this balancing act is the requirement for namespaced attribute names.
Doctrine or any userland library can utilize the name filter with a parent class to fetch only attributes they are interested in:
    namespace Doctrine\Annotations;

    abstract class Annotation {}

    class AnnotationReader {
        function getClassAnnotations(ReflectionClass $reflection) {
            return array_map(
                fn($attribute) => $attribute->getAsObject(),
                $reflection->getAttributes(Annotation::class)
            );
        }
    }
This way Doctrine (or any other userland annotation/attributes library) can also enforce stricter rules for use of the attributes by adding their own logic on top without PHP attributes getting in the way.
A more complex Doctrine ORM use-case could look like this:
    namespace App\Entity {
        use Doctrine\ORM\Mapping as ORM;

        <<ORM\Entity(["repositoryClass" => UserRepository::class])>>
        <<ORM\Table("users")>>
        class User {
            <<ORM\Id>>
            <<ORM\Column>>
            <<ORM\GeneratedValue>>
            public int $id;

            <<ORM\Column(["unique" => true])>>
            public string $email;

            <<ORM\ManyToOne()>>
            public ?Address $address;
        }
    }

    namespace Doctrine\ORM\Mapping {
        use Doctrine\Annotations\Annotation;

        class Entity extends Annotation {
            public $repositoryClass;
            public $readOnly;

            public function __construct(array $options) {
                $this->repositoryClass = $options['repositoryClass'] ?? null;
                $this->readOnly = $options['readOnly'] ?? false;
            }
        }

        class Table extends Annotation {
            public $name;
            public $schema;

            // $options defaults to [] so that <<ORM\Table("users")>> satisfies the constructor
            public function __construct($name, array $options = []) {
                $this->name = $name;
                $this->schema = $options['schema'] ?? null;
            }
        }
    }

    $reader = new \Doctrine\Annotations\AnnotationReader();
    var_dump($reader->getClassAnnotations(new ReflectionClass(\App\Entity\User::class)));

    array(2) {
      ["Doctrine\ORM\Mapping\Entity"]=>
      object(Doctrine\ORM\Mapping\Entity)#2 (1) {
        ["repositoryClass"]=>
        string(25) "App\Entity\UserRepository"
      },
      ["Doctrine\ORM\Mapping\Table"]=>
      object(Doctrine\ORM\Mapping\Table)#2 (1) {
        ["name"]=>
        string(5) "users"
      }
    }
Migration tools such as Rector can help userland code migrate to attributes.
Criticism and Alternative Approaches
Why not use @ or [] syntax like everyone else?
The "<<" and ">>" syntax is used because it is one of the few syntaxes that can still be used at this place in the code and that looks fairly natural. We could use other symbols that are not yet used as prefix operators, but realistically only "%" is a contender that doesn't look completely weird. Others include "|", "=" or "/".
Specifically, "[]" and "@" are not possible because they conflict with the short array syntax and the error suppression operator. Note that even something as involved as the following syntax is already valid PHP code right now:
[[@SingleArgument("Hello")]]
It would require looking ahead past a potentially unlimited number of tokens to find out whether it's an array declaration or an attribute. We would end up with a context-sensitive parser, which would be an unacceptable outcome.
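That the bracket-and-@ form already has a meaning can be checked in current PHP. In the sketch below, `SingleArgument()` is a hypothetical function defined only so that the expression can be evaluated.

```php
<?php
// Hypothetical function; any callable name would do.
function SingleArgument(string $text): string
{
    return strtoupper($text);
}

// Parsed today as a nested array literal whose single element is an
// error-suppressed function call.
$value = [[@SingleArgument("Hello")]];

var_dump($value[0][0]); // string(5) "HELLO"
```

Because the parser already accepts this as an expression, reusing the same tokens for attributes would require it to distinguish the two meanings by context.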
We can do everything with Doc Comments
Attributes are significantly better than docblock comments, enough to warrant introducing them as a new language construct, for several reasons:
- Namespacing prevents conflicts between different libraries using the same doc comment tag
- Checking for attribute existence is an O(1) hash key test compared to unpredictable strstr performance or even parsing the docblock.
- Autoloading attribute classes ensures the attributes are correctly typed, reducing a major source of bugs in reliance on docblocks at runtime.
- There is visible demand for something like annotations based on its common use in so many different tools and communities. However this will always be a confusing thing for newcomers to see in comments. In addition the difference between /* and /** is still a very subtle source of bugs.
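The difference between a hash key test and a substring search can be sketched in a few lines. The data here is hypothetical: `$attributes` models attributes keyed by resolved class name, and `Acme\Jitter` is an invented class whose docblock tag happens to start with `@jit`.

```php
<?php
// Hypothetical data: attributes keyed by resolved class name, next to a
// docblock that carries the equivalent tag for the same declaration.
$attributes = ['Acme\\Jitter' => [0.5]];
$docComment = "/** @jitter 0.5 */";

// Attribute check: a single hash-key test on the exact, namespaced name.
$hasJitAttribute = isset($attributes['Opcache\\Jit']);

// Docblock check: a substring scan that also matches "@jitter".
$hasJitTag = strpos($docComment, '@jit') !== false;

var_dump($hasJitAttribute); // bool(false)
var_dump($hasJitTag);       // bool(true), a false positive
```

Besides the lookup cost, the exact namespaced key avoids the accidental matches that a plain substring search on a tag prefix can produce.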
It might be possible to make PHP parse existing doc-comments and keep the information as structured attributes, but we would need to invoke an additional parser for each doc-comment; a doc-comment may not conform to the context grammar, so we would have to decide what to do with grammar errors; and finally this would be another language inside PHP. This solution would be much more complex than introducing attributes and is not desired.
With attributes as proposed by this RFC, we re-use the existing syntax for expressions and constant expressions. The patch to the core for this functionality is small.
Why not map attributes to simple arrays instead for simplicity?
The previous section already reiterated why resolving attribute names to classes is important. Validation that attributes are correct is one of the primary benefits over the previous approach with doc-comments, where such validation is not possible.
We want a stricter solution like Doctrine Annotations
This RFC proposes only base PHP attribute functionality. A general solution for PHP and the wider community must take different use-cases into account, and a full Doctrine-like system is not necessary for a lot of use-cases, especially the PHP internal use-cases.
One important note: even if PHP had very strict attribute support, attributes would still only be evaluated at runtime, during access to ReflectionAttribute, because constant ASTs are evaluated at runtime:
- for constants, when a class is instantiated into an object
- for attributes, when ReflectionAttribute::getAsObject is called
That means a stricter approach in this RFC doesn't lead to earlier compile time validation.
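Class constants in current PHP show the same timing for failures. In the sketch below, `MissingDefaults` is a deliberately undefined, hypothetical class: the declaration that references it compiles without complaint, and the problem only surfaces when the constant is read.

```php
<?php
// Existing PHP behaviour used as an analogy; MissingDefaults is a
// deliberately undefined, hypothetical class.
class Settings
{
    const TIMEOUT = MissingDefaults::TIMEOUT;
}

echo "class declared without error\n";

$error = null;
try {
    $timeout = Settings::TIMEOUT; // the constant expression is evaluated here
} catch (Error $e) {
    $error = $e;
}

var_dump($error instanceof Error); // bool(true)
```

An invalid attribute argument would behave the same way under this proposal, staying dormant until the attribute is accessed through Reflection.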
At that point in the runtime, libraries such as Doctrine can hook on top of attributes and implement the stricter behaviors themselves.
Naming (attributes or annotations)
The name “Attributes” makes sense for this feature because it avoids confusion with the annotations that are already in use. With this distinction, Doctrine Annotations can be implemented with either docblocks (PHP 7) or attributes (PHP 8+).
Backward Incompatible Changes
None
Proposed PHP Version(s)
8.0
RFC Impact
To Core
The requirement to store attributes on every parsing token, AST node, zend_class_entry, zend_class_constant, zend_op_array and zend_property_info adds one additional pointer to each structure, even to those that don't use attributes.
To SAPIs
None
To Existing Extensions
None
Opcache JIT will move to use Opcache\Jit instead of @jit and Opcache\Nojit instead of @nojit attributes, but this is currently an unreleased feature.
To Opcache
Opcache modifications are part of the proposed patch, but they might not work 100% after the internal changes between the original 7.1 patch and 8.0.
New Constants
None
php.ini Defaults
None
Open Issues
- Should we allow constant expressions or are basic type declarations (bool, int, string, array, float) already enough for arguments of attributes?
- Is the naming of ReflectionAttribute methods ok?
- Should attribute classes be required to implement an interface Php\Attribute? Or carry the built-in attribute <<\Php\Attribute>>? Or no requirement?
Future Scope
- Attributes provide the opportunity to augment existing functionality with new behavior without breaking backwards compatibility. One example is introduction of a “<<Rewindable>>” attribute that could be used to signal that a generator function creates a rewindable iterator.
- Other languages such as Go have simple but powerful serialization from XML/JSON to objects and back. The combination of typed properties and attributes puts this in reach for core or a PHP extension to implement.
- In the absence of generics, attributes could provide an extension-based type or assertion system to make runtime assertions on complex/array types that could be disabled in production (similar to asserts and how Hack works)
Proposed Voting Choices
- Accept PHP Attributes v2 into core? 2/3 majority + 1