====== PHP RFC: Attributes v2 ======
* Version: 0.5
* Date: 2020-03-09
* Author: Benjamin Eberlei (beberlei@php.net), Martin Schröder
* Status: Implemented
* Target: 8.0
* First Published at: http://wiki.php.net/rfc/attributes_v2
* Implementation: https://github.com/php/php-src/pull/5394
Large credit for this RFC goes to Dmitry Stogov whose previous work on
attributes is the foundation for this RFC and patch.
===== Introduction =====
This RFC proposes Attributes as a form of structured, syntactic metadata to
declarations of classes, properties, functions, methods, parameters and
constants. Attributes allow configuration directives to be embedded directly
in the declaration of the code they apply to.
Similar concepts exist in other languages: **Annotations** in Java,
**Attributes** in C#, C++, Rust and Hack, and **Decorators** in Python and JavaScript.
So far PHP only offers an unstructured form of such metadata: doc-comments.
Because doc-comments are just strings, various PHP sub-communities invented
an @-based pseudo-language inside them to keep structured information.
On top of userland use-cases there are many use-cases for attributes in the
engine and extensions that could affect compilation, diagnostics,
code-generation, runtime behavior and more. Examples are given below.
The widespread use of userland doc-comment parsing shows that the community
has a strong demand for this feature.
===== Proposal =====
==== Attribute Syntax ====
Attributes are specially formatted text enclosed in "<<" and ">>", reusing the existing tokens T_SL and T_SR.
Attributes may be applied to many things in the language:
* functions (including closures and short closures)
* classes (including anonymous classes), interfaces, traits
* class constants
* class properties
* class methods
* function/method parameters
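Examples:
<<ExampleAttribute>>
class Foo
{
    <<ExampleAttribute>>
    public const FOO = 'foo';
    <<ExampleAttribute>>
    public $x;
    <<ExampleAttribute>>
    public function foo(<<ExampleAttribute>> $bar) { }
}
$object = new <<ExampleAttribute>> class () { };
<<ExampleAttribute>>
function f1() { }
$f2 = <<ExampleAttribute>> function () { };
$f3 = <<ExampleAttribute>> fn () => 1;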
Attributes are added before the declaration they belong to, similar to doc-block comments. They can be declared **before** or **after** a doc-block comment that documents a declaration.
<<ExampleAttribute>>
/** docblock */
<<AnotherExampleAttribute>>
function foo() {}
Each declaration of function, class, method, property, parameter or class
constant may have one or more attributes.
Each attribute may have values associated with it, but doesn't have to, similar
to how a constructor of a class works.
<<WithoutArgument>>
<<SingleArgument(0)>>
<<FewArguments('Hello', 'World')>>
function foo() {}
The same attribute name can be used more than once on the same declaration.
Attributes can also be declared on the same line:
<<WithoutArgument>><<SingleArgument(0)>><<FewArguments('Hello', 'World')>>
function foo() {}
Semantically the attribute declaration should be read as instantiating a class
with the attribute name and passing arguments to the constructor.
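The constructor-call reading can be sketched as follows. This is only an illustration, written with the //#[...]// syntax and the //Attribute// marker class of released PHP 8.0 so that it runs (this RFC spells them //<<...>>// and //PhpAttribute//). The //FewArguments// class and its property names are assumptions made for the example.

```php
<?php
// Sketch only. Uses the #[...] syntax and Attribute marker class of released
// PHP 8.0 so that it runs; the RFC text spells these <<...>> and PhpAttribute.

#[Attribute]
final class FewArguments
{
    // Hypothetical attribute class: two constructor arguments, stored as-is.
    public function __construct(
        public string $first,
        public string $second
    ) {
    }
}

#[FewArguments('Hello', 'World')]
function foo(): void
{
}

$attribute = (new ReflectionFunction('foo'))->getAttributes()[0];

// newInstance() passes the attribute arguments to the constructor ...
$fromAttribute = $attribute->newInstance();

// ... so it is equivalent to instantiating the class by hand.
$byHand = new FewArguments('Hello', 'World');

var_dump($fromAttribute == $byHand); // bool(true)
```

The loose comparison is intentional: two distinct objects of the same class compare equal when all of their properties are equal.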
Note: Because "<<" and ">>" are used here in an expression prefix
position, they do not conflict with a potential generics proposal,
where "<T>" is the syntax commonly used in other languages.
Since syntax is by far the most discussed point about this RFC, we also thought
of an alternative by introducing a new token for attributes (T_ATTRIBUTE)
defined as //@:// that the parser could look for. Choice of syntax will be a secondary
vote on the RFC.
@:WithoutArgument
@:SingleArgument(0)
@:FewArguments('Hello', 'World')
function foo() {}
The "Smiley" syntax uses the shorter, more familiar "at" symbol commonly seen in docblocks. The downside is that whitespace is not permitted in attribute names, so that the end of the declaration can be detected.
See the discussion of alternative syntaxes below for more information on why the most
requested syntaxes, "@" and "[]", are not possible.
==== Attribute Names Resolve to Classes ====
The name of an attribute is resolved against all currently imported symbols
during compilation. This is done to namespace attributes and avoid accidental re-use of the
same attribute name by different libraries and applications.
use My\Attributes\SingleArgument;
use My\Attributes\Another;
<<SingleArgument("Hello")>>
<<Another\SingleArgument("World")>>
<<\My\Attributes\FewArguments("foo", "bar")>>
function foo() {}
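How the three spellings resolve can be checked through Reflection. The sketch below is an illustration under assumptions: it uses the //#[...]// syntax of released PHP 8.0 instead of //<<...>>// so that it runs, the //App// namespace is made up, and none of the attribute classes are declared, which is fine because only names are read.

```php
<?php
// Sketch only: #[...] is the PHP 8.0 release syntax, not the <<...>> of this RFC.
namespace App;

use My\Attributes\SingleArgument;
use My\Attributes\Another;

#[SingleArgument('Hello')]                    // resolved through the class import
#[Another\SingleArgument('World')]            // resolved through the namespace import
#[\My\Attributes\FewArguments('foo', 'bar')]  // already fully qualified
function foo(): void
{
}

// Collect the resolved attribute names; no attribute class is loaded for this.
$names = array_map(
    fn (\ReflectionAttribute $attribute): string => $attribute->getName(),
    (new \ReflectionFunction('App\foo'))->getAttributes()
);

print_r($names);
// My\Attributes\SingleArgument
// My\Attributes\Another\SingleArgument
// My\Attributes\FewArguments
```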
There are also benefits to declaring this attribute class in code:
* Reflection API can directly convert an attribute to an instance of this class (see "Reflection" section below)
* Static analysis tools can verify attributes are correctly used in your code
* IDEs can add support to autocomplete attributes and their arguments
Declaring an attribute class for the above example looks like this:
namespace My\Attributes;
use PhpAttribute;
<<PhpAttribute>>
class SingleArgument
{
    public $value;
    public function __construct(string $value)
    {
        $this->value = $value;
    }
}
==== Compiler and Userland Attributes ====
This proposal differentiates between two different kinds of attributes:
* Compiler Attributes (validated at compile time)
* Userland Attributes (validated during Reflection API access)
A compiler attribute is an internal class that is attributed with the //PhpCompilerAttribute// attribute.
A userland attribute is a userland class that is attributed with the //PhpAttribute// attribute.
When a compiler attribute is found during compile time then the engine invokes
a validation callback that is registered for every compiler attribute.
For example the patch includes a validation callback for //PhpCompilerAttribute// that prevents
its use by userland classes:
#include "zend_attributes.h"
void zend_attribute_validate_phpcompilerattribute(zval *attribute, int target)
{
    if (target != ZEND_ATTRIBUTE_TARGET_CLASS) {
        zend_error(E_COMPILE_ERROR, "The PhpCompilerAttribute can only be used on class declarations and only on internal classes");
    } else {
        zend_error(E_COMPILE_ERROR, "The PhpCompilerAttribute can only be used by internal classes, use PhpAttribute instead");
    }
}
INIT_CLASS_ENTRY(ce, "PhpCompilerAttribute", NULL);
zend_ce_php_compiler_attribute = zend_register_internal_class(&ce);
zend_compiler_attribute_register(zend_ce_php_compiler_attribute, zend_attribute_validate_phpcompilerattribute);
The attribute zval contains all arguments passed, and target is a constant that allows
validating that the attribute is on the right declaration.
Userland classes cannot use the //PhpCompilerAttribute//. An error is thrown if this happens.
<<PhpCompilerAttribute>>
class MyAttribute
{
}
// Fatal error: The PhpCompilerAttribute can only be used by internal classes, use PhpAttribute instead
By mapping attributes to classes, tools, editors and IDEs can provide both
syntactical and contextual information about the use of attributes to
developers.
The downside of this approach is that mistyped compiler attributes get
classified as userland attributes.
==== Constant Expressions in Attribute Arguments ====
Attribute arguments are evaluated as constant AST expressions. This means that
a subset of PHP expressions is allowed as arguments:
<<SingleArgument(1+1)>>
<<FewArguments(PDO::class, PHP_VERSION_ID)>>
The primary reason for allowing constant AST is the ability to reference
(class) constants. Referencing constants is desirable because it avoids
duplicating information in attributes that already exists as a constant.
Another benefit is the potential for static verification by tools and IDEs to
validate attributes.
The constant AST is resolved to a value when accessing attributes with the Reflection API.
**Note:** This is intentionally different from the previous Attributes RFC where an object
with ast\node was returned.
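A minimal sketch of this behavior, assuming the //#[...]// syntax and //Attribute// marker class of released PHP 8.0 (this RFC spells them //<<...>>// and //PhpAttribute//). The //Route// and //Paths// classes are hypothetical and exist only for the illustration.

```php
<?php
// Sketch only: #[...] and Attribute are the PHP 8.0 release spellings.

#[Attribute]
final class Route
{
    // Hypothetical attribute class used only for this illustration.
    public function __construct(public string $path, public int $priority)
    {
    }
}

final class Paths
{
    public const PREFIX = '/api';
}

// Both arguments are constant expressions: a class constant concatenated
// with a string literal, and a simple arithmetic expression.
#[Route(Paths::PREFIX . '/users', 1 + 1)]
function listUsers(): void
{
}

$attribute = (new ReflectionFunction('listUsers'))->getAttributes(Route::class)[0];

// The expressions are resolved to plain values on access through Reflection.
$arguments = $attribute->getArguments();
var_dump($arguments === ['/api/users', 2]); // bool(true)
```

Because the arguments arrive as ordinary values, code that reads attributes does not need to know anything about AST nodes.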
The parser understands the context to differentiate attributes from bitshifts in constant ASTs.
<<BitShiftExample(4 >> 1, 4 << 1)>>
function foo() {}
==== Reflection ====
The following Reflection classes are extended with a getAttributes() method
that returns an array of ReflectionAttribute instances.
function ReflectionFunction::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionClass::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionProperty::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionClassConstant::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
The name argument can be used to retrieve only the attribute(s) with the given
attribute name or, depending on the flags argument, subclasses of the given name.
$attributes = $reflectionFunction->getAttributes(\My\Attributes\SingleArgument::class);
When the flags parameter is not set, getAttributes() returns
only those attributes whose name exactly matches the first argument.
When you pass the constant //ReflectionAttribute::IS_INSTANCEOF//, it returns
all attributes that pass an instanceof check with the passed class name.
$attributes = $reflectionFunction->getAttributes(
\My\Attributes\MyAbstractAttribute::class,
\ReflectionAttribute::IS_INSTANCEOF
);
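The difference between the two filter modes can be illustrated with a small sketch. It assumes the //#[...]// syntax and //Attribute// marker class of released PHP 8.0 so that it runs. The //Constraint//, //NotBlank//, //Label// and //Form// classes are hypothetical.

```php
<?php
// Sketch only: #[...] and Attribute are the PHP 8.0 release spellings.

// Hypothetical base class that a validation library might use as a filter.
abstract class Constraint
{
}

#[Attribute]
final class NotBlank extends Constraint
{
}

#[Attribute]
final class Label
{
    public function __construct(public string $text)
    {
    }
}

final class Form
{
    #[NotBlank]
    #[Label('Email address')]
    public string $email = '';
}

$property = new ReflectionProperty(Form::class, 'email');

// Default mode: exact name match only, so the subclass is not returned.
$exact = $property->getAttributes(Constraint::class);

// IS_INSTANCEOF mode: NotBlank passes the instanceof check, Label does not.
$subclasses = $property->getAttributes(Constraint::class, ReflectionAttribute::IS_INSTANCEOF);

echo count($exact), "\n";             // 0
echo $subclasses[0]->getName(), "\n"; // NotBlank
```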
The API of the new ReflectionAttribute looks like this:
class ReflectionAttribute
{
    public function getName(): string
    public function getArguments(): array
    public function newInstance(): object
}
Because validation of attributes is only performed during
//ReflectionAttribute::newInstance()//, it is technically not required to
declare the attribute class. You can still access the name and arguments directly
from //ReflectionAttribute//.
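A short sketch of this lazy validation, assuming the //#[...]// syntax of released PHP 8.0 in place of //<<...>>//. The attribute name //UndeclaredMarker// is hypothetical and is deliberately never declared as a class.

```php
<?php
// Sketch only: #[...] is the PHP 8.0 release syntax, not the <<...>> of this RFC.

// UndeclaredMarker is intentionally not declared anywhere.
#[UndeclaredMarker('some value')]
function bar(): void
{
}

$attribute = (new ReflectionFunction('bar'))->getAttributes()[0];

// Name and arguments are available without the class existing.
echo $attribute->getName(), "\n";          // UndeclaredMarker
var_dump($attribute->getArguments() === ['some value']); // bool(true)

// Only instantiation notices that the class is missing.
try {
    $attribute->newInstance();
    $instantiated = true;
} catch (Error $error) {
    $instantiated = false;
}

var_dump($instantiated); // bool(false)
```

A library that only needs names and raw arguments can therefore skip //newInstance()// entirely and apply its own validation rules.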
Full example:
namespace My\Attributes {
    use PhpAttribute;
    <<PhpAttribute>>
    class SingleArgument {
        public $argumentValue;
        public function __construct($argumentValue) {
            $this->argumentValue = $argumentValue;
        }
    }
}
namespace {
    <<My\Attributes\SingleArgument("Hello World")>>
    class Foo {
    }
    $reflectionClass = new \ReflectionClass(Foo::class);
    $attributes = $reflectionClass->getAttributes();
    var_dump($attributes[0]->getName());
    var_dump($attributes[0]->getArguments());
    var_dump($attributes[0]->newInstance());
}
/**
string(28) "My\Attributes\SingleArgument"
array(1) {
  [0]=>
  string(11) "Hello World"
}
object(My\Attributes\SingleArgument)#1 (1) {
  ["argumentValue"]=>
  string(11) "Hello World"
}
**/
With this approach a call to //getAttributes()// never throws errors. This will
avoid problems when different libraries with different semantics are parsing
attributes on the same declaration.
===== Use Cases =====
==== Use Cases for PHP Extensions ====
One major use case for attributes will be PHP core and extensions.
HashTables with declared Attributes are available on every //zend_class_entry//, //op_array//,
//zend_property_info// and //zend_class_constant//.
PHP Core or extensions will want to check if certain declarations have an attribute or not.
One such example is the existing check for "@jit" in Opcache JIT that instructs the JIT
to always optimize a function or method.
With attributes it can be changed to the following C code in the extension:
static int zend_needs_manual_jit(const zend_op_array *op_array)
{
    return op_array->attributes &&
        zend_hash_str_exists(op_array->attributes, "opcache\\jit", sizeof("opcache\\jit")-1);
}
Developers could then use an attribute instead of a doc-comment:
use Opcache\Jit;
<<Jit>>
function foo() {}
==== Other potential core and extensions use cases/ideas ====
Below is a list of ideas. Please note that these are not part of this RFC.
Structured deprecation of functions/methods. Almost all languages with
attributes have this built in as well. One benefit of having this in PHP would be
that it could allow deprecating classes, properties or constants, where
trigger_error cannot be used by developers at the moment.
// an idea, not part of the RFC
use Php\Attributes\Deprecated;
<<Deprecated("Use bar() instead")>>
function foo() {}
A //Deprecated// attribute would have the benefit of allowing to deprecate properties and constants, which is currently not possible
using //trigger_error//.
class Foo
{
    <<Deprecated()>>
    const BAR = 'BAR';
}
echo Foo::BAR;
// PHP Deprecated: Constant Foo::BAR is deprecated in test.php on line 7
Opt-in change of "legacy" behavior of PHP, for example as proposed in
[[https://wiki.php.net/rfc/engine_warnings|Reclassify Engine Warnings RFC]] and
[[https://externals.io/message/108767#108767|Support Rewinding Generators]].
Rust has a
[[https://doc.rust-lang.org/reference/attributes/diagnostics.html#lint-check-attributes|similar set of attributes]].
This could also be used to augment the "Editions" proposal with a gradual path to more consistency.
// an idea, not part of the RFC
use Php\Attributes\Deny;
use Php\Attributes\Allow;
<<Allow("rewind_generator")>>
function bar() {
    yield 1;
}
<<Deny("undeclared_variables")>>
function foo() {
    echo $foo;
    // PHP Fatal error: Uncaught TypeError: Access to undeclared variable $foo
}
<<Deny("dynamic_properties")>>
class Foo {
}
$foo = new Foo();
$foo->bar; // PHP Fatal error: Uncaught Error: Invalid access to dynamic property Foo::$bar
Some (limited) form of macros [[https://doc.rust-lang.org/reference/conditional-compilation.html#the-cfg-attribute|similar
to Rust]] could be useful to include polyfill functions only in lower versions
of PHP. This helps libraries to conditionally declare code compatible with Opcache and preloading:
// an idea, not part of the RFC
use Php\Attributes\ConditionalDeclare;
use Php\Attributes\IgnoreRedeclaration;
<<ConditionalDeclare(PHP_VERSION_ID < 70000)>> // gets removed from AST when >= 7.0
<<IgnoreRedeclaration>> // throws no error when already declared, removes the redeclared thing
function intdiv(int $numerator, int $divisor) {
}
A ZEND_API to provide the arguments of a single attribute or a list of all
attributes will be part of the final patch so that extension authors can
utilize attributes with as little effort as possible.
This API is a draft for now:
/* Retrieve attribute arguments by attribute name */
HashTable *zend_attribute_get(HashTable *attributes, char *name, size_t name_len);
/* Retrieve all attribute arguments indexed by attribute name */
zval *zend_attribute_all(HashTable *attributes, char *name, size_t name_len);
==== Userland Use-Case: Declaring Event Listener Hooks on Objects ====
In userland, attributes provide the benefit of putting a declaration
and its additional configuration directly next to each other.
This is an example of refactoring Symfony EventSubscribers to use
attributes instead. The //EventSubscriberInterface// requires
users to declare which event is handled by which method on the class
in the //getSubscribedEvents()// method.
This can be changed to just look for attributes on methods
to declare which event they listen to.
// current code without attributes
class RequestSubscriber implements EventSubscriberInterface
{
    public static function getSubscribedEvents(): array
    {
        return [RequestEvent::class => 'onKernelRequest'];
    }
    public function onKernelRequest(RequestEvent $event)
    {
    }
}
// refactor to:
<<PhpAttribute>>
class Listener
{
    public $event;
    public function __construct(string $event)
    {
        $this->event = $event;
    }
}
class RequestSubscriber
{
    <<Listener(RequestEvent::class)>>
    public function onKernelRequest(RequestEvent $event)
    {
    }
}
// and the EventDispatcher to register listeners based on attributes:
class EventDispatcher
{
    private $listeners = [];
    public function addSubscriber(object $subscriber)
    {
        $reflection = new ReflectionObject($subscriber);
        foreach ($reflection->getMethods() as $method) {
            // Does this method have Listener attributes?
            $attributes = $method->getAttributes(Listener::class);
            foreach ($attributes as $listenerAttribute) {
                /** @var $listener Listener */
                $listener = $listenerAttribute->newInstance();
                // with $listener instanceof Listener attribute,
                // register the method to the given Listener->event
                // as a callable
                $this->listeners[$listener->event][] = [$subscriber, $method->getName()];
            }
        }
    }
    public function dispatch($event, ...$args)
    {
        // an event without registered listeners is simply skipped
        foreach ($this->listeners[$event] ?? [] as $listener) {
            // invoke the listener callables registered to an event name
            $listener(...$args);
        }
    }
}
$dispatcher = new EventDispatcher();
$dispatcher->addSubscriber(new RequestSubscriber());
$dispatcher->dispatch(RequestEvent::class, $payload);
==== Userland Use-Case: Migrating Doctrine Annotations from Docblocks to Attributes ====
One of the major cases to consider for any attributes/annotations RFC is the
potential migration of the widespread Doctrine Annotations library towards a
possible attributes syntax.
PHP core's support for attributes should provide a foundation that lets userland
migrate from docblocks to attributes.
The primary behavior in this RFC that attempts this balancing act is the
requirement for namespaced attribute names.
Doctrine or any userland library can utilize the name filter with a parent class to fetch
only the attributes it is interested in. With the flexibility of the proposed Reflection API, Doctrine (or any other userland
annotation/attributes library) can enforce stricter rules for the use of
attributes by adding its own logic on top, without PHP attributes getting in
the way.
Here is a complex example of an object using Doctrine Annotations and the proposed Attributes side by side to implement the same thing:
use Doctrine\ORM\Attributes as ORM;
use Symfony\Component\Validator\Constraints as Assert;
<<ORM\Entity>>
/** @ORM\Entity */
class User
{
    /** @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue */
    <<ORM\Id>><<ORM\Column("integer")>><<ORM\GeneratedValue>>
    private $id;
    /**
     * @ORM\Column(type="string", unique=true)
     * @Assert\Email(message="The email '{{ value }}' is not a valid email.")
     */
    <<ORM\Column("string", ORM\Column::UNIQUE)>>
    <<Assert\Email(array("message" => "The email '{{ value }}' is not a valid email."))>>
    private $email;
    /**
     * @ORM\Column(type="integer")
     * @Assert\Range(
     *     min = 120,
     *     max = 180,
     *     minMessage = "You must be at least {{ limit }}cm tall to enter",
     *     maxMessage = "You cannot be taller than {{ limit }}cm to enter"
     * )
     */
    <<Assert\Range(["min" => 120, "max" => 180, "minMessage" => "You must be at least {{ limit }}cm tall to enter"])>>
    <<ORM\Column(ORM\Column::T_INTEGER)>>
    protected $height;
    /**
     * @ORM\ManyToMany(targetEntity="Phonenumber")
     * @ORM\JoinTable(name="users_phonenumbers",
     *     joinColumns={@ORM\JoinColumn(name="user_id", referencedColumnName="id")},
     *     inverseJoinColumns={@ORM\JoinColumn(name="phonenumber_id", referencedColumnName="id", unique=true)}
     * )
     */
    <<ORM\ManyToMany(Phonenumber::class)>>
    <<ORM\JoinTable("users_phonenumbers")>>
    <<ORM\JoinColumn("user_id", "id")>>
    <<ORM\InverseJoinColumn("phonenumber_id", "id", ORM\JoinColumn::UNIQUE)>>
    private $phonenumbers;
}
The attributes approach is somewhat limited because it does not support named parameters. This is the reason attributes use a function-call-like syntax: if the PHP language gains named parameters, attributes would benefit automatically.
[[https://github.com/RectorPHP/Rector|Migration tools such as Rector]] can help userland code migrate to attributes.
===== Criticism and Alternative Approaches =====
=== Alternative Syntax: Why not use @ or [] like other languages? ===
The "<<" and ">>" syntax is used because it is one of the few syntaxes
that can still be used at this place in the code and looks fairly natural. We
could use other symbols that are not yet used as prefix operators, but
realistically only "%" is a contender here that doesn't look completely weird.
Others include "|", "=" or "/".
Specifically "[]" or "@" are not possible because they conflict with
the short array syntax and error suppression operators. Note that even
something involved like the following syntax is already valid PHP code right
now:
[[@SingleArgument("Hello")]]
It would require looking ahead past a potentially unlimited number of tokens to find out
whether it is an array declaration or an attribute. We would end up with a
context-sensitive parser, which would be an unacceptable outcome.
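That the expression above already has a meaning today can be checked directly. In the sketch below, //SingleArgument// is a hypothetical function declared only so that the expression has something to call; everything else is plain PHP without any attribute syntax.

```php
<?php
// Hypothetical function, declared only so the expression below can be evaluated.
function SingleArgument(string $value): string
{
    return strtoupper($value);
}

// Existing semantics: a nested short-array literal whose only element is a
// function call prefixed with the error suppression operator.
$result = [[@SingleArgument("Hello")]];

var_dump($result === [['HELLO']]); // bool(true)
```

Since the parser must keep accepting this as an array expression, it cannot decide that a leading "[" or "@" starts an attribute without reading further ahead.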
=== Why not extending Doc Comments? ===
Attributes are sufficiently better than docblock comments to warrant
being introduced as a new language construct, for several reasons:
* Namespacing prevents conflicts between different libraries using the same doc comment tag.
* Checking for attribute existence is an O(1) hash key test, compared to unpredictable strstr performance or even parsing the docblock.
* Mapping attributes to classes ensures the attributes are correctly typed, reducing a major source of bugs in reliance on docblocks at runtime.
* There is visible demand for something like annotations, based on their common use in so many different tools and communities. However, annotations in comments will always be confusing for newcomers. In addition, the difference between /* and /** is still a very subtle source of bugs.
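The reliability side of this comparison can be sketched in userland. The example assumes the //#[...]// syntax and //Attribute// marker class of released PHP 8.0 so that it runs. The //Jit// class and the two helper functions are hypothetical, and the sketch says nothing about the engine-level performance claim.

```php
<?php
// Sketch only: #[...] and Attribute are the PHP 8.0 release spellings.

#[Attribute]
final class Jit
{
}

/**
 * Never add @jit here; this function is too cold to be worth compiling.
 */
function cold(): void
{
}

#[Jit]
function hot(): void
{
}

// Docblock approach: a substring search that cannot tell a tag from prose.
function hasJitTag(ReflectionFunction $function): bool
{
    return str_contains((string) $function->getDocComment(), '@jit');
}

// Attribute approach: a lookup by resolved class name.
function hasJitAttribute(ReflectionFunction $function): bool
{
    return $function->getAttributes(Jit::class) !== [];
}

var_dump(hasJitTag(new ReflectionFunction('cold')));       // bool(true), a false positive
var_dump(hasJitAttribute(new ReflectionFunction('cold'))); // bool(false)
var_dump(hasJitAttribute(new ReflectionFunction('hot')));  // bool(true)
```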
While it might be possible to make PHP parse existing doc-comments and keep
the information as structured attributes, we would need to invoke an additional
parser for each doc-comment; a doc-comment may not conform to the context grammar,
and we would have to decide what to do with grammar errors; finally, this would
be another language inside PHP. This solution would be much more complex
than introducing attributes and is not desired.
With attributes as proposed by this RFC, we re-use the existing syntax for
expressions and constant expressions. The patch to the core for this
functionality is small.
=== Why not always map attributes to simple arrays instead for simplicity? ===
The previous section already described why resolving attribute names
to classes is important. Validating that attributes are
correct is one of the primary benefits over the previous approach
with doc-comments, where such validation is not possible.
=== Why not a stricter solution like Doctrine Annotations? ===
This RFC proposes only base PHP attribute functionality. A general solution for
PHP and the wider community must take different use-cases into account and the
full Doctrine like system is not necessary for a lot of use-cases, especially
the PHP internal use-cases.
=== Why are nested attributes not allowed? ===
Nesting attributes means, defining an attribute as an argument to another attribute. This is intentionally not allowed because it would mean an attribute can be declared within an argument, which at the moment this RFC defines as constant AST. A constant AST is already known and re-usable and describes a subset of expressions that can be used in property, constant or argument default value declarations.
Reusing the constant AST concept for attribute arguments introduces consistency and the potential to benefit from future work on this concept. Allowing nested attributes could cause a conflict in the future and would add complexity, because users would have to understand a new context that behaves differently from other parts of the language.
=== Naming (attributes or annotations) ===
The name "Attributes" for this feature makes sense to avoid confusion with
annotations that are already used. With this distinction, Doctrine Annotations
can be implemented with either docblocks (PHP 7) or attributes (PHP 8+).
===== Backward Incompatible Changes =====
None
===== Proposed PHP Version(s) =====
8.0
===== RFC Impact =====
==== To Core ====
The requirement to store attributes on every parsing token, AST node,
zend_class_entry, zend_class_constant, zend_op_array and zend_property_info
adds one additional pointer to each structure, even those that do not use attributes.
==== To SAPIs ====
None
==== To Existing Extensions ====
None
Opcache JIT will move to use Opcache\Jit instead of @jit and Opcache\Nojit
instead of @nojit attributes, but this is currently an unreleased feature.
==== To Opcache ====
Opcache modifications are part of the proposed patch, but they might not be working
100% after the internal changes from the original 7.1 patch to 8.0.
==== New Constants ====
None
==== php.ini Defaults ====
None
===== Open Issues =====
None
===== Future Scope =====
* Integration with a potential named arguments proposal for function calls
* Opportunity to augment existing functionality with new behavior without breaking backwards compatibility. One example is the introduction of a "//<<Rewindable>>//" attribute that could be used to signal that a generator function creates a rewindable iterator.
* Add a <<Deprecated>> attribute that emits a deprecation when a function/method is called or a property or constant is accessed
* Other languages such as Go have simple but powerful serialization from XML/JSON to objects and back. The combination of typed properties and attributes puts this within reach for core or a PHP extension to implement.
* An alternative "short" syntax to declare several attributes within a single enclosing pair of //<<// and //>>//. This could be revisited in the future, similar to grouped use statements being added after use statements already existed.
* Extending userland attributes to allow declaring which target they are allowed to be declared on including validation of those targets in //ReflectionAttribute::newInstance()//.
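As an illustration of the last item: the attribute implementation released in PHP 8.0 accepts a target mask on the marker attribute and checks it in //ReflectionAttribute::newInstance()//. The sketch below uses that released form (//#[...]// syntax, //Attribute::TARGET_CLASS//), which is not part of this RFC text. The //Entity// class is hypothetical.

```php
<?php
// Sketch only: relies on the released PHP 8.0 API, not on anything this RFC defines.

// Hypothetical attribute that declares classes as its only valid target.
#[Attribute(Attribute::TARGET_CLASS)]
final class Entity
{
}

#[Entity] // wrong target on purpose: a function, not a class
function notAClass(): void
{
}

$attribute = (new ReflectionFunction('notAClass'))->getAttributes()[0];

// Reading the attribute still works; the target is only checked on instantiation.
$name = $attribute->getName();

try {
    $attribute->newInstance();
    $accepted = true;
} catch (Error $error) {
    $accepted = false;
}

var_dump($name);     // string(6) "Entity"
var_dump($accepted); // bool(false)
```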
===== Voting =====
* Yes
* No
Secondary vote (choice with the most votes is picked):
* <<>>
* @:
Vote closes on May 4th, 12:00 UTC.
===== Patches and Tests =====
Two patches that are based on each other, the second one implementing future scope and alternative syntax:
* https://github.com/beberlei/php-src/pull/2 (with //<<>>// syntax)
* https://github.com/kooldev/php-src/pull/2 (with //@:// syntax, including userland target validation)
===== References =====
* [[https://doc.rust-lang.org/reference/attributes.html|Rust Attributes]]
* [[https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/attributes/|C# Attributes]]
* [[https://en.wikipedia.org/wiki/Java_annotation|Java Annotation]]
* [[https://www.typescriptlang.org/docs/handbook/decorators.html|TypeScript/ECMAScript Decorators]]
* [[https://docs.microsoft.com/en-us/cpp/cpp/attributes?view=vs-2019|C++ Attributes]]
* [[https://golang.org/pkg/reflect/#StructTag|Go Tags]]
* [[https://docs.hhvm.com/hack/attributes/introduction|Attributes in Hack]]
Previously failed or abandoned RFCs
* [[https://wiki.php.net/rfc/attributes|Attributes v1]]
* [[https://wiki.php.net/rfc/annotations_v2|Annotations v2]]
* [[https://wiki.php.net/rfc/reflection_doccomment_annotations|Reflection Annotations using the Doc-Comment]]
* [[https://wiki.php.net/rfc/simple-annotations|Simple Annotations]]
* [[https://wiki.php.net/rfc/annotations-in-docblock|Annotations in DocBlock RFC]]
* [[https://wiki.php.net/rfc/annotations|Class Metadata RFC]]
===== Changelog =====
0.2:
* Added new, hopefully more simple userland example
* Changed //Php\Attribute// to //PhpAttribute// because the PHP namespace has not been reserved for PHP and is currently "empty"
* Clarify necessary order of docblocks, attributes and function declarations
* Clarify no conflict with potential generics syntax
0.3:
* Changed to support the same attribute multiple times on the same declaration
* Added support for attributes on method and function parameters
* Replaced //PhpAttribute// interface with an attribute instead
* Distinction between userland and compiler attributes and description when each of them gets evaluated/validated
* Reduce number of examples to shorten RFC a bit and expand the other examples instead
0.4:
* Changed validation of compiler attributes to use a C callback instead of instantiating object
* Offer alternative syntax "@:" using new token T_ATTRIBUTE
0.5:
* Rename ReflectionAttribute::getAsObject to ReflectionAttribute::newInstance