rfc:attributes

PHP RFC: Attributes

Introduction

Attributes (or annotations) are a form of syntactic metadata that can be added to language classes, functions, etc. PHP offers only a single form of such metadata - doc-comments. This is just a string and to keep some structured information, we had to use some pseudo-language. Then we have to parse it to access a particular element of that structure.

Many languages like Java, C#, Hack, etc provide a simpler way. They allow the definition of structured meta-information through small syntax extension.

Proposal

Attribute Syntax

An attribute is a specially formatted text enclosed with "<<" and ">>". Attributes may be applied to functions, classes, interfaces, traits, methods, properties and class constants. In the same way as doc-comments, attributes must be placed before the corresponding definition, but it's possible to define more than one attribute on the same declaration.

<<...>>
<<...>>
function foo() {}

Each attribute definition construct may also define one or more named attribute, which may be used with no value, a single value or multiple values. See the EBNF grammar:

<attribute> ::= "<<" <namespace-name> [ "(" <value> { "," <value> } ")" ]
                { "," <namespace-name> [ "(" <value> { "," <value> } ")" ] } ">>".
<namespace-name>      ::= STRING.
<value>     ::= <php-constant> | <php-expression>.

And Example:

<<WithoutValue, SingleValue(0), FewValues('Hello', 'World')>>
function foo() {}

It's not possible to use the same attribute name more than once on the same definition, however, it's possible to use multiple attribute values associated with this name.

<<test(1),test(2)>> // Error
function foo() {}
 
<<test(1,2)>> // Works
function foo() {}

Arbitrary PHP Expressions as Attribute Values (AST attributes)

Other than simple scalars, attribute values may be represented with any valid PHP expression.

<<test($a + $b > 0)>>
function foo($a, $b) {
}

In this case, internally, the value of the attribute is kept as an Abstract Syntax Tree, and the user will have the ability to read every individual node of this tree separately. This approach implies the usage of the same PHP syntax for meta data and eliminates the need for a separate parser.

The native usage of an AST is not especially necessary. It's also possible to use plain strings and then transform them into AST at user level, through the php-ast extension.

<<test("$a + $b > 0")>>
function foo($a, $b) {
}
$r = new ReflectionFunction("foo");
$ast = ast\parse_code($r->getAttributes()["test"][0]);

Reflection

Reflection classes are extended with the getAttributes() methods, and return array of attributes.

function ReflectionFunction::getAttributes(): array;
function ReflectionClass::getAttributes(): array;
function ReflectionProperty::getAttributes(): array;
function ReflectionClassConstant::getAttributes(): array;

These functions return empty array if there were no attributes defined. Otherwise, they return an array with attribute names as keys and nested arrays as the corresponding values. Attributes without values represented by empty arrays, attributes with single value by arrays with a single element, etc.

<<WithoutValue, SingleValue(0), FewValues('Hello', 'World')>>
function foo() {}
$r = new ReflectionFunction("foo");
var_dump($r->getAttributes());
array(3) {
  ["WithoutValue"]=>
  array(0) {
  }
  ["SingleValue"]=>
  array(1) {
    [0]=>
    int(0)
  }
  ["FewValues"]=>
  array(2) {
    [0]=>
    string(5) "Hello"
    [1]=>
    string(5) "World"
  }
}

AST Representation

While internally AST is stored in native zend_ast format, Reflection*::getAttributes() methods return the corresponding representation built with objects of \ast\Node and \ast\Node\Decl classes, borrowed from php-ast. These classes moved onto PHP core may be used even without php-ast extension. However, it also defines useful constants and functions, that would simplify work with AST in PHP.

<<test($a + $b > 0)>>
function foo($a, $b) {
}
$r = new ReflectionFunction("foo");
var_dump($r->getAttributes());
array(1) {
  ["test"]=>
  array(1) {
    [0]=>
    object(ast\Node)#2 (4) {
      ["kind"]=>
      int(521)
      ["flags"]=>
      int(0)
      ["lineno"]=>
      int(0)
      ["children"]=>
      array(2) {
        [0]=>
        object(ast\Node)#3 (4) {
          ["kind"]=>
          int(520)
          ["flags"]=>
          int(1)
          ["lineno"]=>
          int(0)
          ["children"]=>
          array(2) {
            [0]=>
            object(ast\Node)#4 (4) {
              ["kind"]=>
              int(256)
              ["flags"]=>
              int(0)
              ["lineno"]=>
              int(0)
              ["children"]=>
              array(1) {
                [0]=>
                string(1) "a"
              }
            }
            [1]=>
            object(ast\Node)#5 (4) {
              ["kind"]=>
              int(256)
              ["flags"]=>
              int(0)
              ["lineno"]=>
              int(0)
              ["children"]=>
              array(1) {
                [0]=>
                string(1) "b"
              }
            }
          }
        }
        [1]=>
        int(0)
      }
    }
  }
}

php-ast is also going to be included into core PHP distribution, but this is a subject of another RFC.

Use Cases

With attributes, it's extremely simple to mark some functions with some specific “flag” and then perform checks and special handling in extensions.

<<inline>>
function add(int $a, $int $b): int {
  return $a + $b;
}
 
<<jit>>
function foo() {
  ...
}

Attributes may be used as a base level for an annotation system similar to Doctrine, where each attribute is represented by an object of corresponding class that perform validation and other actions.

<?php
namespace Doctrine\ORM {
 
	class Entity {
		private $name;
		public function __construct($name) {
			$this->name = $name;
		}
	}
 
	function GetClassAttributes($class_name) {
		$reflClass = new \ReflectionClass($class_name);
		$attrs = $reflClass->getAttributes();
		foreach ($attrs as $name => &$values) {
			$name = "Doctrine\\" . $name;
			$values = new $name(...$values);
		}
		return $attrs;
	}
}
 
namespace {
	<<ORM\Entity("user")>>
	class User {}
 
	var_dump(Doctrine\ORM\GetClassAttributes("User"));
}
?>
array(1) {
  ["ORM\Entity"]=>
  object(Doctrine\ORM\Entity)#2 (1) {
    ["name":"Doctrine\ORM\Entity":private]=>
    string(4) "user"
  }
}

Attributes with AST values may be used to implement “Design by Contract” and other verification paradigms as PHP extensions.

<<requires(
    $a >= 0,
    $b >= 0,
    $c >= 0,
    $a <= ($b+$c),
    $b <= ($a+$c),
    $c <= ($a+$b))>>
<<ensures(RET >= 0)>>
function triangleArea($a, $b, $c)
{
  $halfPerimeter = ($a + $b + $c) / 2;
 
  return sqrt($halfPerimeter
	* ($halfPerimeter - $a)
	* ($halfPerimeter - $b)
	* ($halfPerimeter - $c));
}

Special Attributes

Attribute names starting with "__" are reserved for internal purpose. Usage of unknown special attributes leads to compile-time error. Currently, no any special attributes are defined.

Criticism and Alternative Approaches

Doc-comments

Today we are using single doc-comments for any kind of meta-information, and many people don't see a benefit in the introduction of a special syntax. Everything may be grouped together and formatted using another special language.

/**
* Compute area of a triangle
*
* This function computes the area of a triangle using Heron's formula.
*
* @param number $a Length of 1st side
* @requires ($a >= 0)
* @param number $b Length of 2nd side
* @requires ($b >= 0)
* @param number $c Length of 3rd side
* @requires ($c >= 0)
* @requires ($a <= ($b+$c))
* @requires ($b <= ($a+$c))
* @requires ($c <= ($a+$b))
*
* @return number The triangle area
* @ensures (RET >= 0)
*
* @jit
*/
 
function triangleArea($a, $b, $c)
{
  $halfPerimeter = ($a + $b + $c) / 2;
 
  return sqrt($halfPerimeter
	* ($halfPerimeter - $a)
	* ($halfPerimeter - $b)
	* ($halfPerimeter - $c));
}

This approach works, but PHP itself doesn't have efficient access to pieces of this information. e.g. to check “jit” attribute, today, we would perform regular expression matching.

It might be possible to make PHP parse existing doc-comments and keep information as structured attributes, but we would need to invoke additional parser for each doc-comment; doc-comment may not conform to context-grammar and we have to decide what to do with grammar errors; finally this is going to be another language inside PHP.

This RFC proposes only base PHP attribute functionality. It doesn't define how attributes are validated and used. The full-featured annotation systems may be implemented on top of the base. The following example shows how a real life doc-comment annotation taken from Drupal may be implemented, validated and constructed on top of PHP attributes.

/**
  * @Block(
  *   id = "system_branding_block",
  *   admin_label = @Translation("Site branding")
  * )
  */
 
<<Drupal(@Block([
       "id" => "system_branding_block",
       "admin_label" => @Translation("Site branding")
]))>>
class PageTitleBlock {
} 
 
function TranslateDrupalAttribute($value) {
  if ($value instanceof \ast\Node) {
    if ($value->kind == 264 && count($value->children) == 1) { // '@'
      $a = $value->children[0];
      if (is_string($a) && class_exists($a)) {
        $value = new $a;
      } else if ($a instanceof \ast\Node &&
                 $a->kind == 515 && // NAME(ARGS)
                 count($a->children) == 2 &&
                 is_string($a->children[0]) &&
                 class_exists($a->children[0]) &&
                 $a->children[1] instanceof \ast\Node &&
                 $a->children[1]->kind == 128 &&
                 count($a->children[1]->children) == 1) {
        $args = $a->children[1]->children[0];
        if ($args instanceof ast\Node && $args->kind == 130) {
          $obj = new $a->children[0];
          foreach ($args->children as $arg) {
            if ($arg instanceof ast\Node &&
                $arg->kind == 525 &&
                count($arg->children) == 2 && 
                is_string($arg->children[1])) {
              $name = $arg->children[1];
              $val = $arg->children[0];
              if ($val instanceof ast\Node) {
                $obj->{$name} = TranslateDrupalAttribute($val);
              } else {
                $obj->{$name} = $val;
              }	
            } else {
              throw DrupalAnnotationError("...");
            }
          }
        } else {
          $name = $a->children[0];
          $obj = new $name($args);
        }
        $value = $obj;
      } else {
        throw DrupalAnnotationError("...");
      }
    } else {
      throw DrupalAnnotationError("...");
    }		
  }
  return $value;
}
 
function GetDrupalAnnotations($class_name) {
  $reflClass = new \ReflectionClass($class_name);
  $attrs = $reflClass->getAttributes();
  $ret = [];
  foreach ($attrs as $name => $values) {
    if ($name == "Drupal") {
      foreach ($values as &$value) {
        $ret[] = TranslateDrupalAttribute($value);
      }
    }
  }
  return $ret;
}
 
class Block {}
class Translation {
  public $text;
  function __construct($text) {
    $this->text = $text;
  }
}
 
var_dump(GetDrupalAnnotations("PageTitleBlock"));
array(1) {
  [0]=>
  object(Block)#11 (2) {
    ["id"]=>
    string(21) "system_branding_block"
    ["admin_label"]=>
    object(Translation)#12 (1) {
      ["text"]=>
      string(13) "Site branding"
    }
  }
}

'@' Prefix in Attribute Names

'@' symbol may be used in attribute values (as part of PHP expressions) and reused by annotation system for special purpose, but attribute names can't be prefixed with '@' their selves. See the example above.

Naming (attributes or annotations)

Different programming languages use different terms for similar features. Some use annotation, some attributes. I prefer name “attributes” because it's used in Hack and makes less fragmentation. It also makes less confusion for external high-level annotation systems (Doctrine, etc).

Backward Incompatible Changes

The RFC doesn't make backward incompatibility changes, however, it makes forward incompatibility change. This means that frameworks that use native attributes won't be able to run on PHP versions lower than 7.1.

Proposed PHP Version(s)

7.1

RFC Impact

To SAPIs

None

To Existing Extensions

php-ast will require minor modification, because the patch moved classes “\ast\Node” and “\ast\Node\Decl” into core.

To Opcache

opcache modifications are parts of the proposed patch.

New Constants

None. However, we may move some constants from php-ast into core.

php.ini Defaults

None.

Open Issues

  • part of patch related to new AST classes (zend_ast.*) might need to be slightly changed to satisfy need of attributes and php-ast in best way.
  • getAttributes() should return empty array in case of no attributes [INCLUDED]
  • For each defined attribute getArray() should return a numerically indexed array independently of number of associated values. For attributes without values it should return empty arrays. [INCLUDED]
  • Attribute names might be namespace qualified e.g. <<\Foo\Bar>> [INCLUDED]
  • It may be useful to optionally allow some extra special character e.g. <<@\Foo\Bar>>. This character won't have any special meaning for PHP itself, but higher layer may use this “@” as a flag of special meaning. [ADDED into criticism section]
  • May be we don't need special functionality for AST in attributes. We may store attribute as a simple strings and then get them through getAttributes() and call ast\parse_code() to get AST (if necessary). Both enabling and disabling native AST support make sense with their profs and cons. [ADDITIONAL VOTING QUESTION]
  • Naming: “Attributes” or “Annotation(s)”? [ADDED into criticism section]

Proposed Voting Choices

The voting started on May 10th, 2016 and will close on May 24th, 2016.

Accept PHP Attributes? (2/3+1 majority required)
Real name Yes No
ajf (ajf)  
beberlei (beberlei)  
bishop (bishop)  
colinodell (colinodell)  
danack (danack)  
daverandom (daverandom)  
davey (davey)  
dmitry (dmitry)  
francois (francois)  
galvao (galvao)  
guilhermeblanco (guilhermeblanco)  
hywan (hywan)  
jwage (jwage)  
kalle (kalle)  
kguest (kguest)  
kinncj (kinncj)  
klaussilveira (klaussilveira)  
krakjoe (krakjoe)  
levim (levim)  
lstrojny (lstrojny)  
malukenho (malukenho)  
marcio (marcio)  
mariano (mariano)  
mgocobachi (mgocobachi)  
mike (mike)  
ocramius (ocramius)  
pierrick (pierrick)  
pollita (pollita)  
salathe (salathe)  
santiagolizardo (santiagolizardo)  
sebastian (sebastian)  
seld (seld)  
stas (stas)  
svpernova09 (svpernova09)  
theseer (theseer)  
zimt (zimt)  
Final result: 14 22
This poll has been closed.

What may be used as attribute value? (simple majority wins)
Real name valid PHP expression (internally represented as AST) valid PHP constant (number or string)
ajf (ajf)  
bwoebi (bwoebi)  
colinodell (colinodell)  
danack (danack)  
daverandom (daverandom)  
derick (derick)  
dmitry (dmitry)  
galvao (galvao)  
hywan (hywan)  
kguest (kguest)  
klaussilveira (klaussilveira)  
krakjoe (krakjoe)  
levim (levim)  
lstrojny (lstrojny)  
malukenho (malukenho)  
marcio (marcio)  
mariano (mariano)  
mgocobachi (mgocobachi)  
mike (mike)  
ocramius (ocramius)  
pierrick (pierrick)  
pollita (pollita)  
salathe (salathe)  
santiagolizardo (santiagolizardo)  
seld (seld)  
stas (stas)  
trowski (trowski)  
zimt (zimt)  
Final result: 11 17
This poll has been closed.

Patches and Tests

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged to
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature

References

rfc/attributes.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1