====== Suggestion for changes to PHP RFC: Property hooks ======

  * Version: 0.1
  * Date: 2024
  * Author: Rowan Tommins (imsop@php.net)
  * Status: For information only, not a complete proposal
  * First Published at: http://wiki.php.net/rfc/property-hooks/imsop-suggestion
  * Proposal being referenced: http://wiki.php.net/rfc/property-hooks

===== Introduction =====

This is a detailed suggestion for changes to the scope and detail of the Property hooks RFC by Ilija Tovilo and Larry Garfield.

I would like to stress that this is not a counter-proposal, or intended to imply criticism of that excellent RFC. Rather, it is an attempt to lay out my thoughts on how it could be tweaked, in a more accessible format than individual discussion posts. Much of the text and example code is taken from the current RFC, for which I give full credit.

As well as incorporating some already-discussed changes, it attempts to reframe hooked properties as "virtual by default", making clearer that the backing field is something //added if necessary//, distinct from the storage of a "normal" (non-hooked) property.

===== Changes from Current RFC =====

For the most part, the below matches the functionality already proposed.

It //removes// the following:

  * The ability to specify a type in the ''set'' hook. A property having one type for reads and a different one for writes is somewhat unintuitive, and allowing it restricts our freedom when defining the syntax for more straight-forward ''set'' hooks.
  * The ability to change the name of the value passed into the ''set'' hook. The alternative would be to //always// require it to be specified, but it's likely coding conventions would require it to always be ''$value'' anyway. Either way, I think the language should be opinionated here, and not try to please everyone.
  * The ability to reference the property's backing store with ''$this->propertyName''. Again, there is an alternative: as in the current RFC text, make it //only// available under this name. By allowing both names, I think the original RFC caused confusion regarding what this name actually refers to.
  * The doubly abbreviated form (''public string $fullName => $this->first . " " . $this->last;''). I think this is just too many ways to write the same thing, and is an exception to the otherwise clear rule that "For a property to use a hook, it must replace its trailing '';'' with a code block denoted by ''{ }''."
  * Most of the distinctions between "virtual" and "backed" properties. Most notably, specifying a ''get'' hook but no ''set'' hook would always imply that setting the property is an error, not that the property is implicitly backed with a default setter.

It incorporates these changes which have been discussed elsewhere:

  * The abbreviated ''set'' hook syntax implies an assignment to the backing field. The version in the current RFC isn't actually shorter than the full braced form, and it's confusing to say both "the right-hand side must be a valid expression" and "the result of the expression is ignored".
  * Rename the special ''$field'' and ''$value'' variables to ''$_field'' and ''$_value'', to make them stand out as "special". I'm not sure about this, but wanted to see it in some examples to consider.
  * Allows assigning to array indexes on properties implementing an ''&get'' hook. I'm not sure if this is possible, but think it would be a useful addition if it is.

Other changes:

  * Rather than being treated as ''void'', the ''set'' hook is required to meet the existing definition of a ''void'' function. This avoids any confusion about what happens to a return value.

===== Detailed Proposal =====

==== Hooked Properties ====

A "hooked property" is declared by replacing the trailing '';'' in a property definition with a block denoted by ''{ }''. Inside the block are one or more hook implementations, for which the order is explicitly irrelevant. It is a compile error to have an empty hook block.
Access to a hooked property is only possible via hooks defined inside the block.

By default, "hooked properties" do not store any value; this is referred to as a "virtual property". However, they may make use of an implied "backing field", as discussed below.

==== get ====

The ''get'' hook is called every time the property is read from. If a ''get'' hook is not defined, attempting to read from the property will throw an Error at runtime.

<code php>
class User
{
    public function __construct(private string $first, private string $last) {}

    public string $fullName {
        get {
            return $this->first . " " . $this->last;
        }
    }
}

$u = new User('Larry', 'Garfield');

// prints "Larry Garfield"
print $u->fullName;
</code>

The ''get'' hook body is an arbitrarily complex method body, which MUST return a value that is type compatible with the property.

==== set ====

The ''set'' hook is called every time the property is written to. If a ''set'' hook is not defined, attempting to write to the property will throw an Error at runtime.

<code php>
class User
{
    public function __construct(private string $first, private string $last) {}

    public string $fullName {
        set {
            [$this->first, $this->last] = explode(' ', $_value);
        }
    }

    public function getFirst(): string
    {
        return $this->first;
    }
}

$u = new User('Larry', 'Garfield');

$u->fullName = 'Ilija Tovilo';

// prints "Ilija"
print $u->getFirst();
</code>

The ''set'' hook body is an arbitrarily complex method body, which must meet the requirements of a ''void'' function: it may use a ''return;'' statement to return early, but must not return a value. A statement such as ''return $_value;'' inside the ''set'' hook will give a compile-time error.

Within the body, the special variable ''$_value'' refers to the value being assigned. In the above example, the ''set'' hook will be executed with ''$_value'' set to '' 'Ilija Tovilo' ''.

Since the above example has no ''get'' hook, no read operation from ''$fullName'' is allowed and attempting to do so will throw an Error.
This particular usage pattern is not common, but valid. More commonly, a virtual property will either be get only, or symmetric:

<code php>
class User
{
    public function __construct(public string $first, public string $last) {}

    public string $fullName {
        get {
            return "$this->first $this->last";
        }
        set {
            [$this->first, $this->last] = explode(' ', $_value);
        }
    }
}

$u = new User('Larry', 'Garfield');

$u->fullName = 'Ilija Tovilo';

// prints "Ilija"
print $u->first;

// prints "Ilija Tovilo"
print $u->fullName;
</code>

[Description of assignment operator return values, beginning "Although it is...", omitted but still consistent with this proposed wording. Note however that the section on contravariant types is deliberately omitted, as I suggest this feature is removed.]

==== Backed Properties ====

The above examples all use "virtual properties", where no property-specific storage is required. However, one common use for hooked properties is expected to involve writing to a "backing field" for the property:

<code php>
class User
{
    private string $internalUsername;

    public string $username {
        get {
            return $this->internalUsername;
        }
        set {
            if (strlen($_value) > 10) {
                throw new \InvalidArgumentException('Too long');
            }
            $this->internalUsername = strtolower($_value);
        }
    }
}

$u = new User();

// the set hook is called, and saves the value to a separate private property
$u->username = "Crell";

// prints "crell"; the get hook is called and returns the value in the private property
print $u->username;

// the set hook throws \InvalidArgumentException
$u->username = "something_very_long";
</code>

We expect this "validate on set" use case to be particularly common. To facilitate this and similar use cases, the special variable ''$_field'' is available in both ''get'' and ''set'' hooks, which is a hidden "backing field" with the type specified for the property.

Space for this backing field is allocated only if either the ''get'' or ''set'' hook makes use of it.
Be aware that the detection logic works on ''$_field'' directly at compile time, not on dynamic forms of it like ''$var = '_field'; $$var''. Such dynamic access will not create a backing field.

If a backing field is created, it will be visible to reflection and serialisation functions as the "raw value" of the property. It is otherwise accessible only from inside the hooks on the property where it is defined.

The previous example can therefore be rewritten to not need the extra private property:

<code php>
class User
{
    public string $username {
        get {
            return $_field;
        }
        set {
            if (strlen($_value) > 10) {
                throw new \InvalidArgumentException('Too long');
            }
            $_field = strtolower($_value);
        }
    }
}

$u = new User();

// the set hook is called, and saves the value to the backing field
$u->username = "Crell";

// prints "crell"; the get hook is called and returns the value in the backing field
print $u->username;

// the set hook throws \InvalidArgumentException
$u->username = "something_very_long";
</code>

==== Abbreviated forms ====

Both ''get'' and ''set'' hooks offer an abbreviated form: if a hook's body is a single expression, then the ''{ }'' may be omitted and replaced with ''=>'', just like with arrow functions.

For ''get'', the expression to the right of the ''=>'' is the return value for the hook.

For ''set'', the expression to the right of the ''=>'' is the value to be saved to the hidden backing field (which is allocated as though ''$_field'' was directly mentioned).

That means the following two examples are semantically identical:

<code php>
class User
{
    public function __construct(private string $first, private string $last) {}

    public string $fullName {
        get {
            return $this->first . " " . $this->last;
        }
    }

    public int $age {
        get {
            return $_field;
        }
        set {
            // clamp negative ages to zero
            $_field = max(0, $_value);
        }
    }
}
</code>

<code php>
class User
{
    public function __construct(private string $first, private string $last) {}

    public string $fullName {
        get => $this->first . " " . $this->last;
    }

    public int $age {
        get => $_field;
        // clamp negative ages to zero
        set => max(0, $_value);
    }
}
</code>

==== References ====

Because the presence of hooks intercepts the read and write process for properties, they cause issues when acquiring a reference to a property or with indirect modification (such as ''$this->arrayProp['key'] = 'value';'').

A hooked property will normally disallow acquiring a reference to a property or indirect modification on a property, even if both ''get'' and ''set'' are defined. For example:

<code php>
class Foo
{
    public string $bar;

    public string $baz {
        get => $_field;
        set => strtoupper($_value);
    }
}

$x = 'beep';

$foo = new Foo();

// This is fine, as $bar is a normal property.
$foo->bar = &$x;

// This will error, as $baz is a hooked property
// and so references are not allowed.
$foo->baz = &$x;
</code>

For the vast majority of properties this causes no issue, as reading or writing to properties by reference is extremely rare. However, there are some edge cases where references to properties are necessary. Arguably, the lack of support for references would technically make adding hooks to a property a nominal BC break for that class, but property references are so rare that we consider it acceptable, especially with the addition below.

In cases where the user considers it acceptable to bypass the ''set'' hook using indirect modification, the ''get'' hook can be replaced by an ''&get'' hook. This works identically to a normal ''get'' hook, but the body must meet the normal requirements for returning by reference. Subsequent writes to that reference will not trigger the ''set'' hook.

That means the following is allowed:

<code php>
class Foo
{
    public string $baz {
        &get => $_field;
        set => strtoupper($_value);
    }
}

$foo = new Foo();

// This invokes "set", and sets the backing field to "BEEP".
$foo->baz = 'beep';

// This assigns $x to be a reference directly to the backing field.
$x =& $foo->baz;

// This assigns "boop" to the backing field, bypassing the set hook.
$x = 'boop';
</code>

Setting by reference, however, is not supported, as its semantics are not well-defined.

This behavior mirrors how the magic methods ''%%__get()%%'' and ''%%__set()%%'' handle references. (They are, in a sense, generic virtual properties.)

==== Arrays ====

[To be confirmed]

The ''&get'' hook will also be called, if defined, when indirectly accessing array indexes. Effectively,

<code php>
$x->foo['bar'] = 42;
</code>

is treated as equivalent to

<code php>
$_temp =& $x->foo;
$_temp['bar'] = 42;
unset($_temp);
</code>

For example:

<code php>
class Example
{
    // non-virtual property, using a get hook for additional behaviour, not to reroute the value
    public array $foo {
        &get {
            $_field ??= $this->lazyLoad('foo');
            return $_field;
        }
    }

    // ...
}

$a = new Example;

// will call $a->lazyLoad('foo') to populate the initial value, then append an item to it
$a->foo[] = 42;
</code>
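
For comparison, the following is a rough sketch of how similar behaviour can be approximated in current PHP with the existing ''%%&__get()%%'' magic method, which the References section above cites as the model for reference handling. It is not part of the proposal, and it only illustrates the by-reference desugaring of array writes. The class name ''LazyExample'', the ''$fields'' array standing in for backing fields, the ''$loads'' counter and the body of ''lazyLoad()'' are all invented for illustration.

<code php>
<?php
// Sketch only: emulates an &get hook using the existing &__get() magic method.
// LazyExample, $fields, $loads and lazyLoad() are hypothetical names.
class LazyExample
{
    /** Stand-in for the hidden backing fields, keyed by property name. */
    private array $fields = [];

    /** Counts calls to lazyLoad(), to show the initial load happens once. */
    public int $loads = 0;

    private function lazyLoad(string $name): array
    {
        $this->loads++;
        return ['initial'];
    }

    // Returning by reference is what permits indirect modification by the caller.
    public function &__get(string $name): mixed
    {
        if (!isset($this->fields[$name])) {
            $this->fields[$name] = $this->lazyLoad($name);
        }
        return $this->fields[$name];
    }
}

$a = new LazyExample();

// Indirect modification: fetches a reference via &__get(), then appends to it.
$a->foo[] = 42;

// The explicit form that the Arrays section describes as equivalent.
$temp =& $a->foo;
$temp['bar'] = 43;
unset($temp);

var_dump($a->foo);   // array with 'initial', 42 and 'bar' => 43
var_dump($a->loads); // int(1)
</code>

Both writes reach the same stored array without any setter being involved, which is the trade-off the ''&get'' hook asks the class author to accept explicitly.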