
Suggestion for changes to PHP RFC: Property hooks

Introduction

This is a detailed suggestion for changes to the scope and detail of the Property hooks RFC by Ilija Tovilo and Larry Garfield. I would like to stress that this is not a counter-proposal, or intended to imply criticism of that excellent RFC. Rather, it is an attempt to lay out my thoughts on how it could be tweaked, in a more accessible format than individual discussion posts.

Much of the text and example code is taken from the current RFC, for which I give full credit.

As well as incorporating some already-discussed changes, it attempts to reframe hooked properties as “virtual by default”, making clearer that the backing field is something added if necessary, distinct from the storage of a “normal” (non-hooked) property.

Changes from Current RFC

For the most part, the below matches the functionality already proposed. It removes the following:

  • The ability to specify a type in the set hook. A property having one type for reads and a different one for writes is somewhat unintuitive, and allowing it restricts our freedom when defining the syntax for more straightforward set hooks.
  • The ability to change the name of the value passed into the set hook. The alternative would be to always require it to be specified, but it's likely coding conventions would require it to always be $value anyway. Either way, I think the language should be opinionated here, and not try to please everyone.
  • The ability to reference the property's backing store with $this->propertyName. Again, there is an alternative: as in the current RFC text, make it only available under this name. By allowing both names, I think the original RFC caused confusion regarding what this name actually refers to.
  • The doubly abbreviated form (public string $fullName => $this->first . " " . $this->last;). I think this is just too many ways to write the same thing, and is an exception to the otherwise clear rule that “For a property to use a hook, it must replace its trailing ; with a code block denoted by { }.”
  • Most of the distinctions between “virtual” and “backed” properties. Most notably, specifying a get hook but no set hook would always imply that setting the property is an error, not that the property is implicitly backed with a default setter.

It incorporates these changes which have been discussed elsewhere:

  • The abbreviated set hook syntax implies an assignment to the backing field. The version in the current RFC isn't actually shorter than the full braced form, and it's confusing to say both “the right-hand side must be a valid expression” and “the result of the expression is ignored”.
  • Rename the special $field and $value variables to $_field and $_value, to make them stand out as “special”. I'm not sure about this, but wanted to see it in some examples to consider.
  • Allow assigning to array indexes on properties implementing an &get hook. I'm not sure if this is possible, but think it would be a useful addition if it is.

Other changes:

  • Rather than being treated as void, the set hook is required to meet the existing definition of a void function. This avoids any confusion about what happens to a return value.

Detailed Proposal

Hooked Properties

A “hooked property” is declared by replacing the trailing ; in a property definition with a block denoted by { }. Inside the block are one or more hook implementations, for which the order is explicitly irrelevant. It is a compile error to have an empty hook block.

Access to a hooked property is only possible via hooks defined inside the block.

By default, “hooked properties” do not store any value; this is referred to as a “virtual property”. However, they may make use of an implied “backing field”, as discussed below.

get

The get hook is called every time the property is read from. If a get hook is not defined, attempting to read from the property will throw an Error at runtime.

class User
{
    public function __construct(private string $first, private string $last) {}
 
    public string $fullName {
        get { 
            return $this->first . " " . $this->last;
        }
    }
}
 
$u = new User('Larry', 'Garfield');
 
// prints "Larry Garfield"
print $u->fullName;

The get hook body is an arbitrarily complex method body, which MUST return a value that is type compatible with the property.

set

The set hook is called every time the property is written to. If a set hook is not defined, attempting to write to the property will throw an Error at runtime.

class User
{
    public function __construct(private string $first, private string $last) {}
 
    public string $fullName {
        set {
            [$this->first, $this->last] = explode(' ', $_value);
        }
    }
 
    public function getFirst(): string {
        return $this->first;
    }
}
 
$u = new User('Larry', 'Garfield');
 
$u->fullName = 'Ilija Tovilo';
 
// prints "Ilija"
print $u->getFirst();

The set hook body is an arbitrarily complex method body, which must meet the requirements of a void function - i.e. it may use a return; statement to return early, but must not return a value. A statement such as return $_value; inside the set hook will give a compile-time error.
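To make the "existing definition of a void function" concrete, here is a minimal sketch in current PHP, with no hooks involved. The function storeName() and the $names array are hypothetical, standing in for a set hook body and whatever it writes to.

```php
<?php
// Hypothetical helper standing in for a set hook body; not part of the RFC.
function storeName(string $value, array &$names): void
{
    if ($value === '') {
        return; // allowed: a bare return exits early without a value
    }
    $names[] = strtolower($value);
    // Uncommenting the next line stops the file compiling, with the error
    // "A void function must not return a value".
    // return $value;
}

$names = [];
storeName('', $names);      // returns early, stores nothing
storeName('Crell', $names); // stores "crell"
```

Under this suggestion, a set hook would be checked against the same rules, so there is never a return value whose fate needs explaining.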

Within the body, the special variable $_value refers to the value being assigned - e.g. in the above example, the set hook will be executed with $_value set to 'Ilija Tovilo'.

Since the above example has no get hook, no read operation from $fullName is allowed and attempting to do so will throw an Error. This particular usage pattern is not common, but valid.

More commonly, a virtual property will either be get only, or symmetric:

class User
{
    public function __construct(public string $first, public string $last) {}
 
    public string $fullName {
        get {
            return "$this->first $this->last";
        }
        set {
            [$this->first, $this->last] = explode(' ', $_value);
        }
    }
 
}
 
$u = new User('Larry', 'Garfield');
 
$u->fullName = 'Ilija Tovilo';
 
// prints "Ilija"
print $u->first;
 
// prints "Ilija Tovilo"
print $u->fullName;

[Description of assignment operator return values, beginning “Although it is...”, omitted but still consistent with this proposed wording. Note however that the proceeding section on contravariant types is deliberately omitted, as I suggest this feature is removed.]

Backed Properties

The above examples all use “virtual properties”, where no property-specific storage is required. However, one common use for hooked properties is expected to involve writing to a “backing field” for the property:

class User {
    private string $internalUsername;
    public string $username {
        get {
            return $this->internalUsername;
        }
        set {
            if (strlen($_value) > 10) throw new \InvalidArgumentException('Too long');
            $this->internalUsername = strtolower($_value);
        }
    }
}
 
$u = new User();
$u->username = "Crell"; // the set hook is called, and saves the value to a separate private property
print $u->username; // prints "crell", the get hook is called and returns the value in the private property
 
$u->username = "something_very_long"; // the set hook throws \InvalidArgumentException.

We expect this “validate on set” use case to be particularly common.

To facilitate this and similar use cases, the special variable $_field is available in both get and set hooks. It refers to a hidden “backing field” with the type specified for the property.

Space for this backing field is allocated only if either the get or set hook makes use of it. Be aware, the detection logic works on $_field directly at compile time, not on dynamic forms of it like $var = '_field'; $$var. That will not create a backing field.

If a backing field is created, it will be visible to reflection and serialisation functions as the “raw value” of the property. It is otherwise accessible only from inside the hooks on the property where it is defined.

The previous example can therefore be rewritten to not need the extra private property:

class User {
    public string $username {
        get {
            return $_field;
        }
        set {
            if (strlen($_value) > 10) throw new \InvalidArgumentException('Too long');
            $_field = strtolower($_value);
        }
    }
}
 
$u = new User();
$u->username = "Crell"; // the set hook is called, and saves the value to the backing field
print $u->username; // prints "crell", the get hook is called and returns the value in the backing field
 
$u->username = "something_very_long"; // the set hook throws \InvalidArgumentException.

Abbreviated forms

Both get and set hooks offer an abbreviated form: if a hook's body is a single expression, then the { } may be omitted and replaced with =>, just like with arrow functions.

For get, the expression to the right of the => is the return value for the hook. For set, the expression to the right of the => is the value to be saved to the hidden backing field (which is allocated as though $_field was directly mentioned).

That means the following two examples are semantically identical:

class User
{
    public function __construct(private string $first, private string $last) {}
 
    public string $fullName {
        get { 
            return $this->first . " " . $this->last;
        }
    }
    public int $age {
        get {
            return $_field;
        }
        set {
            $_field = max(0, $_value);
        }
    }
}
 
class User
{
    public function __construct(private string $first, private string $last) {}
 
    public string $fullName {
        get => $this->first . " " . $this->last;
    }
    public int $age {
        get => $_field;
        set => max(0, $_value);
    }
}

References

Because the presence of hooks intercepts the read and write process for properties, they cause issues when acquiring a reference to a property or with indirect modification (such as $this->arrayProp['key'] = 'value';).

A hooked property will normally disallow acquiring a reference to a property or indirect modification on a property, even if both get and set are defined.

For example:

class Foo
{
    public string $bar;
 
    public string $baz {
        get => $_field;
        set => strtoupper($_value);
    }
}
 
$x = 'beep';
 
$foo = new Foo();
// This is fine, as $bar is a normal property.
$foo->bar = &$x;
 
// This will error, as $baz is a 
// hooked property and so references are not allowed.
$foo->baz = &$x;

For the vast majority of properties this causes no issue, as reading or writing to properties by reference is extremely rare. However, there are some edge cases where references to properties are necessary. Arguably, the lack of support for references would technically make adding hooks to a property a nominal BC break for that class, but property references are so rare that we consider it acceptable, especially with the addition below.

In cases where the user considers it acceptable to bypass the set hook using indirect modification, the get hook can be replaced by an &get hook. This works identically to a normal get hook, but the body must meet the normal requirements for returning by reference. Subsequent writes to that reference will not trigger the set hook.

That means the following is allowed:

class Foo
{
    public string $baz {
        &get => $_field;
        set => strtoupper($_value);
    }
}
 
$foo = new Foo();
 
// This invokes "set", and sets the backing field to "BEEP".
$foo->baz = 'beep';
 
// This assigns $x to be a reference directly to the backing field
$x =& $foo->baz;
 
// This assigns "boop" to the backing field, bypassing the set hook.
$x = 'boop';

Setting by reference, however, is not supported, as its semantics are not well-defined. This behavior mirrors how the magic methods __get() and __set() handle references. (They are, in a sense, generic virtual properties.)
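The comparison with the magic methods can be tried in current PHP. The sketch below uses no hooks: the private $bazField property is a hypothetical stand-in for the backing field, and a by-reference __get() plays the role of the &get hook. It is an emulation of the behaviour described above, not the proposed implementation.

```php
<?php
// Emulation using existing magic methods only; $bazField is a hypothetical
// stand-in for the hidden backing field of a hooked property.
class Foo
{
    private string $bazField = '';

    // Declared with & so that callers receive the storage itself.
    public function &__get($name)
    {
        return $this->bazField;
    }

    public function __set($name, $value)
    {
        $this->bazField = strtoupper($value);
    }
}

$foo = new Foo();

$foo->baz = 'beep';  // goes through __set(), which stores "BEEP"
$first = $foo->baz;

$x = &$foo->baz;     // $x now references the underlying storage
$x = 'boop';         // written directly, bypassing __set()
$second = $foo->baz;
```

Here $first holds "BEEP" while $second holds "boop", which is the same trade-off a class author would be accepting by declaring &get.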

Arrays

[To be confirmed]

The &get hook will also be called, if defined, when indirectly accessing array indexes. Effectively, $x->foo['bar'] = 42; is treated as equivalent to $_temp =& $x->foo; $_temp['bar'] = 42; unset($_temp);

For example:

class Example {
    // non-virtual property, using a get hook for additional behaviour, not to reroute the value
    public array $foo {
         &get { $_field ??= $this->lazyLoad('foo'); return $_field; }
    }
    // ...
}
$a = new Example;
$a->foo[] = 42; // will call $a->lazyLoad('foo') to populate the initial value, then append an item to it
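The equivalence with a temporary reference can also be sketched in current PHP, again using a by-reference __get() as a rough stand-in for an &get hook. The private $fooField property, the $loads counter and the inline lazy-load value are all hypothetical, added only to make the behaviour observable.

```php
<?php
// Emulation using existing magic methods only; not the proposed hook syntax.
class Example
{
    private ?array $fooField = null; // hypothetical stand-in for the backing field
    public int $loads = 0;           // counts how often the lazy load runs

    public function &__get($name)
    {
        if ($this->fooField === null) {
            $this->loads++;
            $this->fooField = ['loaded']; // stand-in for $this->lazyLoad('foo')
        }
        return $this->fooField;
    }
}

$a = new Example();

// The expanded form of: $a->foo[] = 42;
$temp = &$a->foo;
$temp[] = 42;
unset($temp);

$result = $a->foo; // ['loaded', 42]
```

The lazy load runs once, on the first access, and the appended item lands in the same array that later reads return.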
rfc/property-hooks/imsop-suggestion.txt · Last modified: 2024/03/04 17:06 by theodorejb