PHP RFC: Harmonise "untyped" and "typed" properties
- Version: 0.5
- Date: 2023-11-20
- Author: Rowan Tommins imsop@php.net
- Status: Draft
- First Published at: http://wiki.php.net/rfc/mixed_vs_untyped_properties
Introduction
This RFC proposes to remove, or at least minimise, the distinction between “typed” and “untyped” properties, by treating any property with no type information as though it were declared `mixed`. The primary aim is to reduce confusion around the different states, error messages, and behaviours.
PHP currently has three primary ways of adding properties to an object:
- Dynamically. The property is created automatically on a specific instance when it is first assigned to, and can be deleted completely using `unset`.
- Declared, with optional visibility. The property is part of the class definition, and allocated on every instance, even if it is never assigned. The behaviour of `unset` is complex, hiding but not fully deleting the property.
- Declared with a type. In addition to being allocated on every instance, the property is covered by extra guards on assignment to guarantee its type. If it is never assigned a value, or passed to `unset`, it is assigned a special “uninitialized” state.
The different behaviours of these properties are largely a result of the history of the language, rather than a consistent design. In particular, with the addition of the `mixed` type, it would seem logical for `private $foo;` to be short-hand for `private mixed $foo;`, since no type-guards are needed; but this is not currently the case.
Current Behaviour
Initial State and Unset
The three types of property vary in their initial state, and in their state after calling `unset`, as can be seen in these three demos, which run the same code with the three types of property: dynamic property, untyped property, typed property.
The states can be summarised in this table:
Property Declaration | Initial state | After assignment | After unset | After re-assignment |
---|---|---|---|---|
#[AllowDynamicProperties] | Undefined | Defined, public | Undefined | Defined, public |
private $foo; | null | Defined, private | ? | Defined, private |
private mixed $foo; | Uninitialized | Defined, private | Uninitialized | Defined, private |
Where:
- “Defined” is straightforward: the property exists, and can be read subject to visibility constraints.
- “Undefined” means the property does not exist on the object. It does not show up in views such as `var_dump`. Attempting to read it currently produces a Warning, but will produce an Error in PHP 9.0.
- “Uninitialized” is a special state introduced along with typed properties to handle cases where neither `null` nor an inline initializer can be used. The property is still treated as present on the instance, but with a special value/state; in the example, the `var_dump` output shows this as `["foo":"A":private] => uninitialized(mixed)`
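The two declared-property rows of the table can be checked with a minimal sketch along these lines, assuming PHP 8.0 or later. The class names `Untyped` and `Typed` and the helper `describe()` are illustrative and not part of the RFC, and the properties are public only to keep the example short. Note that `ReflectionProperty::isInitialized()` returns `false` both for the “?” state and for “uninitialized”, so this helper cannot tell the two apart.

```php
<?php
// Illustrative classes (hypothetical names): one untyped and one typed property.
class Untyped { public $foo; }
class Typed { public mixed $foo; }

// Hypothetical helper: reports a property's state without triggering
// the Warning or Error that a direct read might produce.
function describe(object $obj): string
{
    $property = new ReflectionProperty($obj, 'foo');
    if (!$property->isInitialized($obj)) {
        return 'not initialized';
    }
    return 'defined: ' . var_export($obj->foo, true);
}

$untyped = new Untyped();
$typed = new Typed();

// Initial state: the untyped property is implicitly null, the typed one is not set.
$initial = [describe($untyped), describe($typed)];

$untyped->foo = 1;
$typed->foo = 1;
$assigned = [describe($untyped), describe($typed)];

unset($untyped->foo, $typed->foo);
$afterUnset = [describe($untyped), describe($typed)];

$untyped->foo = 2;
$typed->foo = 2;
$reassigned = [describe($untyped), describe($typed)];

foreach (compact('initial', 'assigned', 'afterUnset', 'reassigned') as $step => $states) {
    echo $step, ': ', implode(' | ', $states), PHP_EOL;
}
```

The only step where the two columns differ is the initial state, which is the difference this RFC proposes to keep.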
The state marked “?”, for untyped properties after `unset`, is a complex one:
- In output such as `var_dump`, it is not listed, as with “undefined”
- *Reading* it gives a Warning (and future Error) of “Undefined property”
- Reading or writing still obeys the original visibility constraint - reading an `unset` private property from outside the class gives “Cannot access private property”, not “Undefined property”
- Writing to it does not give the deprecation notice for “Creation of dynamic property”; instead, the original declaration (including any visibility specifier) is silently re-used
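The points above can be observed with a sketch like the following, assuming PHP 8.x. The class `A` and its methods `drop()` and `assign()` are hypothetical names used only for illustration. An error handler collects any diagnostics so that the absence of a “Creation of dynamic property” deprecation can be checked.

```php
<?php
// Hypothetical class used only for this illustration.
class A
{
    private $foo = 'initial';

    public function drop(): void
    {
        unset($this->foo);
    }

    public function assign(mixed $value): void
    {
        $this->foo = $value;
    }
}

// Collect any notices, deprecations, or warnings raised while the demo runs.
$diagnostics = [];
set_error_handler(function (int $level, string $message) use (&$diagnostics): bool {
    $diagnostics[] = $message;
    return true;
});

$a = new A();
$a->drop();

// After unset, the property is no longer listed in an array cast ...
$keysAfterUnset = array_keys((array) $a);

// ... but the private visibility from the declaration is still enforced.
$readError = null;
try {
    $value = $a->foo;
} catch (Error $e) {
    $readError = $e->getMessage();
}

// Assigning from inside the class silently re-uses the private declaration:
// the array-cast key carries the "\0A\0" prefix used for private properties.
$a->assign('again');
$keysAfterAssign = array_keys((array) $a);

restore_error_handler();

echo $readError, PHP_EOL;
echo count($diagnostics), ' diagnostics collected', PHP_EOL;
```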
Dynamic properties will be prohibited on most classes in 9.0, giving the following if we don't make other changes:
Property state | `var_dump` output | Error on read | Error on write |
---|---|---|---|
Not declared on class | Not shown | “Undefined property” | “Creation of dynamic property” (unless on `stdClass` or with `#[AllowDynamicProperties]`) |
Declared then unset | Not shown | “Undefined property” | None |
Typed and uninitialized | `uninitialized(mixed)` | “must not be accessed before initialization” | None |
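As a rough illustration of the “Error on read” column, the sketch below captures the message produced when reading a property in each of the three states. It assumes PHP 8.x, where reading an undefined property is still a Warning; the class names and the helper `messageOnRead()` are hypothetical.

```php
<?php
// Illustrative classes (hypothetical names), one per row of the table.
#[AllowDynamicProperties]
class Dynamic {}
class Untyped { public $foo; }
class Typed { public mixed $foo; }

// Hypothetical helper: returns the message produced by reading $obj->foo,
// or null if the read succeeds.
function messageOnRead(object $obj): ?string
{
    // Convert warnings to exceptions so that a Warning and an Error
    // can be caught in the same way.
    set_error_handler(function (int $level, string $message) {
        throw new ErrorException($message, 0, $level);
    });
    try {
        $value = $obj->foo;
        return null;
    } catch (Throwable $e) {
        return $e->getMessage();
    } finally {
        restore_error_handler();
    }
}

$notDeclared = messageOnRead(new Dynamic());

$unsetObject = new Untyped();
unset($unsetObject->foo);
$declaredThenUnset = messageOnRead($unsetObject);

$typedUninitialized = messageOnRead(new Typed());

echo $notDeclared, PHP_EOL, $declaredThenUnset, PHP_EOL, $typedUninitialized, PHP_EOL;
```

Under this RFC, the second case would report the property as uninitialized instead of undefined.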
Variance under inheritance
In a “gradual typing” system such as PHP's, any type that is unspecified is usually analysed as though it has the widest possible type for that position. The language itself makes such an analysis for enforcing correct variance rules in inheritance:
- A method parameter with no type specified is equivalent to `mixed`. As input is contravariant, sub-classes cannot change to any narrower type, but can freely add or omit the `mixed` keyword.
- A method which specifies no return type may return any type, or the pseudo-types `void` and `never`; conceptually, it returns `mixed|void|never` (although that union cannot be specified explicitly). As return types are covariant, sub-classes can change to any narrower type, including `mixed`; but omitting the type if the parent specified it would be widening the type to `mixed|void|never`, so is not allowed.
Properties are invariant (as they can be both written to and read from), so sub-classes must declare them with an exactly equivalent type. As with parameters, the widest possible type is `mixed`, but an omitted type is not currently considered equivalent to it.
    class A {
        /** @var mixed $untyped */
        public $untyped;
        public mixed $mixed;

        /** @param mixed $a */
        public function acceptsUntyped($a) {}
        public function acceptsMixed(mixed $a) {}

        public function returnsMixed(): mixed {}
        /** @return mixed|void|never */
        public function returnsUntyped() {}
    }

    class B extends A {
        # Not Allowed: considers "mixed" to be distinct from "untyped"
        public mixed $untyped;
        # Also Not Allowed
        public $mixed;

        # Allowed: unspecified parameter type is implicitly "mixed", so no variance occurs
        public function acceptsUntyped(mixed $a) {}
        # Also Allowed
        public function acceptsMixed($a) {}

        # Not Allowed: widens return type from "mixed" to implicit "mixed|void|never"
        public function returnsMixed() {}
        # Allowed: narrows return type from implicit "mixed|void|never" to explicit "mixed"
        # explicit return types of "void" and "never" can also be used here
        public function returnsUntyped(): mixed {}
    }
Proposal
Calling ''unset'' on any declared property will result in "uninitialized"
The behaviour of `unset` will be standardised for all declared properties, regardless of whether a type was included in the declaration.
For an untyped property, this means:
- The property will show up in `var_dump` and similar output, with a special type of “uninitialized”
- The error given when attempting to access the property will refer to it as “uninitialized” rather than “undefined”
For typed properties, the current error message reads:
Typed property %s::%s must not be accessed before initialization
This will be changed, for both typed and untyped properties, to:
Property %s::%s must not be accessed before initialization
Properties with no declared type will be analysed as ''mixed''
If no type is specified for a property, its type will be analysed as `mixed`, as is the case with parameters.
Consequently, the following code will be valid:
    class ParentClass {
        public $foo;
        public mixed $bar;
    }

    class ChildClass extends ParentClass {
        public mixed $foo;
        public $bar;
    }
Properties with no declared type will continue to default to ''null''
Whereas we are already committed to introducing an error for accessing an `unset` property, accessing a property without explicitly initialising it is probably very common. As such, the proposal is to keep the difference in initial value: if a property has neither a type nor an initializer, treat it as though it had a type of `mixed` and an initializer of `null`.
In other words, given the following class:
    class A {
        public $foo;
        private $bar;
        protected $other = 42;
    }
Act as though this was specified:
    class A {
        public mixed $foo = null;
        private mixed $bar = null;
        protected mixed $other = 42;
    }
Note that this means adding the keyword `mixed` to a declaration may still change the behaviour of a program, since it will change the initial state to “uninitialized”; but it retains the behaviour of all existing code without any action from users.
Backward Incompatible Changes
It is possible to write code relying directly on the current behaviour of `unset`. However, note that the undefined_property_error_promotion RFC already commits us to changing direct access to such a property from a Warning to an Error.
Proposed PHP Version(s)
Since the new `unset` behaviour produces a fatal error for previously valid code, the change will be in the next major version, i.e. PHP 9.0.
RFC Impact
To Opcache
No new functionality is added in the final proposed state. Indeed, some edge cases that currently need to be handled may be removed.
To Reflection
TODO: Reflection currently shows the implicit `= null` on properties, but distinguishes between “no type” and `mixed` in parameters, even though they are analysed as equivalent in variance checks. Should we change this? Was this discussed when `mixed` was introduced?
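For reference while this question is open, the following sketch shows what the existing Reflection API reports in PHP 8.x for untyped and `mixed` properties and parameters. The class `A` and its member names are illustrative only; the Reflection methods used (`getType()`, `hasDefaultValue()`, `getDefaultValue()`, `getParameters()`) are existing ones.

```php
<?php
// Hypothetical class used only for this illustration.
class A
{
    public $untyped;
    public mixed $mixed;

    public function method($untyped, mixed $mixed) {}
}

$untypedProp = new ReflectionProperty(A::class, 'untyped');
$mixedProp = new ReflectionProperty(A::class, 'mixed');
[$untypedParam, $mixedParam] = (new ReflectionMethod(A::class, 'method'))->getParameters();

// The implicit "= null" is visible on the untyped property only.
var_dump($untypedProp->hasDefaultValue()); // untyped: has an implicit default
var_dump($untypedProp->getDefaultValue());
var_dump($mixedProp->hasDefaultValue());   // mixed without initializer: no default

// "No type" is reported as null, while "mixed" is reported as a named type.
var_dump($untypedProp->getType());
var_dump((string) $mixedProp->getType());
var_dump($untypedParam->getType());
var_dump((string) $mixedParam->getType());
```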
Open Issues
- Are there other differences between typed and untyped properties to address?
- Is there a better compromise than keeping the implicit `= null` forever?
Unaffected PHP Functionality
- The behaviour of dynamic properties, that is those not defined at all in the class definition, is not changed by this RFC.
- The interaction of `unset` and magic `__get` is the same for untyped and typed properties, so will not be affected by harmonising them. Specifically, following `unset($foo->bar);` a subsequent read of `$foo->bar` will call `$foo->__get('bar')` if available: https://3v4l.org/D16Tv
- Unlike `var_dump($object)`, both `(array)$object` and `serialize($object)` already skip uninitialized properties, so the change in `unset()` behaviour will not affect code using these.
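The last two points can be illustrated with a minimal sketch, assuming PHP 8.x. The class `Lazy` and its property names are hypothetical; the example shows that an untyped and a typed property behave the same way after `unset` with respect to `__get`, the array cast, and `serialize`.

```php
<?php
// Hypothetical class used only for this illustration.
class Lazy
{
    public $untyped = 'eager';
    public mixed $typed = 'eager';

    public function __get(string $name): mixed
    {
        return "computed:$name";
    }
}

$foo = new Lazy();

// While the properties hold a value, __get is not involved.
$before = [$foo->untyped, $foo->typed];

unset($foo->untyped, $foo->typed);

// After unset, reads of both properties are routed to __get.
$after = [$foo->untyped, $foo->typed];

// Neither the array cast nor serialize lists the unset properties.
$asArray = (array) $foo;
$serialized = serialize($foo);

echo implode(', ', $before), PHP_EOL;
echo implode(', ', $after), PHP_EOL;
echo count($asArray), ' properties in array cast', PHP_EOL;
```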
Rejected Features
Changing initial value of properties
When considering the initial value of the property, to be fully consistent, we should do one of two things:
1. Initialize any property to `null` if that is a valid value, and it has no other initializer
2. Never initialize a property to `null` unless that initial value is explicitly specified
Option 1 has the advantage of not causing errors in any currently valid code; but it goes against the general trend of making the language stricter and more explicit.
Option 2 is a non-trivial breaking change; although the edits to be made can be trivially automated (changing code of the form `public $foo;` to `public $foo = null;`), they will be very widespread. Users may rightly question the value of requiring such an edit.
Allowing ''readonly'' on untyped properties
As currently defined, `readonly` properties cannot be written to once they have been initialized, so it does not make sense to allow one with an inline initializer. This also means they are currently required to have a specified type, since without one they are implicitly initialized to `null`.
Unfortunately, this restriction must remain if we are keeping the implicit `null` initializer.
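To make the restriction concrete, here is a small sketch assuming PHP 8.1 or later. The class `Box` and its method `overwrite()` are hypothetical. The typed `readonly` property starts out uninitialized, so the constructor can perform the single permitted write; an untyped `readonly` property would already hold the implicit `null`, leaving no write available.

```php
<?php
// Hypothetical class used only for this illustration.
// Declaring "public readonly $value;" without a type is rejected at compile time.
final class Box
{
    public readonly mixed $value;

    public function __construct(mixed $value)
    {
        // First write: the property moves from "uninitialized" to initialized.
        $this->value = $value;
    }

    public function overwrite(mixed $value): void
    {
        // Any later write fails, even from inside the class.
        $this->value = $value;
    }
}

$box = new Box(1);

$secondWriteFailed = false;
try {
    $box->overwrite(2);
} catch (Error $e) {
    $secondWriteFailed = true;
}

var_dump($box->value, $secondWriteFailed);
```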
Future Scope
None identified at present.
Proposed Voting Choices
TODO: Either a straight vote for the whole proposal, or separate proposals for initial state and `unset` behaviour, depending on initial feedback.
Patches and Tests
TODO
References
Related RFCs already accepted:
Previous mailing list threads touching on this issue:
Significant revisions
- Version 0.3, 2023-11-16: originally posted for discussion
- Version 0.5, 2023-11-20: changed to keep null initializer as special case; added discussion of equivalence for type variance checks