PHP RFC: Non-Nullable Property Checks
- Version: 0.1
- Date: 2019-01-23
- Author: Rowan Collins [IMSoP], rowan.collins@gmail.com
- Status: Draft
- First Published at: http://wiki.php.net/rfc/non_nullable_property_checks
Introduction
The Typed Properties 2.0 RFC adds support for typed properties to the language, including non-nullable properties. However, failing to assign a value to a non-nullable property is not an error; instead, the property remains unset, and raises an error only when it is accessed. This RFC proposes additional checks to raise the error closer to the cause of the problem in common situations such as incorrectly written constructors.
Specifically, it proposes a validation check be performed at the end of the constructor, and after deserialization, which throws a TypeError if any property has been left in an uninitialized state.
Problem Description
As pointed out by Larry Garfield, there are two ways of considering type annotations on properties:
- a contract asserting that the property will *always be* a value of the given type
- a simpler contract, that the property *can only be set to* the given type
When a property has a static default value, or allows nulls, the distinction rarely matters: in normal use, the property starts life with a valid value, and can only be set to a valid value. However, if null is not a valid value, and there is no static default, the property must start in an invalid state, and it is easy to write code which leaves it in that state.
The stronger contract requires extremely careful language design, as there must be some defined point in the program where the asserted state first becomes true; for instance, in Swift's two-phase initialization, all introduced properties must be in a valid state before the parent initializer is called. This is difficult if not impossible to add to an existing, highly-dynamic, language such as PHP.
The current implementation therefore concentrates mainly on the simpler contract, and accepts uninitialized properties as a necessary evil. Whenever an uninitialized property is read from, an Error is thrown, avoiding the propagation of invalid data; however, this error is likely to be thrown a long way from the cause of the actual bug.
Consider this example:
namespace VendorOne\LowLevelLib { class ImplementationDetail { public string $mode; public int $value; public function __construct(string $mode) { switch ( $mode ) { case 'dev': case 'development': $this->mode = 'dev'; ### The bug is actually here break; case 'prod': case 'production': $this->mode = 'prod'; $this->value = 42; break; } } } } namespace VendorTwo\FrameworkPackage { class UsefulTool { private \VendorOne\LowLevelLib\ImplementationDetail $util; public function __construct() { $this->util = new \VendorOne\LowLevelLib\ImplementationDetail('dev'); ### Proposed TypeError: "Typed property $value must be initialized before end of constructor ... in ImplementationDetail::__construct()" } public function getScore() { return random_int(1, 6) * $this->util->value; } } } namespace EndUser\Application { $tool = new \VendorTwo\FrameworkPackage\UsefulTool; echo $tool->getScore(); ### Current TypeError: "Typed property $value must not be accessed before initialization ... in UsefulTool->getScore()" }
Here, the bug is clearly in the constructor of the low-level library class, in an untested scenario inadvertently used by the library's consumer. But the current error cannot tell the user that, and doesn't even show up until an even later section of the code happens to access the affected property.
The proposed change would see the error reported as soon as the constructor exits, making it much clearer where the problem lies.
Proposal
Two internal functions will be created (names subject to bikeshedding):
zend_check_properties_initialized
which will iterate all the typed properties of an object, and returnfalse
if any are currently uninitializedzend_assert_properties_initialized
which will perform the above check, and raise a TypeError if the result isfalse
The following places will call zend_assert_properties_initialized
, resulting in more user-friendly errors:
- Immediately after an object is constructed (to catch errors in
__construct
) - Immediately after an object is unserialized (to catch errors in
__sleep
/Unserialize
/ __unserialize)
Backward Incompatible Changes
If this change is added before the release of PHP 7.4.0, no existing code will be affected, as previous versions do not support typed properties.
If it is for some reason delayed, there is the possibility that code which runs under PHP 7.4 will start raising errors due to the new checks.
Performance
Although the checks will obviously have some overhead, this is expected to be close to zero for classes with no typed properties (since we can check using ZEND_CLASS_HAS_TYPE_HINTS
), and equivalent to one additional access on each property for classes with them.
Proposed PHP Version
PHP 7.4
Open Issues
- Are there other places that this assertion can be added?
- Should we expose
zend_check_properties_initialized
to userland, so that users can manually assert that an object is fully initialized after creating or manipulating in a way not handled by the automatic checks?
Unaffected PHP Functionality
It is important to understand that this proposal does not guarantee that a typed property will always have a valid value. Among others, the following may still lead to uninitialized properties:
- Calling
unset()
on a non-nullable property. Although normally undesirable, this is used in conjunction with__get
for some exotic use cases. - Creating an object with
ReflectionClass::newWithoutConstructor
.
Proposed Voting Choices
Should checks be added to detect objects which are not fully initialized after common cases such as construction.
This is a change to the behaviour of the language, so requires a 2/3 majority.
Implementation
None yet.
References
Rejected Features
TODO