
Note: This is a suggestion for a new top-level section to be added to default_expression, prepared by Rowan Tommins [imsop@php.net]

Type safety

Currently, the default values of optional parameters are not part of the signature contract of a function or method. That is, if an interface or class includes the declaration public function foo(int $bar=42), no constraints are placed on a sub-class other than that $bar must remain optional. It is generally considered safe to vary the signature of a library function between versions according to the same rules. This could cause problems for users of default expressions, because operations valid for the parent class or current version may throw a TypeError for a sub-class or future version.

Sub-classes

There are two situations where the type of a default value can vary:

  • When the parameter type already allows multiple concrete types - for instance, a union type, nullable type, or interface. In this case, the caller may intuitively realise that an expression relying on the concrete type of the default is not type safe.
  • When the parameter type allowed by the sub-class is wider than allowed by the parent (“contravariance of input”) - for instance, a single type widened to a union; a type made nullable; or a class/interface widened to a more general class/interface. The sub-class is free to choose any default value from the widened type specification. In this case, it is not obvious to the caller that they have written code which is not type safe.

For example:

interface I {
    public function foo(int $bar=42);
}
 
// The interface guarantees that the method can be called with no parameters, or with a single int
function safe(I $instance) {
    $instance->foo();
    $instance->foo(100);
}
 
// At a glance, it seems safe to assume the default will be an int for any instance of I
function test(I $instance) {
    $instance->foo(default + 1);
}
 
// Changing the value of the default, but not its type, is unlikely to cause a problem
class A implements I {
    public function foo(int $bar=69) {}
}
 
// However, a class can also widen the input type, and choose a new default
class B implements I {
    public function foo(string|int $bar='hello') {}
}
 
safe(new A); // OK
safe(new B); // OK
 
test(new A); // OK
test(new B); // Fatal error: Uncaught TypeError: Unsupported operand types: string + int.

The above example uses a union of scalar types for simplicity, but the same problem can occur with any type widening supported by the language; for instance the base class might specify HttpClient $client=new HttpClient but a sub-class overrides with NetworkClientInterface $client=new WebSocketClient.

From a type theory perspective, the default token is being treated as an output, which should be covariant - the sub-class should be allowed to make it the same or narrower, but not wider.
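
The covariance rule can be approximated in current PHP with reflection. The following is only a sketch: defaultOf() and hasCovariantDefault() are hypothetical helpers invented for illustration, not part of this RFC, and the check compares concrete types only for the simple cases discussed above.

```php
<?php
// Sketch only: the helper names below are hypothetical, not part of the RFC.

interface I {
    public function foo(int $bar = 42);
}

class A implements I {
    public function foo(int $bar = 69) {}
}

class B implements I {
    public function foo(string|int $bar = 'hello') {}
}

// Look up the default value of a named method parameter via reflection.
function defaultOf(string $class, string $method, string $param): mixed {
    foreach ((new ReflectionMethod($class, $method))->getParameters() as $p) {
        if ($p->getName() === $param) {
            return $p->getDefaultValue();
        }
    }
    throw new InvalidArgumentException(
        sprintf('No parameter $%s on %s::%s()', $param, $class, $method)
    );
}

// Treat the default as an output: the child's default must have the same
// concrete type as the parent's default, or be an instance of its class.
function hasCovariantDefault(string $parent, string $child, string $method, string $param): bool {
    $parentDefault = defaultOf($parent, $method, $param);
    $childDefault  = defaultOf($child, $method, $param);
    if (is_object($parentDefault)) {
        return $childDefault instanceof $parentDefault;
    }
    return get_debug_type($childDefault) === get_debug_type($parentDefault);
}

var_dump(hasCovariantDefault(I::class, A::class, 'foo', 'bar')); // bool(true)
var_dump(hasCovariantDefault(I::class, B::class, 'foo', 'bar')); // bool(false)
```

Class A keeps an int default, so an expression written against I stays valid; class B replaces it with a string, which is exactly the case a caller cannot see from the interface alone.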

Changes over time

The same problem also arises when evolving an API over time, where it's common to treat anything that would raise a TypeError in existing callers as a breaking change (e.g. requiring a SemVer major version), but changes only visible in reflection as backward compatible. Currently, that means:

  • widening parameter types is backward compatible for free-standing functions, constructors, and final methods
  • for normal methods, widening the input type would require sub-classes to also widen their input type, but choosing any value within the current parameter type is always backward compatible

For example, PHP 8.4 has changed the signature of the round function:

# PHP 8.3
function round(int|float $num, int $precision = 0, int $mode = PHP_ROUND_HALF_UP): float {}
# PHP 8.4
function round(int|float $num, int $precision = 0, int|RoundingMode $mode = RoundingMode::HalfAwayFromZero): float {}

In this case, it's unlikely anyone would use a default expression with the rounding mode, since the integer values had no particular meaning; but it's easy to imagine a similar change for other functions:

# PHP <= 8.0
function htmlspecialchars(string $string, int $flags = ENT_COMPAT, ?string $encoding = null, bool $double_encode = true): string
# PHP 8.1
function htmlspecialchars(string $string, int $flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401, ?string $encoding = null, bool $double_encode = true): string
# Hypothetical Future 8.x
function htmlspecialchars(string $string, int|HtmlEntityOptions $flags = new HtmlEntityOptions, ?string $encoding = null, bool $double_encode = true): string
 
# Calling this seems sensible with the current (and previous) signature, but would break with the future 8.x change
echo htmlspecialchars($something, default | ENT_HTML5);
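
The failure can be reproduced in current PHP by computing the same expression on each candidate default by hand. This is a sketch under stated assumptions: HtmlEntityOptions is the hypothetical class from the signature above (declared here as an empty stand-in), and withHtml5() is a hypothetical helper that represents what default | ENT_HTML5 would evaluate.

```php
<?php
// Hypothetical stand-in for the options object in the imagined future signature.
final class HtmlEntityOptions {}

// What `default | ENT_HTML5` would compute, given whatever the default resolves to.
function withHtml5(mixed $default): int {
    return $default | ENT_HTML5;
}

// Default as of PHP 8.1: an int, so the bitwise OR is well defined.
$flags = withHtml5(ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401);
echo htmlspecialchars("<a href='x'>", $flags), "\n";

// Default in the hypothetical future signature: an object, so the same
// expression throws instead of producing a flag set.
try {
    withHtml5(new HtmlEntityOptions);
} catch (TypeError $e) {
    echo get_class($e), ': ', $e->getMessage(), "\n";
}
```

The function's own signature would still accept every int the caller used to pass; only the expression built on the default breaks.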

As well as introducing new functionality, widening can be used to phase out a parameter, by switching to a null default:

# Version 1.0
class Foo {
  public function __construct(string $filepath='/var/run/foo/cache') {
    // ...
  }
}
 
# Version 1.1
class Foo {
  public function __construct(?string $filepath=null) {
    if ( $filepath !== null ) {
      trigger_error('Setting a custom file path is deprecated, and will be removed in a future version.', E_USER_DEPRECATED);
    }
    // ...
  }
}
 
// Looks reasonable in version 1.0; in 1.1 the default is null, so this passes the string '_test' and triggers the deprecation
$foo = new Foo(default . '_test');
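
Unlike the earlier examples, this case raises no TypeError, because concatenating null with a string is valid PHP. A minimal sketch of the two evaluations, where $defaultV1 and $defaultV2 are hypothetical variables standing in for what default would resolve to in each version:

```php
<?php
// Hypothetical stand-ins for the constructor defaults in each version.
$defaultV1 = '/var/run/foo/cache'; // version 1.0
$defaultV2 = null;                 // version 1.1

// What `default . '_test'` would evaluate to under each version.
var_dump($defaultV1 . '_test'); // string(23) "/var/run/foo/cache_test"
var_dump($defaultV2 . '_test'); // string(5) "_test"
```

In version 1.1 the constructor would therefore receive the non-null string '_test', so the caller both loses the intended path and hits the deprecation branch.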

Possible approaches

There are four basic ways to handle the scenarios outlined in this section:

  1. Allow any expression using default, and document the caveat that such expressions are not type-safe, and may throw a TypeError or behave unexpectedly. If the feature is only rarely used, problems will be encountered even more rarely.
  2. Change the contract of methods so that default values are covariant, even though the parameter types themselves remain contravariant. To be precise, the default value of a parameter in a sub-class must be of the same concrete type as the default value in the parent, or a sub-type of it, regardless of the parameter's declared type. This would invalidate a non-trivial amount of existing code, so would need to be introduced in a major version, probably with a deprecation period.
  3. Limit the expressions allowed to those where default can be safely substituted for any type (i.e. where it can be analysed as of type mixed). In practice, this means conditional expressions where default is used unchanged as the result, such as $condition ? 'explicit value' : default. This probably requires a different implementation, where default is compiled as a placeholder value which triggers the function call to act as though no value was passed.
  4. Limit use further, to only using default as a stand-alone token to skip a parameter. This is essentially what was proposed and declined in Stas Malyshev's "Skipping optional parameters for functions" RFC.
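
The pass-through behaviour described in option 3 can already be approximated in current PHP (8.0 and later) by conditionally omitting a named argument. This is only a sketch of the idea, not the placeholder implementation the option describes; callFoo() is a hypothetical helper, and the implementations return their argument purely so the result can be inspected.

```php
<?php
interface I {
    public function foo(int $bar = 42);
}

class A implements I {
    public function foo(int $bar = 69) { return $bar; }
}

class B implements I {
    public function foo(string|int $bar = 'hello') { return $bar; }
}

// Equivalent in spirit to: $instance->foo($condition ? 100 : default);
// The default is never inspected or combined with anything, so its
// concrete type cannot cause an error in the caller.
function callFoo(I $instance, bool $condition): mixed {
    $args = $condition ? ['bar' => 100] : [];
    return $instance->foo(...$args);
}

var_dump(callFoo(new A, true));  // int(100)
var_dump(callFoo(new A, false)); // int(69)
var_dump(callFoo(new B, true));  // int(100)
var_dump(callFoo(new B, false)); // string(5) "hello"
```

Because the caller only chooses between passing a value and passing nothing, the call remains valid for class B even though its default is a string.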

This RFC currently proposes option 1.

rfc/default_expression/type_safety.txt · Last modified: 2024/09/02 19:57 by imsop