PHP RFC: Named Arguments

Introduction

Named arguments allow passing arguments to a function based on the parameter name, rather than the parameter position. This makes the meaning of the argument self-documenting, makes the arguments order-independent, and allows skipping default values arbitrarily.

To give a simple example:

// Using positional arguments:
array_fill(0, 100, 50);
 
// Using named arguments:
array_fill(start_index: 0, num: 100, value: 50);

The order in which the named arguments are passed does not matter. The above example passes them in the same order as they are declared in the function signature, but any other order is possible too:

array_fill(value: 50, num: 100, start_index: 0);

It is possible to combine named arguments with normal, positional arguments and it is also possible to specify only some of the optional arguments of a function, regardless of their order:

htmlspecialchars($string, double_encode: false);
// Same as
htmlspecialchars($string, ENT_COMPAT | ENT_HTML401, 'UTF-8', false);
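The same mechanics apply to userland functions. The following is a minimal sketch; formatPrice() and its parameters are hypothetical and exist only for illustration:

```php
<?php
// Hypothetical helper, not part of PHP or of this RFC.
function formatPrice(
    float $amount,
    string $currency = 'USD',
    int $decimals = 2,
    bool $symbolFirst = true
): string {
    $number = number_format($amount, $decimals, '.', '');
    return $symbolFirst ? "$currency $number" : "$number $currency";
}

// Positional call: every default before $symbolFirst has to be repeated.
$a = formatPrice(9.5, 'USD', 2, false);

// Named call: only the default that changes is spelled out.
$b = formatPrice(9.5, symbolFirst: false);

// Named arguments may appear in any order after the positional ones.
$c = formatPrice(9.5, symbolFirst: false, decimals: 2);
```

All three calls are intended to produce the same string; only the call site readability differs.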

What are the benefits of named arguments?

Skipping defaults

One obvious benefit of named arguments can be seen in the last code sample (using htmlspecialchars): You no longer have to specify all defaults until the one you want to change. Named arguments allow you to directly overwrite only those defaults that you wish to change.

This is also possible with the skipparams RFC, but named arguments make the intended behavior clearer. Compare:

htmlspecialchars($string, default, default, false);
// vs
htmlspecialchars($string, double_encode: false);

Seeing the first line you will not know what the false argument does (unless you happen to know the htmlspecialchars signature by heart), whereas the double_encode: false variant makes the intention clear.

Self-documenting code

The benefit of making code self-documenting applies even when you are not skipping optional arguments. For example, compare the following two lines:

array_slice($array, $offset, $length, true);
// vs
array_slice($array, $offset, $length, preserve_keys: true);

If I wasn't writing this example right now, I would not know what the fourth parameter of array_slice does (or even that it exists in the first place).

Object Initialization

The Constructor Property Promotion RFC makes it a lot simpler to declare classes for value objects. To pick one of the examples from that RFC:

// Part of PHP AST representation
class ParamNode extends Node {
    public function __construct(
        public string $name,
        public ExprNode $default = null,
        public TypeNode $type = null,
        public bool $byRef = false,
        public bool $variadic = false,
        Location $startLoc = null,
        Location $endLoc = null,
    ) {
        parent::__construct($startLoc, $endLoc);
    }
}

Constructors in particular often have a larger than average number of parameters whose order has no particular significance, and which are commonly defaulted. While constructor promotion makes the class declaration simple, it does not help the actual object instantiation.

There have been multiple attempts to make object construction more ergonomic, such as the Object Initializer RFC and the COPA RFC. However, all such attempts have been declined, as they do not integrate well into the language, due to unfavorable interaction with constructors or non-public properties.

Named arguments solve the object initialization problem as a side-effect, in a way that integrates well with existing language semantics.

new ParamNode("test", null, null, false, true);
// becomes:
new ParamNode("test", variadic: true);
 
new ParamNode($name, null, null, $isVariadic, $passByRef);
// or was it?
new ParamNode($name, null, null, $passByRef, $isVariadic);
// becomes
new ParamNode($name, variadic: $isVariadic, byRef: $passByRef);
// or
new ParamNode($name, byRef: $passByRef, variadic: $isVariadic);
// and it no longer matters!

The benefit of named arguments for object initialization is, on the surface, the same as for other functions; it just tends to matter more in practice here.

Type-safe and documented options

One of the common workarounds for the lack of named arguments is the use of an options array. The previous example could be rewritten to use an options array as follows:

class ParamNode extends Node {
    public string $name;
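    public ?ExprNode $default; // nullable, as the constructor falls back to null
    public ?TypeNode $type;    // nullable, as the constructor falls back to null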
    public bool $byRef;
    public bool $variadic;
 
    public function __construct(string $name, array $options = []) {
        $this->name = $name;
        $this->default = $options['default'] ?? null;
        $this->type = $options['type'] ?? null;
        $this->byRef = $options['byRef'] ?? false;
        $this->variadic = $options['variadic'] ?? false;
 
        parent::__construct(
            $options['startLoc'] ?? null,
            $options['endLoc'] ?? null
        );
    }
}
 
// Usage:
new ParamNode($name, ['variadic' => true]);
new ParamNode($name, ['variadic' => $isVariadic, 'byRef' => $passByRef]);

While this works, and is already possible today, it has quite a range of disadvantages:

  • For constructors in particular, it precludes usage of constructor promotion.
  • The available options are not documented in the signature. You have to look at the implementation or phpdoc to find out what is supported and what types it requires. Phpdoc also provides no universally recognized way to document this.
  • The type of the option values is not validated unless manually implemented. In the above example, the types will actually be validated due to the use of property types, but this will not follow usual PHP semantics (e.g. if the class declaration uses strict_types, the options will also be validated according to strict_types).
  • Unless you go out of your way to protect against this, passing of unknown options will silently succeed.
  • Use of an options array requires a specific decision at the time the API is introduced. If you start off without one, but then add additional optional parameters and realize that using an options array would be cleaner, you cannot perform the switch without breaking existing API users.

Named parameters provide the same functionality as options arrays, without any of the disadvantages.
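To make the point about unknown options concrete, here is a small sketch contrasting the two styles. The classes OptionsParam and NamedParam are hypothetical, stripped-down stand-ins for the ParamNode example:

```php
<?php
// Hypothetical classes, for illustration only.
final class OptionsParam {
    public bool $variadic;

    public function __construct(public string $name, array $options = []) {
        $this->variadic = $options['variadic'] ?? false;
    }
}

final class NamedParam {
    public function __construct(
        public string $name,
        public bool $variadic = false,
    ) {}
}

// A misspelled option key is silently ignored by the options array.
$viaOptions = new OptionsParam('args', ['variadci' => true]);

// A misspelled named argument is rejected when the call is made.
$message = null;
try {
    new NamedParam('args', variadci: true);
} catch (Error $e) {
    $message = $e->getMessage();
}
```

With the options array the typo goes unnoticed and $viaOptions->variadic stays false, while the named-argument call fails with an Error that names the offending parameter.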

Attributes

The use of named arguments in phpdoc annotations is already widespread in the ecosystem. While the Attributes RFC replaces phpdoc annotations with a first-class language feature, it does not provide support for named arguments. This means that existing annotations will have to introduce significant structural changes to migrate to the attribute system.

For example, the Symfony Route annotation accepts a number of options such as methods. Currently, a migration to attributes might look like this:

/**
 * @Route("/api/posts/{id}", methods={"GET","HEAD"})
 */
public function show(int $id) { ... }
 
// Might become:
 
<<Route("/api/posts/{id}", ["methods" => ["GET", "HEAD"]])>>
public function show(int $id) { ... }

Introducing named arguments in the same version as attributes would allow retaining exactly the same structure as before:

<<Route("/api/posts/{id}", methods: ["GET", "HEAD"])>>
public function show(int $id) { ... }

Some changes would still be necessary due to the lack of support for nested annotations, but this would make the migration a good bit smoother.

Proposal

Syntax

Named arguments are passed by prefixing the value with the parameter name followed by a colon:

callAFunction(paramName: $value);

It is possible to use reserved keywords as the parameter name:

array_foobar(array: $value);

The parameter name must be an identifier; it is not possible to specify it dynamically:

// NOT supported.
function_name($variableStoringParamName: $value);

This syntax is not supported to ensure that there are no perceived ambiguities with constant names. However, a different way to specify the parameter name dynamically is provided in the argument unpacking section.

Some syntax alternatives that are technically feasible are:

function_name(paramName: $value);    // (1) as proposed
function_name(paramName => $value);  // (2)
function_name(paramName = $value);   // (3)
function_name(paramName=$value);     // (3) formatting variation
function_name($paramName: $value);   // (4)
function_name($paramName => $value); // (5)

It should be noted that the following syntax is not possible, because it already constitutes legal code:

function_name($paramName = $value);

A previous version of this RFC proposed => (variant 2) as the named arguments syntax. However, practical usage has found this to be rather noisy and non-ergonomic. See the future scope section for some additional syntax considerations, and why : might be a good choice.

Constraints

It is possible to use positional and named arguments in the same call; however, the named arguments must come after the positional arguments:

// Legal
test($foo, param: $bar);
// Compile-time error
test(param: $bar, $foo);

Passing the same parameter multiple times results in an Error exception:

function test($param) { ... }
 
// Error: Named parameter $param overwrites previous argument
test(param: 1, param: 2);
// Error: Named parameter $param overwrites previous argument
test(1, param: 2);

The first case is trivially illegal, because it specifies the same named argument twice. The second case is also illegal, because the positional argument and the named argument refer to the same parameter.

With the exception of variadic functions discussed below, specifying an unknown parameter name results in an Error exception:

function test($param) { ... }
 
// Error: Unknown named parameter $parma
test(parma: "Oops, a typo");
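Both failure modes are ordinary Error exceptions raised at call time, so they can be caught like any other. A minimal sketch, assuming a hypothetical errorMessage() helper that is not part of the proposal:

```php
<?php
function test($param) { return $param; }

// Hypothetical helper: runs a call and returns the Error message,
// or null if the call succeeded.
function errorMessage(callable $call): ?string {
    try {
        $call();
        return null;
    } catch (Error $e) {
        return $e->getMessage();
    }
}

// The positional argument and the named argument target the same parameter.
$overwrite = errorMessage(fn() => test(1, param: 2));

// No parameter with this name exists.
$unknown = errorMessage(fn() => test(parma: 'Oops, a typo'));

// A correctly spelled named argument is accepted.
$ok = errorMessage(fn() => test(param: 'fine'));
```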

Variadic functions and argument unpacking

Functions declared as variadic using the ...$args syntax will also collect unknown named arguments into $args. The unknown named arguments will always follow after any positional arguments and will be in the order in which they were passed.

function test(...$args) { var_dump($args); }
 
test(1, 2, 3, a: 'a', b: 'b');
// [1, 2, 3, "a" => "a", "b" => "b"]

The foo(...$args) unpacking syntax from the argument unpacking RFC also supports unpacking named arguments:

$params = ['start_index' => 0, 'num' => 100, 'value' => 50];
array_fill(...$params);

Any value with a string key is unpacked as a named argument. Integer keys are treated as normal positional arguments (with the integer value being ignored). Keys that are neither integers nor strings (only possible for iterators) result in a TypeError.

Argument unpacking is also subject to the general rule that positional arguments must always precede named arguments. Both of the following calls throw an Error exception:

array_fill(...['start_index' => 0, 100, 50]);
array_fill(start_index: 0, ...[100, 50]);
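The key-based rules can be tried out with a userland function. The sketch below uses a hypothetical describe() function; it shows one array mixing an integer key with a string key, and one array that violates the ordering rule:

```php
<?php
// Hypothetical function, for illustration only.
function describe(string $name, int $width = 80, bool $wrap = true): string {
    return sprintf('%s/%d/%s', $name, $width, $wrap ? 'wrap' : 'nowrap');
}

// The integer-keyed value is passed positionally, the string-keyed one by name.
$result = describe(...['report', 'wrap' => false]);

// A positional (integer-keyed) value after a named one is rejected.
$caught = null;
try {
    describe(...['wrap' => false, 'report']);
} catch (Error $e) {
    $caught = $e;
}
```

Here $result is "report/80/nowrap", because $width keeps its default, and $caught holds the Error raised by the second call.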

Furthermore, unpacking is subject to the usual limitation that no positional or named arguments may follow the unpack:

test(...$values, $value); // Compile-time error (as before)
test(...$values, paramName: $value); // Compile-time error

One of the primary use-cases for the variadic/unpacking syntaxes is forwarding of arguments:

function passthru(callable $c, ...$args) {
    return $c(...$args);
}

The support for named arguments in both variadics and argument unpacking ensures that this pattern will continue to work once named arguments are introduced.
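A runnable sketch of the forwarding pattern follows. The names forwardCall() and area() are hypothetical; a name other than passthru is used because PHP already ships a built-in function called passthru():

```php
<?php
// Hypothetical forwarding helper: passes everything after $c on to $c.
function forwardCall(callable $c, ...$args) {
    // Unknown named arguments were collected into $args with string keys,
    // so unpacking turns them back into named arguments.
    return $c(...$args);
}

// Hypothetical target function.
function area(int $width, int $height = 1): int {
    return $width * $height;
}

// Mixed positional and named arguments are forwarded unchanged.
$mixed = forwardCall('area', 3, height: 4);

// Purely named arguments are forwarded too, in any order.
$named = forwardCall('area', height: 5, width: 2);
```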

func_* and call_user_func_array

The func_*() family of functions is intended to be mostly transparent with regard to named arguments, by treating the arguments as if they were all passed positionally, and missing arguments were replaced with their defaults. For example:

function test($a = 0, $b = 1, $c = 2) {
    var_dump(func_get_args());
}
 
test(c: 5);
// Will behave exactly the same as:
test(0, 1, 5);
// Which is:
// array(3) { [0] => 0, [1] => 1, [2] => 5 }

The behavior of func_num_args() and func_get_arg() is consistent with that of func_get_args().

All three functions are oblivious to the collection of unknown named arguments by variadics. func_get_args() will not return the collected values and func_num_args() will not include them in the argument count.
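The difference between the variadic parameter and the func_*() functions can be observed side by side. A small sketch, with inspect() being a hypothetical function used only for this illustration:

```php
<?php
// Hypothetical function: reports what each mechanism sees for the same call.
function inspect(...$args): array {
    return [
        'variadic'      => $args,
        'func_get_args' => func_get_args(),
        'func_num_args' => func_num_args(),
    ];
}

$seen = inspect(1, 2, a: 'a');
```

Under the rules described above, $seen['variadic'] contains the collected named argument under the key "a", while func_get_args() and func_num_args() only account for the two positional arguments.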

The call_user_func_array function will continue behaving exactly as is: It currently treats the passed arguments as positional (regardless of whether they have string keys), and will continue to do so. (This is unlike the argument unpacking syntax, which was designed with named argument forward compatibility in mind: It currently throws for string keys.)

The general philosophy here is that func_get_args() and call_user_func_array() are legacy functionality that has been obsoleted by variadic arguments and argument unpacking. Changing their behavior is likely to cause more breakage than benefit. (We should begin making plans to phase out these functions.)

__call()

Unlike __invoke(), the __call() and __callStatic() magic methods do not specify a proper method signature, so we cannot differentiate behavior based on whether the method uses variadics or not. To permit maximum functionality, __call() will collect unknown named parameters into the $args array, just like it happens for variadics:

class Proxy {
    public function __construct(
        private object $object,
    ) {}
    public function __call(string $name, array $args) {
        // $name == "someMethod"
        // $args == [1, "paramName" => 2];
        // Forward the call and hand the result back to the caller.
        return $this->object->$name(...$args);
    }
}
 
$proxy = new Proxy(new FooBar);
$proxy->someMethod(1, paramName: 2);

Parameter name changes during inheritance

Currently, parameter names are not part of the signature-contract. When only positional arguments are used, this is quite reasonable: The name of the parameter is irrelevant to the caller. Named arguments change this. If an inheriting class changes a parameter name, calls using named arguments might fail, thus violating the Liskov substitution principle (LSP):

interface A {
    public function test($foo, $bar);
}
 
class B implements A {
    public function test($a, $b) {}
}
 
$obj = new B;
 
// Pass params according to A::test() contract
$obj->test(foo: "foo", bar: "bar"); // ERROR!

This mail contains a detailed analysis of how this issue is handled by different languages. To summarize the different observed behaviors:

  • Python and Ruby allow parameter name changes silently, and throw an error during the call.
  • C# and Swift introduce a new overload (or error if override is requested). As PHP does not support method overloading, this is not an option for us.
  • Kotlin warns on parameter name change and errors on call.

The stance of this RFC is that PHP should silently accept parameter name changes, at least for the initial release of this feature. Instead we should encourage static analyzers and IDEs to diagnose parameter name mismatches, with appropriate suppression facilities.

The reason is simple: PHP has a very large body of existing code that was not designed with named arguments in mind, does not use named arguments and will likely never use named arguments in any non-trivial capacity. While throwing a warning/notice would be more theoretically pure, it is not particularly pragmatic for the migration of existing code. Taking into account that the poster-child of named argument usage (Python) does not make such an enforcement either, I think it is fine to hold off on diagnostics in this area.
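A diagnostic of the kind suggested for static analyzers can be approximated with reflection. The following is only a rough sketch; renamedParameters() is a hypothetical helper and not part of this proposal:

```php
<?php
interface A {
    public function test($foo, $bar);
}

class B implements A {
    public function test($a, $b) {}
}

// Hypothetical helper: lists the positions at which the parameter names of
// $child::$method differ from those of $parent::$method.
function renamedParameters(string $parent, string $child, string $method): array {
    $parentParams = (new ReflectionMethod($parent, $method))->getParameters();
    $childParams  = (new ReflectionMethod($child, $method))->getParameters();

    $renamed = [];
    foreach ($parentParams as $i => $param) {
        if (isset($childParams[$i]) && $childParams[$i]->getName() !== $param->getName()) {
            // Record the [parent name, child name] pair for this position.
            $renamed[$i] = [$param->getName(), $childParams[$i]->getName()];
        }
    }
    return $renamed;
}

$renamed = renamedParameters(A::class, B::class, 'test');
```

A real analyzer would also need suppression facilities and would work on source code rather than loaded classes; this only shows that the required information is available.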

Internal functions

Historically, internal functions did not have a well-defined concept of a parameter “default value”. While they specify which parameters are optional, the actual default value is determined by the implementation and not available for introspection.

Since PHP 8.0, it is possible to specify reflectable default values for internal functions, and this has already happened for functions which are bundled with the PHP distribution. This proposal is based on this default value information: Skipped parameters will be replaced by their default value before the internal implementation of the function is invoked.

However, it is not possible to specify a sensible notion of “default value” for all parameters. For example:

function array_keys(array $arg, $search_value = UNKNOWN, bool $strict = false): array {}

The array_keys() function has fundamentally different behavior depending on whether $search_value is passed. There exists no value that can be passed as $search_value, which will exhibit the same behavior as not passing the parameter. Such parameters are denoted as UNKNOWN in stubs.

Skipping such a parameter will result in an Error exception being thrown.

// This is okay.
array_keys($array, search_value: 42, strict: true);
 
// Error: Argument #2 ($search_value) must be passed explicitly,
//        because the default value is not known
array_keys($array, strict: true);

I believe this is exactly the behavior we want, as specifying $strict without $search_value does not make sense.
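The default value information for internal functions can be inspected through reflection. A minimal sketch, assuming PHP 8.0 or later; findParameter() is a hypothetical helper:

```php
<?php
// Hypothetical helper: looks up a parameter of a function by name.
function findParameter(string $function, string $name): ?ReflectionParameter {
    foreach ((new ReflectionFunction($function))->getParameters() as $param) {
        if ($param->getName() === $name) {
            return $param;
        }
    }
    return null;
}

// htmlspecialchars() declares a reflectable default for $double_encode.
$doubleEncode = findParameter('htmlspecialchars', 'double_encode');
$hasDefault = $doubleEncode !== null && $doubleEncode->isDefaultValueAvailable();
$default = $hasDefault ? $doubleEncode->getDefaultValue() : null;

// The second parameter of array_keys() is optional, but is expected to report
// no default value, as its stub marks the default as UNKNOWN.
$second = (new ReflectionFunction('array_keys'))->getParameters()[1];
$optionalWithoutDefault = $second->isOptional() && !$second->isDefaultValueAvailable();
```

Parameters of the second kind are the ones that cannot be skipped with named arguments.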

The disadvantage of this general approach is that it requires default value information to be provided in order to work. 3rd-party extensions that do not provide this information (yet) will work with named arguments, but will not support skipping of arguments.

The alternative, which has been pursued by a previous version of this proposal, is to leave UNDEF values on the stack and let them be interpreted appropriately by the internal parameter parsing mechanism (ZPP). This means that many cases will “just work”, but some cases, especially those containing explicit argument count checks (ZEND_NUM_ARGS()), may not just misbehave, but result in memory unsafety and crashes.

Documentation / Implementation mismatches

Currently, the parameter names used in the documentation and the implementation do not always match. If this proposal is accepted, we will synchronize the parameter names between both. This will also involve creating some naming guidelines, such as on the use of casing in parameter names.

Backwards incompatible changes

In the narrow sense, this proposal has no backwards-incompatible changes, in that the behavior of existing code remains completely unchanged.

However, there are two primary complications that may occur when named arguments are used with code that is not prepared to deal with them:

First, as parameter names are now significant, they should not be changed during inheritance. Existing code that performs such changes may be practically incompatible with named arguments. More generally, greater care needs to be taken when choosing parameter names, as they are now part of the API contract.

Second, code may not be prepared to deal with unknown named arguments collected into variadics. In most cases this will manifest with the parameter names simply being ignored, which is mostly harmless.

Alternatives

There are two primary alternative implementation approaches for named arguments that I'm aware of, which will be briefly discussed in the following.

First, to make named arguments opt-in. The current RFC allows all functions/methods to be invoked using named arguments. Requiring an explicit opt-in through a keyword or attribute would nicely side-step the problem of parameter name changes, as we could enforce those only for functions that opt-in to named arguments.

The big disadvantage of the opt-in approach is, of course, that named arguments would not work with any existing code (both userland and internal). I think that this would be a big loss to the feature, to the point that it might no longer be worthwhile. In particular, this would lose out on the object initialization use-case (as the syntax would not be usable in most cases), and would not help with old APIs, which tend to be particularly bad offenders when it comes to having many defaulted parameters and boolean flags.

I think it would be more fruitful to provide an explicit opt-out mechanism, such as a <<NoNamedArgs>> attribute, for APIs that explicitly do not wish to support named arguments, and the API burden that comes with it. (A possible example is the ArrayAccess interface, which is almost never invoked directly, and for which it is particularly common to change the parameter names for each implementer.)

Second, implementing named arguments as a side-effect of improved array destructuring functionality. As an example, let's return to the ParamNode with $options array example from earlier, and rewrite it to use array destructuring:

class ParamNode extends Node {
    public string $name;
    public ExprNode $default;
    public TypeNode $type;
    public bool $byRef;
    public bool $variadic;
 
    public function __construct(string $name, array $options) {
        [
            "default" => ExprNode $default = null,
            "type" => TypeNode $type = null,
            "byRef" => bool $byRef = false,
            "variadic" => bool $variadic = false,
            "startLoc" => Location $startLoc = null,
            "endLoc" => Location $endLoc = null,
        ] = $options;
 
        $this->name = $name;
        $this->default = $default;
        $this->type = $type;
        $this->byRef = $byRef;
        $this->variadic = $variadic;
        parent::__construct($startLoc, $endLoc);
    }
}

This uses the existing syntax for array destructuring with keys, but additionally assumes support for destructuring default values, as well as destructuring type checks. As an additional step, we could support destructuring directly in the function signature:

class ParamNode extends Node {
    public string $name;
    public ExprNode $default;
    public TypeNode $type;
    public bool $byRef;
    public bool $variadic;
 
    public function __construct(
        string $name,
        array [
            "default" => ExprNode $default = null,
            "type" => TypeNode $type = null,
            "byRef" => bool $byRef = false,
            "variadic" => bool $variadic = false,
            "startLoc" => Location $startLoc = null,
            "endLoc" => Location $endLoc = null,
        ],
    ) {
        $this->name = $name;
        $this->default = $default;
        $this->type = $type;
        $this->byRef = $byRef;
        $this->variadic = $variadic;
        parent::__construct($startLoc, $endLoc);
    }
}

While I think that improvements to array destructuring are worth pursuing, I don't think this covers the named parameter use-case satisfactorily. While this does take care of the type-safety concern, it still requires APIs to be specifically designed around an options array.

Additionally, this does not solve the problem of unknown options being silently accepted (though this could be part of a new infallible pattern matching syntax), and of unclear interaction with features like strict_types.

Future Scope

Shorthand syntax for matching parameter and variable name

Especially for constructors, one of the common use-cases is to assign local variables to parameters with the same name, for example:

new ParamNode(
    name: $name,
    type: $type,
    default: $default,
    variadic: $variadic,
    byRef: $byRef
);

Some languages offer special syntax (both for object initialization and destructuring) to avoid repeating the same name twice. Here is what such a syntax could look like in PHP, depending on the chosen named arguments syntax:

new ParamNode(:$name, :$type, :$default, :$variadic, :$byRef);
new ParamNode(=$name, =$type, =$default, =$variadic, =$byRef);
new ParamNode(=> $name, => $type, => $default, => $variadic, => $byRef);

It should be noted that this problem is not specific to named arguments, and also affects array destructuring:

// What you have to write right now:
['x' => $x, 'y' => $y, 'z' => $z] = $point;

Analogously to the above examples, this could be written as:

[:$x, :$y, :$z] = $point;
[=$x, =$y, =$z] = $point;
[=> $x, => $y, => $z] = $point;

Finally, this could also be useful for array construction, obsoleting the compact() magic function and making code more analyzable:

return compact('x', 'y', 'z');
 
// Could become:
return [:$x, :$y, :$z];
return [=$x, =$y, =$z];
return [=> $x, => $y, => $z];

If I wanted to put these ideas into a general framework, I think one way to go about this would be as follows:

  • Consider identifier: $expr as a shorthand for "identifier" => $expr.
  • Consider :$variable as a shorthand for variable: $variable and thus "variable" => $variable.

Under this proposal, all three of the following would behave identically:

$point = ['x' => $x, 'y' => $y, 'z' => $z];
$point = [x: $x, y: $y, z: $z];
$point = [:$x, :$y, :$z];

Approaching from this angle, the named argument syntax we should use is paramName: $value, or :$paramName for short.

Changelog

  • 2020-06-23: Remove syntax as open question, specify use of :.
  • 2020-05-05: RFC picked up again for PHP 8.0.
  • 2013-09-09: func_get_arg(s) now return default values on skipped parameters.
rfc/named_params.1592902567.txt.gz · Last modified: 2020/06/23 08:56 by nikic