rfc:named_params

Differences

This shows you the differences between two versions of the page.

rfc:named_params [2020/06/23 09:58] nikic
rfc:named_params [2022/05/26 15:31] (current) – Revert previous accidental change iquito
Line 2: Line 2:
   * Date: 2013-09-06, significantly updated 2020-05-05
   * Author: Nikita Popov <nikic@php.net>
-  * Status: Under Discussion+  * Status: Implemented
   * Target Version: PHP 8.0
   * Implementation: https://github.com/php/php-src/pull/5357
Line 197: Line 197:
 </PHP> </PHP>
  
-This syntax is not supported to ensure that there are no perceived ambiguities with constant names. However, a different way to specify the parameter name dynamically is provided in the argument unpacking section.+This syntax is not supported, because it would create an ambiguity: Is ''function_name(FOO: $value)'' a simple named argument use, or does it intend to use the value of the ''FOO'' constant as the parameter name? However, a different way to specify the parameter name dynamically is provided in the argument unpacking section.
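
A minimal sketch of that dynamic route, assuming PHP 8.0 semantics: a string key in an unpacked array is treated as the parameter name, so the name can be chosen at runtime. ''describe()'' and ''$paramName'' are hypothetical names used only for illustration.

<PHP>
// Hypothetical function, not part of the RFC.
function describe(string $a = '-', string $b = '-', string $c = '-'): string {
    return "a: $a, b: $b, c: $c";
}

// The parameter name is only known at runtime ...
$paramName = 'c';

// ... and is passed as the string key of an unpacked array.
$result = describe('x', ...[$paramName => 'y']);

echo $result, "\n"; // a: x, b: -, c: y
</PHP>

Because the key is an ordinary string, there is no ambiguity with constant names here: a constant would have to be written explicitly as the array key.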
  
 Some syntax alternatives that are technically feasible are:
Line 295: Line 295:
 The support for named arguments in both variadics and argument unpacking ensures that this pattern will continue to work once named arguments are introduced.
  
-==== func_* and call_user_func_array ====+==== func_get_args() and friends ====
  
-The ''func_*()'' family of functions is intended to be mostly transparent with regard to named arguments, by treating the arguments as if were all passed positionally, and missing arguments were replaced with their defaults. For example:+The ''func_*()'' family of functions is intended to be mostly transparent with regard to named arguments, by treating the arguments as if they were all passed positionally, and missing arguments were replaced with their defaults. For example:
  
 <PHP> <PHP>
Line 313: Line 313:
 The behavior of ''func_num_args()'' and ''func_get_arg()'' is consistent with that of ''func_get_args()''.
  
-All three functions are oblivious to the collection of unknown named arguments by variadics. ''func_get_args()'' will not return the collected values and ''func_num_args()'' will not include them in the argument count. +All three functions are oblivious to the collection of unknown named arguments by variadics. ''func_get_args()'' will not return the collected values and ''func_num_args()'' will not include them in the argument count. Collected unknown named arguments can only be accessed through the variadic parameter.
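
The following sketch illustrates the statement above under the assumption of PHP 8.0 behavior. ''inspect()'' is a hypothetical function with one declared parameter and a variadic; the unknown named argument ''extra'' is visible only through ''$rest''.

<PHP>
// Hypothetical function, used only to compare the two views of the arguments.
function inspect($a, ...$rest): array {
    return [
        'func_get_args' => func_get_args(),
        'func_num_args' => func_num_args(),
        'rest'          => $rest,
    ];
}

$seen = inspect(1, 2, extra: 3);

var_dump($seen['func_get_args']); // positional arguments only: 1 and 2
var_dump($seen['func_num_args']); // int(2), the named argument is not counted
var_dump($seen['rest']);          // 2 at offset 0, 3 under the key "extra"
</PHP>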
-   +
-The ''call_user_func_array'' function will continue behaving exactly as is: It currently treats the passed arguments as positional (regardless of whether they have string keys), and will continue to do so. (This is unlike the argument unpacking syntax, which was designed with named argument forward compatibility in mind: It currently throws for string keys.)+
  
-The general philosophy here is that ''func_get_args()'' and ''call_user_func_array()'' are legacy functionality that has been obsoleted by variadic arguments and argument unpacking. Changing their behavior is likely to cause more breakage than benefit. (We should begin making plans to phase out these functions.)+==== call_user_func() and friends ==== 
 + 
 +Internal functions that perform some kind of "call forwarding", including ''call_user_func()'' and ''call_user_func_array()'', support named arguments: 
 + 
 +<PHP> 
 + 
 +$func = function($a = '', $b = '', $c = '') { 
 +    echo "a: $a, b: $b, c: $c\n"; 
 +};
 + 
 +// All of the following behave the same: 
 +$func('x', c: 'y'); 
 +call_user_func($func, 'x', c: 'y'); 
 +call_user_func_array($func, ['x', 'c' => 'y']); 
 +</PHP> 
 + 
 +These calls are subject to the same restrictions as normal calls. For example, positional arguments may not follow named arguments. 
 + 
 +For ''call_user_func_array()'', this behavior constitutes a minor backwards-compatibility break: Previously, array keys were completely ignored by this function. Now, string keys will be interpreted as parameter names. 
 + 
 +While ''call_user_func(_array)'' are the "base cases", this support also extends to other similar functions, such as ''ReflectionClass::newInstance()'' and ''ReflectionClass::newInstanceArgs()''.
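
To make the backwards-compatibility break concrete, here is a hedged sketch assuming PHP 8.0 behavior. ''greet()'' is a hypothetical function; using ''array_values()'' to restore the old positional meaning is one possible migration, not something this RFC prescribes.

<PHP>
// Hypothetical function, not part of the RFC.
function greet(string $greeting = 'Hello', string $name = 'world'): string {
    return "$greeting, $name!";
}

$args = ['name' => 'PHP'];

// PHP 8.0+: the string key selects the $name parameter.
$named = call_user_func_array('greet', $args);

// array_values() drops the keys, so the value is passed positionally again.
$positional = call_user_func_array('greet', array_values($args));

// A string key that matches no parameter is no longer ignored.
try {
    call_user_func_array('greet', ['nme' => 'PHP']);
    $error = null;
} catch (Error $e) {
    $error = $e->getMessage();
}

echo $named, "\n";      // Hello, PHP!
echo $positional, "\n"; // PHP, world!
</PHP>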
  
 ==== __call() ====
Line 338: Line 356:
 $proxy->someMethod(1, paramName: 2);
 </PHP> </PHP>
 +
 +==== Attributes ====
 +
 +Attributes also support named arguments:
 +
 +<PHP>
 +<<MyAttribute('A', b: 'B')>>
 +class Test {}
 +</PHP>
 +
 +Similar to normal calls, trying to pass positional arguments after named arguments results in a compile-time error. Additionally, using the same parameter name twice results in a compile-time error.
 +
 +The ''ReflectionAttribute::getArguments()'' method returns positional and named arguments in the same format as variadics do:
 +
 +<PHP>
 +var_dump($attr->getArguments());
 +// array(2) {
 +//   [0]=>
 +//   string(1) "A"
 +//   ["b"]=>
 +//   string(1) "B"
 +// }
 +</PHP>
 +
 +The ''ReflectionAttribute::newInstance()'' method will invoke the constructor with named arguments following the rules of ordinary calls.
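
A runnable sketch of the attribute behavior described above. It uses the ''#[...]'' attribute syntax that PHP 8.0 eventually shipped with rather than the ''<<...>>'' syntax shown in this RFC. The constructor signature of ''MyAttribute'' is an assumption made for illustration.

<PHP>
// Assumed attribute class; the RFC does not define its constructor.
#[Attribute]
final class MyAttribute {
    public function __construct(public string $a, public string $b = '') {}
}

#[MyAttribute('A', b: 'B')]
class Test {}

$attr = (new ReflectionClass(Test::class))->getAttributes(MyAttribute::class)[0];

// Positional and named arguments, in the same format as variadics.
$arguments = $attr->getArguments();

// The constructor is invoked with 'B' passed by name.
$instance = $attr->newInstance();

var_dump($arguments);
var_dump($instance->a, $instance->b);
</PHP>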
  
 ==== Parameter name changes during inheritance ====
Line 364: Line 407:
   * Kotlin warns on parameter name change and errors on call.
  
-Because we are retrofitting named arguments to an old language with a large body of existing code, we do not consider it sensible to unconditionally diagnose parameter name mismatches, especially considering that a lot of old code will never be invoked using named arguments. This RFC proposes two possible approaches to handle this issue, which are described in the following.+Because we are retrofitting named arguments to an old language with a large body of existing code, we do not consider it sensible to unconditionally diagnose parameter name mismatches, especially considering that a lot of old code will never be invoked using named arguments.
  
-=== Silently allow parameter name changes ===+This RFC proposes to follow the model of Python or Ruby: PHP will silently accept parameter name changes during inheritance, which may result in call-time exceptions when methods with renamed parameters are called. Static analyzers and IDEs are encouraged to diagnose parameter name mismatches (with appropriate suppression facilities).
  
-The first option is to follow the model of Python or Ruby: PHP will silently accept parameter name changes during inheritance, which may result in call-time exceptions when methods with renamed parameters are called. Static analyzers and IDEs are encouraged to diagnose parameter name mismatches, with appropriate suppression facilities. +This is a pragmatic approach that acknowledges that named arguments are not relevant for many methods, and renamed parameters will usually not become a problem in practice. There is no conceivable reason why a method such as ''offsetGet()'' would be called with named parameters, and there is thus no benefit in requiring ''offsetGet()'' implementors to use the same parameter name.
- +
-This is a pragmatic approach that acknowledges that named arguments are not relevant for many methods, and renamed parameters will usually not become a problem in practice. There is no conveicable reason why a method such as ''offsetGet()'' would be called with named parameters, and there is thus no benefit in requiring ''offsetGet()'' implementors to use the same parameter name.+
  
 As previously mentioned, this approach is also used by some existing languages, most notably Python, which is one of the languages with the heaviest usage of named arguments. This is hard evidence that such an approach does work reasonably well in practice, though of course the situations are somewhat different.
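
The call-time failure mode of this model can be sketched as follows, assuming PHP 8.0 behavior. ''Shape'' and ''Circle'' are hypothetical types. The rename itself is accepted silently, and only a call that uses the interface's parameter name on the child object fails.

<PHP>
// Hypothetical interface and implementation, for illustration only.
interface Shape {
    public function scale(float $factor): float;
}

final class Circle implements Shape {
    public function __construct(private float $radius) {}

    // Renames $factor to $ratio; no diagnostic is emitted for this.
    public function scale(float $ratio): float {
        return $this->radius * $ratio;
    }
}

$shape = new Circle(2.0);

$byPosition  = $shape->scale(1.5);        // positional calls are unaffected
$byChildName = $shape->scale(ratio: 1.5); // the name declared by Circle works

try {
    $shape->scale(factor: 1.5);           // the name declared by Shape does not
    $failed = false;
} catch (Error $e) {
    $failed = true;
}

var_dump($byPosition, $byChildName, $failed);
</PHP>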
  
-=== Allow using parameter names from parent methods === +The [[#to_parameter_name_changes_during_inheritance|alternatives section]] describes a possible alternative that is not pursued by this RFC, but could be added later on if we felt a strong need.
- +
-The alternative is to allow using parameter names from parent methods, as the following example illustrates: +
- +
-<PHP> +
-interface I { +
-    public function test($foo, $bar); +
-} +
- +
-class C implements I { +
-    public function test($a, $b) {} +
-} +
- +
-$obj = new C; +
- +
-// Pass params according to C::test() contract +
-$obj->test(a: "foo", b: "bar");     // Works! +
-// Pass params according to I::test() contract +
-$obj->test(foo: "foo", bar: "bar"); // Also works! +
-</PHP> +
- +
-Here using ''foo'' and ''bar'' as parameter names is allowed, and will be interpreted as ''a'' and ''b'', because there is a parent method using those names. This makes the methods artificially and automatically LSP compatible. +
- +
-Names from parent methods are registered as aliases, but not bound to a specific signature. As such, it's possible (though not recommended) to mix parameter names from different signatures: +
- +
-<PHP> +
-// Use parameter names from both C::test() and I::test() +
-$obj->test(a: "foo", bar: "bar"); // Also works. +
-</PHP> +
- +
-From a design perspective it would be better to forbid such calls, but I don't believe that it worth the technical and performance cost this would entail. +
- +
-There is one problem with this scheme: What happens if two signatures share the same name at different positions? +
- +
-<PHP> +
-interface I { +
-    public function test($foo, $bar); +
-} +
- +
-class C implements I { +
-    public function test($bar, $foo) {} +
-} +
- +
-// Fatal error: Parameter $foo of C::test() at position #2 conflicts with +
-//              parameter $foo of I::test() at position #1 +
-</PHP> +
- +
-In this case, the LSP inheritance checks will report a fatal error. It is expected that this restriction will have much less impact in practice than a blanket prohibition of parameter renames, and that it will mostly point out legitimate LSP violations that hold even in the absence of named arguments. An analysis affected cases in the top 2k composer packages can be found at https://gist.github.com/nikic/6cc9891381a83b8dca5ebdaef1068f4d. (It should be noted that the analysis is not fully accurate and may have false negatives.) +
- +
-Parameter names from prototype methods can come from a number of sources: +
- +
-  * Parent methods, including grand parents. +
-  * Interface methods, including implementations of the same method from multiple interfaces. +
-  * Abstract trait methods. +
- +
-As such, a single parameter can have a potentially large number of aliases from a large number of prototypes. +
- +
-A case that requires special consideration are parameters that are absorbed by a variadic in a child class: +
- +
-<PHP> +
-class A { +
-    public function method($a) {} +
-} +
-class B extends A { +
-    public function method(...$args) {} +
-} +
-class C extends B { +
-    public function method($c = null, ...$args) {} +
-} +
- +
-(new B)->method(a: 42); +
-(new C)->method(a: 42); +
-</PHP> +
- +
-There are principally two ways in which this might behave: +
- +
-<code> +
-// Option A: +
-(new B)->method(a: 42); // $args = [42] +
-(new C)->method(a: 42); // $c = 42, $args = [] +
- +
-// Option B: +
-(new B)->method(a: 42); // $args = ['a' => 42] +
-(new C)->method(a: 42); // $c = null, $args = ['a' => 42] +
-</code> +
- +
-With option A, we would remember that ''a'' was the first parameter of a parent method, and as such store it at position 0 rather than name ''"a"'' in the variadic parameter. Consequently, in the ''C'' class, the parameter ''a'' would be considered an alias of ''c''. +
- +
-With option B, we instead discard parent parameters that are absorbed in variadic. This means that the parameter ''a'' will be stored under the name ''"a"'' in the variadic parameter for both classes ''B'' and ''C''. +
- +
-This RFC proposed to use option B to avoid further special-casing of variadic argument collection. +
- +
-Overall, while I think this approach to the LSP problem is conceptually elegant, it turns out that it involves quite a few language design edge cases, as well as technical complexity. I'm not convinced that this is worthwhile, especially as it solves a problem that is largely theoretical at this point.+
  
 ==== Internal functions ====
Line 502: Line 451:
  
 Currently, the parameter names used in the documentation and the implementation do not always match. If this proposal is accepted, we will synchronize the parameter names between both. This will also involve creating some naming guidelines, such as on the use of casing in parameter names.
 +
 +=== Internal APIs ===
 +
 +As outlined above, the existence of named arguments is mostly transparent for internal functions. Internal functions will see ordinary positional arguments, without any indication that the original call occurred via named arguments. As such, code adjustments will usually not be necessary.
 +
 +One special case to consider is variadic functions, which will collect unknown named parameters into the ''extra_named_params'' field in the call ''execute_data'' and set the ''ZEND_CALL_HAS_EXTRA_NAMED_PARAMS'' call info flag. On the assumption that most existing internal functions will not be able to do anything useful with this information, functions using the ZPP ''*'' or ''+'' specifiers, or the ''Z_PARAM_VARIADIC'' and ''Z_PARAM_VARIADIC_EX'' macros, will automatically throw an ''ArgumentCountError'' if extra unknown named arguments are encountered.
 +
 +<PHP>
 +array_merge([1, 2], a: [3, 4]);
 +// ArgumentCountError: array_merge() does not accept unknown named parameters
 +</PHP>
 +
 +Functions that do want to accept extra unknown named arguments should use the ''Z_PARAM_VARIADIC_WITH_NAMED'' FastZPP macro instead:
 +
 +<code>
 +zval *args;
 +uint32_t num_args;
 +HashTable *extra_named;
 +ZEND_PARSE_PARAMETERS_START(0, -1)
 +    Z_PARAM_VARIADIC_WITH_NAMED(args, num_args, extra_named)
 +ZEND_PARSE_PARAMETERS_END();
 +</code>
 +
 +The ''zend_call_function()'' mechanism is extended to support calls with named parameters by adding a new field into the ''zend_fcall_info'' structure:
 +
 +<code>
 +typedef struct _zend_fcall_info {
 +    /* ... */
 +    HashTable *named_params;
 +} zend_fcall_info;
 +</code>
 +
 +Code that manually initializes ''zend_fcall_info'' structures, instead of going through supported initialization functions, should take care to initialize this field to ''NULL'' if it is unused.
 +
 +For convenience of implementation for ''call_user_func_array()'' style functions, ''named_params'' may also contain positional arguments, which will be appended to the normal ''params''. As usual, ordering positional arguments after named ones in the array will result in an exception.
  
 ===== Backwards incompatible changes =====
  
-In the narrow sense, this proposal has no backwards-incompatible changes, in that the behavior of existing code remains completely unchanged.+In the narrow sense, this proposal has only one backwards-incompatible change: String keys in the ''call_user_func_array()'' arguments will now be interpreted as parameter names, instead of being silently ignored.
  
-However, there are two primary complications that may occur when named arguments are used with code that is not prepared to deal with them:+In addition to this actual incompatibility, there are also two potential complications that may occur when named arguments are used with code that is not prepared to deal with them:
  
 First, as parameter names are now significant, they should not be changed during inheritance. Existing code that performs such changes may be practically incompatible with named arguments. More generally, greater care needs to be taken when choosing parameter names, as they are now part of the API contract.
Line 514: Line 498:
  
 ===== Alternatives =====
 +
 +==== To named arguments ====
  
 There are two primary alternative implementation approaches for named arguments that I'm aware of, which will be briefly discussed in the following.
Line 587: Line 573:
  
 Additionally, this does not solve the problem of unknown options being silently accepted (though this could be part of a new infallible pattern matching syntax), and of unclear interaction with features like ''strict_types''.
 +
 +==== To parameter name changes during inheritance ====
 +
 +This RFC proposes to silently allow parameter name changes during inheritance. This is pragmatic, but may result in call-site errors when parameter names are changed and methods are invoked on child objects. An alternative is to automagically allow using parameter names from parent methods, as the following example illustrates:
 +
 +<PHP>
 +interface I {
 +    public function test($foo, $bar);
 +}
 +
 +class C implements I {
 +    public function test($a, $b) {}
 +}
 +
 +$obj = new C;
 +
 +// Pass params according to C::test() contract
 +$obj->test(a: "foo", b: "bar");     // Works!
 +// Pass params according to I::test() contract
 +$obj->test(foo: "foo", bar: "bar"); // Also works!
 +</PHP>
 +
 +Here using ''foo'' and ''bar'' as parameter names is allowed, and will be interpreted as ''a'' and ''b'', because there is a parent method using those names. This makes the methods artificially and automagically LSP compatible.
 +
 +Names from parent methods are registered as aliases, but not bound to a specific signature. As such, it's possible (though not recommended) to mix parameter names from different signatures:
 +
 +<PHP>
 +// Use parameter names from both C::test() and I::test()
 +$obj->test(a: "foo", bar: "bar"); // Also works.
 +</PHP>
 +
 +From a design perspective it would be better to forbid such calls, but I don't believe that it is worth the technical and performance cost this would entail.
 +
 +There is one problem with this scheme: What happens if two signatures share the same name at different positions?
 +
 +<PHP>
 +interface I {
 +    public function test($foo, $bar);
 +}
 +
 +class C implements I {
 +    public function test($bar, $foo) {}
 +}
 +
 +// Fatal error: Parameter $foo of C::test() at position #2 conflicts with
 +//              parameter $foo of I::test() at position #1
 +</PHP>
 +
 +In this case, the LSP inheritance checks will report a fatal error. It is expected that this restriction will have much less impact in practice than a blanket prohibition of parameter renames, and that it will mostly point out legitimate LSP violations that hold even in the absence of named arguments. An analysis of affected cases in the top 2k composer packages can be found at https://gist.github.com/nikic/6cc9891381a83b8dca5ebdaef1068f4d. (It should be noted that the analysis is not fully accurate and may have false negatives.)
 +
 +Parameter names from prototype methods can come from a number of sources:
 +
 +  * Parent methods, including grandparents.
 +  * Interface methods, including implementations of the same method from multiple interfaces.
 +  * Abstract trait methods.
 +
 +As such, a single parameter can have a potentially large number of aliases from a large number of prototypes.
 +
 +A case that requires special consideration is that of parameters that are absorbed by a variadic in a child class:
 +
 +<PHP>
 +class A {
 +    public function method($a) {}
 +}
 +class B extends A {
 +    public function method(...$args) {}
 +}
 +class C extends B {
 +    public function method($c = null, ...$args) {}
 +}
 +
 +(new B)->method(a: 42);
 +(new C)->method(a: 42);
 +</PHP>
 +
 +There are principally two ways in which this might behave:
 +
 +<PHP>
 +// Option A:
 +(new B)->method(a: 42); // $args = [42]
 +(new C)->method(a: 42); // $c = 42, $args = []
 +
 +// Option B:
 +(new B)->method(a: 42); // $args = ['a' => 42]
 +(new C)->method(a: 42); // $c = null, $args = ['a' => 42]
 +</PHP>
 +
 +With option A, we would remember that ''$a'' was the first parameter of a parent method, and as such store the value at offset 0 rather than under the name ''%%"a"%%'' in the variadic parameter. Consequently, in the ''C'' class, the parameter ''$a'' would be considered an alias of ''$c''.
 +
 +With option B, we instead discard parent parameters that are absorbed into a variadic. This means that the parameter ''$a'' will be stored under the name ''%%"a"%%'' in the variadic parameter for both classes ''B'' and ''C''. This is the option I would prefer, as it avoids further special-casing of variadic argument collection.
 +
 +While I think this approach to the LSP problem is conceptually elegant, it turns out that it involves quite a few language design edge cases, as well as non-trivial technical complexity.
 +
 +More importantly, code that renames parameters during inheritance may fall into one of two categories: Either the code is not used with named parameters, in which case the parameter names don't matter in the first place, or it is used with named parameters, in which case the names should really, really be changed to match across the inheritance hierarchy. Implementing this mechanism papers over a migration issue by introducing a core language feature that will have to be supported forever.
  
 ===== Future Scope =====
Line 652: Line 732:
  
 Approaching from this angle, the named argument syntax we should use is ''paramName: $value'', or '':$paramName'' for short.
 +
 +==== Positional-only and named-only parameters ====
 +
 +A useful extension of this proposal would be to allow parameters that can only be used positionally, or only using named arguments. This is primarily helpful for API designers, because it gives them more freedom: A positional-only parameter may be freely renamed, while a named-only parameter may be freely reordered.
 +
 +===== Vote =====
 +
 +Voting opened 2020-07-10 and closes 2020-07-24. A 2/3 majority is required.
 +
 +<doodle title="Add named argument support?" auth="nikic" voteType="single" closed="true">
 +   * Yes
 +   * No
 +</doodle>
  
 ===== Changelog =====
  
 +  * 2020-07-06: Move alternative LSP behavior to "alternatives", it's not part of the main RFC.
 +  * 2020-07-06: Specify that call_user_func etc support named args.
 +  * 2020-07-03: Add information on internal APIs.
 +  * 2020-07-03: Explicitly mention behavior of attributes.
   * 2020-06-23: Add alternative LSP behavior.
   * 2020-06-23: Remove syntax as open question, specify use of '':''.
rfc/named_params.1592906299.txt.gz · Last modified: 2020/06/23 09:58 by nikic