rfc:argument_unpacking
rfc:argument_unpacking [2013/09/23 19:38] – make the example for multiple ... and mixing a bit less contrived nikic
 +====== PHP RFC: Argument Unpacking ======
 +  * Date: 2013-08-30
 +  * Author: Nikita Popov <nikic@php.net>
 +  * Status: Under Discussion
 +  * Proposed for: PHP 5.6
 +  * Patch: https://github.com/nikic/php-src/compare/variadics...splat
 +  * Mailing list discussion: http://markmail.org/message/dxae5ybjldg6pftp
  
 +===== Introduction =====
 +
 +This RFC complements the [[rfc:variadics|variadics RFC]]. It introduces a syntax for unpacking arrays and Traversables into argument lists (also known as "splat operator", "scatter operator" or "spread operator").
 +
 +As a usage example, consider a variadic method ''%%public function query($query, ...$params)%%''. You are provided a ''$query'' and an array of ''$params'' and want to call the method using these. Currently this is possible using ''call_user_func_array()'':
 +
 +<code php>
 +call_user_func_array([$db, 'query'], array_merge(array($query), $params));
 +</code>
 +
 +This RFC proposes a syntax for unpacking arguments directly in the call syntax:
 +
 +<code php>
 +$db->query($query, ...$params);
 +</code>
 +
 +===== Proposal =====
 +
 +An argument in a function call that is prefixed by ''%%...%%'' will be "unpacked": instead of passing the argument itself to the function, the elements it contains are passed as individual arguments. This works for both arrays and Traversables.
 +
 +As such all of the following function calls are equivalent:
 +
 +<code php>
 +function test(...$args) { var_dump($args); }
 +
 +test(1, 2, 3);                         // [1, 2, 3]
 +test(...[1, 2, 3]);                    // [1, 2, 3]
 +test(...new ArrayIterator([1, 2, 3])); // [1, 2, 3]
 +
 +// Note: It doesn't really make sense to unpack a constant array like [1, 2, 3].
 +//       Normally these would unpack some variable like ...$args
 +</code>
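 +
 +Generators are Traversables too, so their yielded values can be unpacked in the same way. A minimal sketch, assuming the variadics syntax from the companion RFC is available; ''collect()'' and ''countTo()'' are hypothetical names used only for illustration:
 +
 +<code php>
 +// Hypothetical helper: returns its arguments as an array.
 +function collect(...$args) { return $args; }
 +
 +// Hypothetical generator yielding the integers 1..$n.
 +function countTo($n) {
 +    for ($i = 1; $i <= $n; $i++) {
 +        yield $i;
 +    }
 +}
 +
 +// The generator is consumed and each yielded value becomes one argument.
 +$result = collect(...countTo(3));
 +var_dump($result === [1, 2, 3]); // bool(true)
 +</code>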
 +
 +It's possible to use ''%%...%%'' multiple times in a call and it is possible to mix it with normal arguments:
 +
 +<code php>
 +$args1 = [1, 2, 3];
 +$args2 = [4, 5, 6];
 +test(...$args1, ...$args2); // [1, 2, 3, 4, 5, 6]
 +test(1, 2, 3, ...$args2);   // [1, 2, 3, 4, 5, 6]
 +test(...$args1, 4, 5, 6);   // [1, 2, 3, 4, 5, 6]
 +</code>
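 +
 +Multiple unpacks are handy when the arguments come from separate sources. The sketch below uses the built-in functions ''max()'' and ''sprintf()'' purely as convenient targets; the variable names are illustrative:
 +
 +<code php>
 +$first  = [3, 1];
 +$second = [10, 7];
 +
 +// Both arrays are flattened into one argument list, in order.
 +$largest = max(...$first, ...$second);
 +var_dump($largest); // int(10)
 +
 +// A fixed leading argument followed by an unpacked array.
 +$label = sprintf('%s-%s-%s', 'a', ...['b', 'c']);
 +var_dump($label); // string(5) "a-b-c"
 +</code>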
 +
 +The ''%%...%%'' operator works in all argument lists, including ''new'' expressions:
 +
 +<code php>
 +fn(...$args);
 +$fn(...$args);
 +$obj->fn(...$args);
 +ClassName::fn(...$args);
 +new ClassName(...$args);
 +</code>
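 +
 +A self-contained sketch of these call forms follows. The ''add()'' function and the ''Point'' class are hypothetical and exist only for this illustration:
 +
 +<code php>
 +// Hypothetical function and class used only for illustration.
 +function add($a, $b) { return $a + $b; }
 +
 +class Point {
 +    public $x;
 +    public $y;
 +    public function __construct($x, $y) { $this->x = $x; $this->y = $y; }
 +    public function plus($dx, $dy) { return new Point($this->x + $dx, $this->y + $dy); }
 +    public static function of($x, $y) { return new Point($x, $y); }
 +}
 +
 +$args = [3, 4];
 +$name = 'add';
 +
 +var_dump(add(...$args));   // direct function call:   int(7)
 +var_dump($name(...$args)); // variable function call: int(7)
 +
 +$p = new Point(...$args);  // constructor call
 +$q = $p->plus(...$args);   // method call
 +$r = Point::of(...$args);  // static method call
 +
 +var_dump($q->x, $q->y);    // int(6) int(8)
 +var_dump($r->x, $r->y);    // int(3) int(4)
 +</code>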
 +
 +Argument unpacking is not limited to variadic functions; it can also be used with "normal" functions:
 +
 +<code php>
 +function test($arg1, $arg2, $arg3 = null) {
 +    var_dump($arg1, $arg2, $arg3);
 +}
 +
 +test(...[1, 2]);       // 1, 2
 +test(...[1, 2, 3]);    // 1, 2, 3
 +test(...[1, 2, 3, 4]); // 1, 2, 3 (remaining arg is not captured by the function declaration)
 +</code>
 +
 +If you try to unpack something that is not an array or Traversable, a warning is thrown, but apart from that the call continues as usual:
 +
 +<code php>
 +var_dump(1, 2, ...null, 3, 4);
 +// Warning: Only arrays and Traversables can be unpacked
 +// int(1) int(2) int(3) int(4)
 +</code>
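 +
 +When a value may or may not be unpackable, it can be checked before the call so that no diagnostic is raised at all. A minimal sketch; ''unpackable()'' and ''collect()'' are hypothetical helpers and not part of this proposal:
 +
 +<code php>
 +// Hypothetical helper: returns $value if it can be unpacked,
 +// otherwise an empty array so that no arguments are passed for it.
 +function unpackable($value) {
 +    if (is_array($value) || $value instanceof Traversable) {
 +        return $value;
 +    }
 +    return [];
 +}
 +
 +// Hypothetical helper: returns its arguments as an array.
 +function collect(...$args) { return $args; }
 +
 +var_dump(collect(1, 2, ...unpackable(null)) === [1, 2]);         // bool(true)
 +var_dump(collect(1, 2, ...unpackable([3, 4])) === [1, 2, 3, 4]); // bool(true)
 +</code>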
 +
 +==== By-reference passing ====
 +
 +If an array is unpacked, the elements will be passed by-value or by-reference according to the function definition:
 +
 +<code php>
 +function test($val1, $val2, &...$refs) {
 +    foreach ($refs as &$ref) ++$ref;
 +}
 +
 +$array = [1, 2, 3, 4, 5];
 +test(...$array);
 +var_dump($array); // [1, 2, 4, 5, 6]
 +</code>
 +
 +By-reference passing will not work if the unpacked entity is a Traversable. In that case an ''E_STRICT'' level error is thrown and the argument is passed by-value instead:
 +
 +<code php>
 +test(...new ArrayIterator([1, 2, 3, 4, 5]));
 +// Strict standards: Cannot pass by-reference argument 3 of test() by unpacking a Traversable, passing by-value instead
 +</code>
 +
 +The reasons why we can't pass by-reference from a Traversable are two-fold: 
 +
 +  * It's not possible to determine the number of elements in a Traversable ahead of time. As such we can not know whether unpacking the Traversable will or will not hit a by-reference argument.
 +  * It's not possible to determine if a Traversable has support for by-reference iteration or if it will trigger an error if this is requested.
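 +
 +If by-reference semantics are needed, one possible workaround is to copy the Traversable into an array first and unpack that array. A minimal sketch, assuming the elements can be materialized up front; ''incrementAll()'' is a hypothetical function:
 +
 +<code php>
 +// Hypothetical function taking all of its arguments by reference.
 +function incrementAll(&...$refs) {
 +    foreach ($refs as &$ref) {
 +        ++$ref;
 +    }
 +}
 +
 +$iterator = new ArrayIterator([1, 2, 3]);
 +
 +// Passing false discards the iterator's keys and yields a plain list.
 +$values = iterator_to_array($iterator, false);
 +
 +// $values is an array, so its elements can be passed by reference.
 +// Only the copy is modified, not the original iterator.
 +incrementAll(...$values);
 +var_dump($values === [2, 3, 4]); // bool(true)
 +</code>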
 +
 +===== Backward Compatibility =====
 +
 +This change does not break userland or internal compatibility.
 +
 +===== Advantages over call_user_func_array =====
 +
 +Usage of ''call_user_func_array'' becomes complicated if you need to pass fixed arguments as well. Compare:
 +
 +<code php>
 +call_user_func_array([$db, 'query'], array_merge(array($query), $params));
 +// vs
 +$db->query($query, ...$params);
 +</code>
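 +
 +The two calls can be checked against each other with a stand-in class. In this sketch ''FakeDb'' is hypothetical and simply returns what it receives:
 +
 +<code php>
 +// Hypothetical stand-in for a database class with a variadic query() method.
 +class FakeDb {
 +    public function query($query, ...$params) {
 +        return [$query, $params];
 +    }
 +}
 +
 +$db     = new FakeDb();
 +$query  = 'SELECT * FROM users WHERE id = ? AND active = ?';
 +$params = [42, true];
 +
 +$viaCallback = call_user_func_array([$db, 'query'], array_merge([$query], $params));
 +$viaUnpack   = $db->query($query, ...$params);
 +
 +// Both forms deliver the same arguments to query().
 +var_dump($viaCallback === $viaUnpack); // bool(true)
 +</code>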
 +
 +''call_user_func_array'' requires a callback. So even if the called function/method is known, you still need to use a dynamic string/array callback. This usually precludes any IDE support.
 +
 +''call_user_func_array'' does not work for constructors. Instead ''ReflectionClass::newInstanceArgs()'' has to be used:
 +
 +<code php>
 +(new ReflectionClass('ClassName'))->newInstanceArgs($args);
 +// vs
 +new ClassName(...$args);
 +</code>
 +
 +Furthermore, ''call_user_func_array'' has a rather large performance impact. If a large number of calls go through it, this can make a significant difference. For this reason, projects ((I've seen this used at least in Laravel and Drupal and a bunch of other code)) often replace particularly common ''call_user_func_array'' calls with a switch statement of the following form:
 +
 +<code php>
 +switch (count($args)) {
 +    case 0: $func(); break;
 +    case 1: $func($args[0]); break;
 +    case 2: $func($args[0], $args[1]); break;
 +    case 3: $func($args[0], $args[1], $args[2]); break;
 +    case 4: $func($args[0], $args[1], $args[2], $args[3]); break;
 +    case 5: $func($args[0], $args[1], $args[2], $args[3], $args[4]); break;
 +    default: call_user_func_array($func, $args); break;
 +}
 +</code>
 +
 +The ''%%...%%'' argument unpacking syntax is about 3.5 to 4 times faster than ''call_user_func_array'', which solves the performance issue. [[https://gist.github.com/nikic/6390366|Benchmark code and results]].
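 +
 +A rough way to try such a comparison locally is a timing loop like the one below. This is only a sketch and not the linked benchmark: the ''target()'' function, the iteration count and the use of ''microtime()'' are assumptions, and the measured ratio will vary with the PHP version and the hardware:
 +
 +<code php>
 +// Hypothetical function to be called; the body is deliberately trivial.
 +function target($a, $b, $c) { return $a + $b + $c; }
 +
 +$args       = [1, 2, 3];
 +$iterations = 100000;
 +
 +// Time the calls that go through call_user_func_array().
 +$start = microtime(true);
 +for ($i = 0; $i < $iterations; $i++) {
 +    call_user_func_array('target', $args);
 +}
 +$cufaTime = microtime(true) - $start;
 +
 +// Time the same calls using argument unpacking.
 +$start = microtime(true);
 +for ($i = 0; $i < $iterations; $i++) {
 +    target(...$args);
 +}
 +$unpackTime = microtime(true) - $start;
 +
 +printf("call_user_func_array: %.4fs\n", $cufaTime);
 +printf("argument unpacking:   %.4fs\n", $unpackTime);
 +</code>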
 +
 +Lastly, it seems that people naturally expect that this syntax is present if the variadics syntax is present. So if we implement variadics, it's probably best to include this as well.
 +
 +===== Examples =====
 +
 +The code samples in the "Proposal" section are rather technical and not code you would actually write. This section contains a few more practical examples of this feature.
 +
 +==== Extending variadic functions: forwarding ====
 +
 +The introduction already mentioned ''%%$db->query($query, ...$params)%%'' as an example. At this point you could wonder: Why would I want to write code like that? Why should I have the parameters only as an array?
 +
 +One case where this occurs is when extending variadic functions:
 +
 +<code php>
 +class MySqlWithLogging extends MySql {
 +    protected $logger;
 +    public function query($query, ...$params) {
 +        $this->logger->log(
 +            'Running query "%s" with parameters [%s]',
 +            $query, implode(', ', $params)
 +        );
 +        
 +        return parent::query($query, ...$params);
 +    }
 +}
 +</code>
 +
 +The above code sample extends the variadic ''query()'' method with logging and needs to forward all arguments to the parent function.
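 +
 +Forwarding is also useful in generic wrappers that do not know the wrapped method's signature in advance. A minimal sketch using ''%%__call()%%''; the ''LoggingProxy'' and ''Calculator'' classes are hypothetical:
 +
 +<code php>
 +// Hypothetical class with a variadic method.
 +class Calculator {
 +    public function sum(...$numbers) { return array_sum($numbers); }
 +}
 +
 +// Hypothetical proxy that records method names and forwards every call.
 +class LoggingProxy {
 +    private $inner;
 +    public $log = [];
 +
 +    public function __construct($inner) { $this->inner = $inner; }
 +
 +    // $args arrives as an array; unpacking restores the original argument list.
 +    public function __call($method, $args) {
 +        $this->log[] = $method;
 +        return $this->inner->$method(...$args);
 +    }
 +}
 +
 +$proxy = new LoggingProxy(new Calculator());
 +var_dump($proxy->sum(1, 2, 3));     // int(6)
 +var_dump($proxy->log === ['sum']);  // bool(true)
 +</code>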
 +
 +==== Partial application: multiple unpacks ====
 +
 +Some people were wondering on what occasion you would ever want to unpack //two// arguments in one function call. One such use is "partial application".
 +
 +If you are not familiar with the concept, partial application allows you to "bind" arguments to a function:
 +
 +<code php>
 +$arrayToLower = bind('array_map', 'strtolower');
 +
 +$arrayToLower(['Foo', 'BAR', 'baZ']); // returns ['foo', 'bar', 'baz']
 +
 +// The above $arrayToLower call resolves to:
 +// array_map('strtolower', ['Foo', 'BAR', 'baZ'])
 +</code>
 +
 +This is a common paradigm in functional programming, but it is rarely used in PHP. An "old-style" definition of the ''bind()'' function (no variadic syntax, no argument unpacking) would look like this:
 +
 +<code php>
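 +function bind(callable $function) {
 +    // Everything after the first argument is bound up front.
 +    $boundArgs = array_slice(func_get_args(), 1);
 +    return function() use ($function, $boundArgs) {
 +        // Bound arguments come first, then the call-time arguments.
 +        return call_user_func_array(
 +            $function, array_merge($boundArgs, func_get_args())
 +        );
 +    };
 +}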
 +</code>
 +
 +And the "new-style" definition (with variadic syntax and argument unpacking) looks like this:
 +
 +<code php>
 +function bind(callable $function, ...$boundArgs) {
 +    return function(...$args) use ($function, $boundArgs) {
 +        // Bound arguments come first, then the call-time arguments.
 +        return $function(...$boundArgs, ...$args);
 +    };
 +}
 +</code>
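 +
 +The same pattern covers variations of partial application. As a sketch, a hypothetical ''bindRight()'' (not part of this RFC) places the bound arguments after the call-time arguments simply by swapping the order of the two unpacks:
 +
 +<code php>
 +// Hypothetical variant of bind(): bound arguments are appended, not prepended.
 +function bindRight(callable $function, ...$boundArgs) {
 +    return function(...$args) use ($function, $boundArgs) {
 +        return $function(...$args, ...$boundArgs);
 +    };
 +}
 +
 +// Resolves to pow(5, 2).
 +$square = bindRight('pow', 2);
 +var_dump($square(5)); // int(25)
 +
 +// Resolves to explode(',', 'a,b,c', 2).
 +$splitInTwo = bindRight('explode', 2);
 +var_dump($splitInTwo(',', 'a,b,c') === ['a', 'b,c']); // bool(true)
 +</code>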
 +
 +===== Patch =====
 +
 +The diff can be found here: https://github.com/nikic/php-src/compare/variadics...splat
 +
 +The patch is based on the variadics implementation, but this feature can also be implemented without it.
 +
 +===== Support in other languages =====
 +
 +This feature is supported by many languages. Some of the more important ones are:
 +
 +  * [[http://docs.python.org/2/tutorial/controlflow.html#unpacking-argument-lists|Python]] using the ''*args'' syntax
 +  * [[http://endofline.wordpress.com/2011/01/21/the-strange-ruby-splat/#calling_methods|Ruby]] using the same ''*args'' syntax as Python
 +  * Java, but only for variadic parameters and without any special syntax (an array is unpacked based on its type)
 +  * JavaScript ([[http://wiki.ecmascript.org/doku.php?id=harmony:spread|ECMAScript Harmony]]) using the same syntax proposed here
rfc/argument_unpacking.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1