PHP RFC: Pipe Operator v2

Introduction

Code like the following is quite common in procedural code:

getPromotions(mostExpensiveItem(getShoppingList(getCurrentUser(), 'wishlist'), ['exclude' => 'onSale']), $holiday);

That is ugly, error-prone, and hard to read or maintain.

Generally, breaking it up as follows will improve readability:

$user = getCurrentUser();
$shoppingList = getShoppingList($user, 'wishlist');
$mostExpensiveItem = mostExpensiveItem($shoppingList, ['exclude' => 'onSale']);
$promotions = getPromotions($mostExpensiveItem, $holiday);

That, however, is still rather verbose and requires defining intermediary variables, and thus either coming up with names for them or using generic placeholder names like `$x`. The result is still error-prone, and it is possible to mix up the variable names without realizing it. It's also not intuitively obvious that what's happening is passing the output of one function to the next.

In OOP, a common answer is “well, use a fluent interface,” which might look like this:

$promotions = getCurrentUser()
    ->getShoppingList('wishlist')
    ->mostExpensiveItem(['exclude' => 'onSale'])
    ->getPromotions($holiday);

That's easier to read, but requires very specific methods on very specific objects, which are not always logical or possible to put there.

This RFC aims to improve this type of code with the introduction of a “Pipe Operator,” which is a common approach in many other languages.

Proposal

This RFC introduces a new operator |>, “pipe”. Pipe evaluates left to right by passing the value (or expression result) on the left as the first and only parameter to the callable on the right. That is, the following two code fragments are exactly equivalent:

$result = "Hello World" |> 'strlen';
$result = strlen("Hello World");

For a single call that is not especially useful. It becomes useful when multiple calls are chained together. That is, the following two code fragments are also exactly equivalent:

$result = "Hello World"
    |> 'htmlentities'
    |> 'str_split'
    |> fn($x) => array_map('strtoupper', $x)
    |> fn($x) => array_filter($x, fn($v) => $v != 'O');

$result = array_filter(
    array_map('strtoupper',
        str_split(htmlentities("Hello World"))
    ),
    fn($v) => $v != 'O'
);

The left-hand side of the pipe may be any value or expression. The right-hand side may be any valid PHP callable that takes a single parameter, or any expression that evaluates to such a callable. Functions with more than one required parameter are not allowed and will fail as if the function were called normally with insufficient arguments. If the right-hand side does not evaluate to a valid callable it will throw an Error.
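The evaluation rule described above can be approximated in current PHP with a small user-space helper. The following is only a sketch: `pipe()` is a hypothetical function, not part of this RFC or of PHP, and it emulates `|>` by passing the result of each stage to the next stage as its only argument.

```php
<?php

// Hypothetical helper emulating `$value |> $stage1 |> $stage2 ...`.
// Each stage receives the previous result as its only argument.
function pipe(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

// Equivalent to strlen("Hello World").
$length = pipe("Hello World", 'strlen');

// Equivalent to the nested array_filter(array_map(...)) call shown earlier.
$letters = pipe(
    "Hello World",
    'htmlentities',
    'str_split',
    fn($x) => array_map('strtoupper', $x),
    fn($x) => array_filter($x, fn($v) => $v != 'O'),
);

// A stage that is not a valid callable is rejected with a TypeError,
// which is a subclass of Error.
try {
    pipe("Hello World", 'no_such_function');
    $rejected = false;
} catch (\Error $e) {
    $rejected = true;
}
```

The helper requires PHP 8.0 or later because of the `mixed` type. Unlike the proposed operator, it validates every stage up front through the `callable` parameter type rather than when each stage is reached.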

While any callable style may be used, in practice the Partial Function Application RFC will enable a very clear and readable syntax for use with pipes. For instance, the previous example could instead be written:

$result = "Hello World"
    |> htmlentities(?)
    |> str_split(?)
    |> array_map(strtoupper(?), ?)
    |> array_filter(?, fn($v) => $v != 'O');

And the example from the start of this RFC could be written as:

$holiday = "Lincoln's Birthday";
$result = getCurrentUser()
   |> getShoppingList(?, 'wishlist')
   |> mostExpensiveItem(?, ['exclude' => 'onSale'])
   |> getPromotions(?, $holiday);

Functions that accept their first parameter by reference are supported, and will behave exactly as if they were called in the normal “inside out” fashion. However, unless they also return a value, they are not of much use in a pipe.

The pipe operator evaluates immediately. It does not produce a new function. However, it is simple to produce a new function by writing an arrow function:

$holiday = "Lincoln's Birthday";
$new_function = fn($user) => $user
   |> getShoppingList(?, 'wishlist')
   |> mostExpensiveItem(?, ['exclude' => 'onSale'])
   |> getPromotions(?, $holiday);
 
$new_function(getCurrentUser());
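The distinction between running a pipeline immediately and producing a reusable function can also be shown in current PHP without `|>`. In the sketch below, `compose()` is a hypothetical helper (not part of this RFC); it returns a closure and runs nothing until that closure is called, which is the behavior the arrow-function wrapper above provides for a pipe chain.

```php
<?php

// Hypothetical helper: returns a new function that runs the stages in order.
// Nothing is executed until the returned closure is called.
function compose(callable ...$stages): \Closure
{
    return function (mixed $value) use ($stages): mixed {
        foreach ($stages as $stage) {
            $value = $stage($value);
        }
        return $value;
    };
}

// A counting wrapper around trim(), used to observe when stages actually run.
$calls = 0;
$trim = function (string $s) use (&$calls): string {
    $calls++;
    return trim($s);
};

$normalize = compose($trim, 'strtolower', 'ucfirst');

$before = $calls;                        // composing alone ran no stage
$result = $normalize("  HELLO world ");  // stages run here, left to right
```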

More robust example with PSR-7

With partial functions:

ServerRequest::fromGlobals()
    |> authenticate(?)
    |> $router->resolveAction(?)
    |> fn($request) => $request->getAttribute('action')($request)
    |> renderResult(?)
    |> buildResponse(?)
    |> emit(?);

Without partial functions:

ServerRequest::fromGlobals()
    |> 'authenticate'
    |> [$router, 'resolveAction']
    |> fn($request) => $request->getAttribute('action')($request)
    |> 'renderResult'
    |> 'buildResponse'
    |> 'emit';

Alternate comprehension syntax

A prior RFC proposed a dedicated syntax for array comprehensions. The pipe operator would also address that use case, in combination with a few user-space functions (which the RFC author pledges to write and maintain a library for). For example:

// array_map() but for any iterable.
function itmap(callable $c) {
  return function(iterable $it) use ($c) {
    foreach ($it as $val) {
      yield $c($val);
    }
  };
}
 
// array_filter() but for any iterable.
function itfilter(callable $c) {
  return function(iterable $it) use ($c) {
    foreach ($it as $val) {
      if ($c($val)) {
        yield $val;
      }
    }
  };
}
 
// count(), but runs out an iterator to do so.
function itcount(iterable $it) {
  $count = 0;
  foreach ($it as $v) {
    $count++;
  }
  return $count;
}

And now comprehension-like behavior can be written using pipes:

$list = [1, 2, 3, 4, 5];
 
$new_list = $list
  |> itmap(fn($x) => $x * 2)
  |> itfilter(fn($x) => $x % 3)
  |> iterator_to_array(?);

Any combination of map, filter, reduce, or other array-oriented operation can be wrapped up this way and added to a pipe chain, allowing a similar result to comprehensions without a one-off syntax, and can be mixed-and-matched with any other callable as appropriate.
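For readers who want to try the helpers before the operator exists, the same chain can be written with ordinary nested calls. This sketch repeats compact copies of `itmap()` and `itfilter()` only so that it runs on its own; the return type declarations are an addition for illustration.

```php
<?php

// Compact copies of the helpers above, so the snippet is self-contained.
function itmap(callable $c): \Closure {
    return function (iterable $it) use ($c): \Generator {
        foreach ($it as $val) {
            yield $c($val);
        }
    };
}

function itfilter(callable $c): \Closure {
    return function (iterable $it) use ($c): \Generator {
        foreach ($it as $val) {
            if ($c($val)) {
                yield $val;
            }
        }
    };
}

$list = [1, 2, 3, 4, 5];

// Same stages as the pipe chain, applied inside out.
$doubled  = itmap(fn($x) => $x * 2)($list);        // lazy generator, nothing computed yet
$filtered = itfilter(fn($x) => $x % 3)($doubled);  // keeps values not divisible by 3
$new_list = iterator_to_array($filtered, false);   // runs the generators; [2, 4, 8, 10]
```

Because both helpers return generators, no element is processed until `iterator_to_array()` starts pulling values, which is the main practical difference from chaining `array_map()` and `array_filter()`.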

For a more robust example, the following routine would, given a directory, give a line count of all files in the directory tree that have a specific extension. (Thanks to Levi Morrison for this example.)

function getLineCount(string $directory, string $ext): int {
  $nonEmptyLines = function(\SplFileInfo $file): iterable {
    try {
      $object = $file->openFile("r");
      $object->setFlags(\SplFileObject::SKIP_EMPTY);
      yield from $object;
    } catch (\Throwable $error) {
      // File system error handling irrelevant for the moment.
    }
  };
 
  return new RecursiveDirectoryIterator($directory)
    |> new RecursiveIteratorIterator(?)
    |> itfilter(fn ($file) => $file->getExtension() == $ext)
    |> itmap($nonEmptyLines(?))
    |> itcount(?)
  ;
}
 
print getLineCount('foo/bar/baz', 'php');
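One detail to watch when adapting this routine: as the helpers are defined above, `itmap()` with `$nonEmptyLines` appears to yield one iterable of lines per file, so the lines need to be flattened before counting; otherwise the count would be the number of files. The sketch below is a version for current PHP without `|>` or partial application. `itflatten()` and `countLines()` are hypothetical names, and adding the `READ_AHEAD` and `DROP_NEW_LINE` flags is an assumption based on the documented behavior of `SplFileObject::SKIP_EMPTY`.

```php
<?php

// Hypothetical helper: flattens an iterable of iterables into one stream.
function itflatten(iterable $outer): \Generator
{
    foreach ($outer as $inner) {
        foreach ($inner as $val) {
            yield $val;
        }
    }
}

// Hypothetical stand-in for getLineCount(), written without the pipe operator.
function countLines(string $directory, string $ext): int
{
    $nonEmptyLines = function (\SplFileInfo $file): \Generator {
        $object = $file->openFile('r');
        // SKIP_EMPTY only takes effect together with READ_AHEAD; DROP_NEW_LINE
        // makes a line holding just "\n" count as empty.
        $object->setFlags(
            \SplFileObject::READ_AHEAD
            | \SplFileObject::SKIP_EMPTY
            | \SplFileObject::DROP_NEW_LINE
        );
        yield from $object;
    };

    // Walk the whole tree, skipping the "." and ".." entries.
    $files = new \RecursiveIteratorIterator(
        new \RecursiveDirectoryIterator($directory, \FilesystemIterator::SKIP_DOTS)
    );

    // One generator of lines per matching file.
    $linesPerFile = (function () use ($files, $ext, $nonEmptyLines) {
        foreach ($files as $file) {
            if ($file->getExtension() === $ext) {
                yield $nonEmptyLines($file);
            }
        }
    })();

    // Flatten before counting so that lines, not files, are counted.
    return iterator_count(itflatten($linesPerFile));
}
```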

Prior art

A previous RFC, Pipe Operator v1 from 2016 by Sara Golemon and Marcelo Camargo, proposed similar functionality. Its primary difference was that it was modeled on Hack, which allows an arbitrary expression on the right-hand side and introduces a new `$$` magic variable as a placeholder for the left-hand side. While promising, the v2 authors concluded that short lambdas made a custom one-off syntax unnecessary. The semantics proposed here are more consistent with most languages that offer a pipe operator.

Additionally, the comprehension-esque usage noted above would be infeasible with a non-callable right-hand side.

Portions of this RFC are nonetheless based on the previous iteration, and the author wishes to thank the v1 authors for their inspiration.

Comparison with other languages

Several languages already support a pipe operator, using similar or identical syntax. In practice, the semantics proposed here are closest to Elixir and F#.

Hacklang

Hack has very similar functionality, also using the `|>` operator. However, in Hack the operator's right-hand side is an arbitrary expression in which a special placeholder, `$$`, is used to indicate where the left-hand side should be injected. Effectively it becomes a one-off form of partial application.

That is atypical among languages with such functionality and introduces additional questions about what sigil to use and other implementation details. The RFC authors believe that a fully fleshed-out partial function application syntax (in a separate RFC) is superior, and integrates cleanly with this RFC.

The Hack syntax was the subject of the v1 Pipe Operator RFC.

Haskell

Haskell has a function composition operator, `.`. However, its ordering is the reverse of a pipe: `reverse . sort` is equivalent to `reverse(sort())`, not to `sort(reverse())`. It also returns a new composed callable rather than invoking immediately.

The inverse ordering is more difficult to reason about, and unfamiliar for PHP developers. The `.` operator itself would also cause confusion with the string concatenation operator, especially as strings can be callables. That is:

'hello' . 'strlen'

Could be interpreted as evaluating to “hellostrlen” or to int 5. For that reason the `.` operator is not feasible.
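The ambiguity is easy to reproduce in current PHP, where the expression is already valid string concatenation and the string `'strlen'` is also a valid callable. A short illustration (the variable names are arbitrary):

```php
<?php

// Today the dot is plain string concatenation.
$concatenated = 'hello' . 'strlen';

// The right-hand string also names a callable function.
$isCallable = is_callable('strlen');

// What an "apply" reading of the same two operands would produce.
$applied = 'strlen'('hello');
```

Here `$concatenated` is the string "hellostrlen" while `$applied` is the integer 5, so reusing `.` would force the engine to guess between two valid readings of the same operands.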

Haskell also has a `&` operator, which is the “reverse application operator.” Its semantics are essentially the same as described here, including listing functions “forward” rather than backward.

F#

F# has no fewer than four operators for applying and composing functions: Pipe forward `|>`, Pipe back `<|`, Compose forward `>>` and Compose back `<<`. The two pipe operators apply a value to a function, while the two compose operators combine two functions to produce a new function that is the composition of the specified functions. The forward and back variants allow you to put the callable on either the left or right-hand side.

The author decided that supporting both forward and back versions was too confusing. Additionally, a composition operator is unnecessary since users can simply form a short-lambda closure themselves.

That is, this RFC proposes an equivalent of only the “pipe forward” operator.

Elixir

Elixir has a pipe operator, `|>`, using essentially the same semantics as described here.

Ruby

Ruby 2.6 added a similar syntax, although more akin to F#'s compose forward and compose back operators.

JavaScript

A pipeline operator `|>` has been proposed for JavaScript. As of this writing it is still in early stages and no implementations support it, but it may get accepted in the future. The semantics are essentially the same as described here.

This RFC is deliberately kept small and contained. However, it naturally complements other RFCs under consideration, by design.

* Generic partial application. That RFC already exists, and will hopefully be approved in the near future. It allows virtually any function to be partially applied to produce a single-parameter function, which is then compatible with `|>`, and it also allows referencing an existing single-parameter function by name. The examples further up show how the two RFCs complement each other nicely.

* Short functions. The short-functions RFC would combine with pipe to make writing named functions that simply invoke a pipeline of other functions trivial and nicely compact, like so:

function handle_request(RequestInterface $request) => $request
    |> authenticate(?)
    |> $router->resolveAction(?)
    |> fn($request) => $request->getAttribute('action')($request)
    |> renderResult(?)
    |> buildResponse(?)
    |> emit(?);
 
handle_request(ServerRequest::fromGlobals());

Future Scope

This RFC suggests a number of additional improvements. They have been left for future work so as to keep this RFC focused and non-controversial. Should this RFC pass, the authors intend to attempt these follow-up improvements. (Assistance in doing so is quite welcome.)

* Iterable right-hand side. The pipe operator as presented here can only be used in a hard-coded fashion. A possible extension is to support an iterable of callables on the right-hand side, allowing for a runtime-defined pipeline.

* A `__bind` method or similar on objects. If implemented by an object on the left-hand side, the right-hand side would be passed to that method to invoke as it sees fit. Effectively this would be operator overloading, which could be part of a second attempt at full operator overloading or a one-off magic method. It could also be implemented as a separate operator instead, for clarity. Such a feature would be sufficient to support arbitrary monadic behavior in PHP in a type-friendly way.

These options are mentioned here for completeness and to give an indication of what is possible, but are *not* in scope and are *not* part of this RFC at this time.
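As an illustration of the first item only, a runtime-defined pipeline can already be approximated in user space. The sketch below is an assumption about how such a feature might behave, not a design proposed by this RFC; `pipeThrough()` is a hypothetical helper.

```php
<?php

// Hypothetical helper: applies each callable from an iterable, in order.
function pipeThrough(mixed $value, iterable $stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

// The list of stages is assembled at runtime, for example from configuration.
$uppercase = true;
$stages = ['trim', 'strrev'];
if ($uppercase) {
    $stages[] = 'strtoupper';
}

$result = pipeThrough("  abc ", $stages);  // trim, then reverse, then uppercase

// Any iterable works, including a generator that yields the stages lazily.
$lazyStages = (function () {
    yield fn($x) => $x + 1;
    yield fn($x) => $x * 2;
})();
$number = pipeThrough(3, $lazyStages);     // (3 + 1) * 2
```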

Proposed PHP Version(s)

8.1

Backward compatibility issues

None.

Proposed Voting Choices

Adopt the Pipe Operator yes/no? Requires a 2/3 majority.

Patches and Tests

PR is available here: https://github.com/php/php-src/pull/5425

(It's my first PHP PR. Please be gentle.)

rfc/pipe-operator-v2.1623160283.txt.gz · Last modified: 2021/06/08 13:51 by crell