PHP RFC: Pipe operator v3
- Version: 0.9
- Date: 2025-02-05
- Author: Larry Garfield (larry@garfieldtech.com)
- Status: In Discussion
- First Published at: http://wiki.php.net/rfc/pipe-operator-v3
Introduction
In object-oriented code, “composition” generally means “one object having a reference to another.” In functional programming, “composition” generally means “sticking two functions together end-to-end to make a new function.” Both are valid and useful techniques, especially in a multi-paradigm language like PHP.
Composition generally takes two forms: immediate and delayed. The immediate execution of chained functions is typically implemented with a “pipe” operator. Delayed execution is typically implemented with a composition operator, which takes two functions and produces a new function that will call each one in turn. The combination of the two cleanly enables “point-free style,” an approach to programming that limits the use of unnecessary intermediary variables. Point-free style has been gaining popularity in JavaScript circles, so it will already be familiar to JavaScript developers who use it.
This RFC introduces the “pipe” operator, in the form used by most other languages with such functionality. A function composition operator is saved for a follow-up RFC. (See Future Scope.)
Proposal
This RFC introduces a new operator:
mixed |> callable;
The |> operator, or “pipe,” accepts a single-parameter callable on the right and passes the left-side value to it, evaluating to the callable's result.
Pipe (|>) evaluates left to right by passing the value (or expression result) on the left as the first and only parameter to the callable on the right. That is, the following two code fragments are logically equivalent:
$result = "Hello World" |> strlen(...);
$result = strlen("Hello World");
For a single call that is not especially useful. It becomes useful when multiple calls are chained together. That is, the following two code fragments are effectively equivalent:
$result = "Hello World"
    |> 'htmlentities'
    |> str_split(...)
    |> (fn($x) => array_map(strtoupper(...), $x))
    |> (fn($x) => array_filter($x, fn($v) => $v != 'O'));
$temp = "Hello World";
$temp = htmlentities($temp);
$temp = str_split($temp);
$temp = array_map(strtoupper(...), $temp);
$temp = array_filter($temp, fn($v) => $v != 'O');
$result = $temp;
The left-hand side of the pipe may be any value or expression. The right-hand side may be any valid PHP callable that takes a single parameter, or any expression that evaluates to such a callable. Functions with more than one required parameter are not allowed and will fail as if the function were called normally with insufficient arguments. If the right-hand side does not evaluate to a valid callable it will throw an Error.
A pipe chain is an expression, and therefore may be used anywhere an expression is valid.
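The semantics described above can be approximated in user space. The following is a minimal sketch, not the proposed implementation: `pipe_value()` is a hypothetical helper name introduced only for illustration, and the code assumes PHP 8.1 or later for the first-class callable syntax. It checks that feeding a value through each stage in turn gives the same result as the nested-call form.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical user-space stand-in for `$value |> $f1 |> $f2 ...`.
 * Each stage receives the previous result as its only argument.
 */
function pipe_value(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

// The chained example from this section, written with the helper.
$piped = pipe_value(
    "Hello World",
    'htmlentities',
    str_split(...),
    fn(array $x) => array_map(strtoupper(...), $x),
    fn(array $x) => array_filter($x, fn($v) => $v != 'O'),
);

// The same computation as ordinary nested calls.
$nested = array_filter(
    array_map(strtoupper(...), str_split(htmlentities("Hello World"))),
    fn($v) => $v != 'O',
);

var_dump($piped === $nested); // both forms produce the same array
```

Unlike the native operator, a helper like this adds a function call and a loop at runtime, which is part of the motivation for doing the work in the engine.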
Precedence
The pipe operator has a deliberately low binding order, so that most surrounding operators will execute first. In particular, arithmetic operations, null coalesce, and ternaries all have higher binding priority, allowing for the RHS to have arbitrarily complex expressions in it that will still evaluate to a callable. For example:
// These are equivalent.
$res1 = 5 + 2 |> someFunc(...);
$res1 = (5 + 2) |> someFunc(...);

// These are equivalent.
$res1 = 5 |> $null_func ?? defaultFunc(...);
$res1 = 5 |> ($null_func ?? defaultFunc(...));

// These are equivalent.
$res1 = 5 |> $config['flag'] ? enabledFunc(...) : disabledFunc(...);
$res1 = 5 |> ($config['flag'] ? enabledFunc(...) : disabledFunc(...));
One notable implication of this is that if a pipe chain is placed within a larger expression, it will likely need to be enclosed in () or else it will be misinterpreted.
// This
$x ? $y |> strlen(...) : $z;
// will be interpreted like this:
($x ? $y) |> (strlen(...) : $z);
// When what is most likely intended is this:
$x ? ($y |> strlen(...)) : $z;
Performance
The current implementation works entirely at the compiler level, and effectively transforms the first example above into the second at compile time. The result is that pipe has virtually no runtime overhead.
Callable styles
Pipe supports any callable syntax supported by PHP. At present, the most common form is first-class callable syntax (e.g., strlen(...)), which dovetails with this syntax very cleanly. Should further improvements be made in the future, such as a revised Partial Function Application RFC, they would be supported naturally.
References
As usual, references are an issue. Supporting pass-by-ref parameters in simple cases is quite easy, and a naive implementation would support it. However, passing a value from a compound value (an object property or array element) by reference does not work, and throws an “Argument could not be passed by reference” error. In practice, it is easier to forbid pass-by-ref parameters in pipe than to allow them.
$arr = ['a' => 'A', 'b' => 'B'];
$val = 'C';

function inc_print(&$v) {
    $v++;
    print $v;
}

// This can be made to work.
$val |> inc_print(...);

// This cannot be easily made to work, and it might not even be possible.
$arr['a'] |> inc_print(...);
That is also consistent with the typical usage patterns. The whole design of the pipe operator is that data flows through it from left to right, in a pure-functional way. Passing by reference would introduce all sorts of potential “spooky action at a distance.” In practice, there are few if any use cases where it would be appropriate in the first place.
For that reason, pass-by-ref callables are disallowed on the right-hand side of a pipe operator. That is, both examples above would error.
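A by-reference stage can usually be rewritten to take a value and return the new value, which is the shape a pipe stage expects. The sketch below is illustrative only: `incremented()` is a hypothetical function, and the pipe is written as a plain call so the code also runs on PHP versions without the operator. It also shows that an array element poses no problem once nothing is passed by reference.

```php
<?php
declare(strict_types=1);

// Hypothetical by-value replacement for a by-reference increment:
// it returns the new value instead of mutating its argument.
function incremented(int $v): int
{
    return $v + 1;
}

$val = 41;

// Stands in for `$val |> incremented(...)`.
$result = incremented($val);

// An element of a compound value works the same way, because only
// its value is read. The caller decides where the result is stored.
$arr = ['a' => 1, 'b' => 2];
$arr['a'] = incremented($arr['a']);

var_dump($result);   // int(42)
var_dump($val);      // int(41), the input is untouched
var_dump($arr['a']); // int(2)
```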
Syntax choice
F#, Elixir, and OCaml all use the |> operator already for this exact same behavior. There has been a long-standing discussion in JavaScript about adding a |> operator as described here. It is the standard operator for this task.
Use cases
The use cases for a pipe operator are varied. They include, among others, encouraging shallow-function-nesting, encouraging pure functions, expressing a complex process in a single expression, and emulating extension functions.
For example, here are some code fragments from existing projects of mine that use a user-space pipe implementation.
From Crell/Serde
use function Crell\fp\afilter;
use function Crell\fp\amap;
use function Crell\fp\explode;
use function Crell\fp\flatten;
use function Crell\fp\implode;
use function Crell\fp\pipe;
use function Crell\fp\replace;

enum Cases implements RenamingStrategy
{
    case Unchanged;
    case UPPERCASE;
    case lowercase;
    case snake_case;
    case kebab_case;
    case CamelCase;
    case lowerCamelCase;

    public function convert(string $name): string
    {
        return match ($this) {
            self::Unchanged => $name,
            self::UPPERCASE => strtoupper($name),
            self::lowercase => strtolower($name),
            self::snake_case => pipe($name,
                $this->splitString(...),
                implode('_'),
                strtolower(...)
            ),
            self::kebab_case => pipe($name,
                $this->splitString(...),
                implode('-'),
                strtolower(...)
            ),
            self::CamelCase => pipe($name,
                $this->splitString(...),
                amap(ucfirst(...)),
                implode(''),
            ),
            self::lowerCamelCase => pipe($name,
                $this->splitString(...),
                amap(ucfirst(...)),
                implode(''),
                lcfirst(...),
            ),
        };
    }

    /**
     * @return string[]
     */
    protected function splitString(string $input): array
    {
        $words = preg_split(
            '/(^[^A-Z]+|[A-Z][^A-Z]+)/',
            $input,
            -1, /* no limit for replacement count */
            PREG_SPLIT_NO_EMPTY /* don't return empty elements */
            | PREG_SPLIT_DELIM_CAPTURE /* don't strip anything from output array */
        );

        return pipe($words,
            amap(replace('_', ' ')),
            amap(explode(' ')),
            flatten(...),
            amap(trim(...)),
            afilter(),
        );
    }
}
The various imported functions are higher order functions that return a callable suitable for pipe, and the pipe() function is essentially a user-space implementation of the operator presented here. By using the pipe approach, the different case folding options can be laid out in a clean, linear fashion. The steps of each option are clearly self-evident. They can also be expressed in a single expression, which both reduces visual clutter and allows each pipe to be used in a match() arm. The result is both more compact and more at-a-glance understandable than a multi-statement approach.
From Crell/MiDy
class PageData
{
    public array $tags {
        get => pipe(
            array_merge(...$this->values('tags')),
            array_unique(...),
            array_values(...),
        );
    }

    /**
     * @param array<string, ParsedFile> $parsedFiles
     */
    public function __construct(
        private array $parsedFiles,
    ) {}

    private function values(string $property): array
    {
        return array_column($this->parsedFiles, $property);
    }
}

class ParsedFile
{
    public function __construct(public array $tags) {}
}
In this (simplified from the actual code) example, the PageData class's $tags property is the aggregation of all tags on the files it contains. Again, the use of a pipe makes the logic flow trivially easy to see visually. With a native operator, it could be further simplified to:
public array $tags {
    get => $this->values('tags')
        |> (fn($tags) => array_merge(...$tags))
        |> array_unique(...)
        |> array_values(...);
}
The single-expression alternative today would be:
public array $tags {
    get => array_values(array_unique(array_merge(...$this->values('tags'))));
}
Which I believe is inarguably worse. A multi-statement version would require:
public array $tags {
    get {
        $tags = $this->values('tags');
        $tags = array_merge(...$tags);
        $uniqueTags = array_unique($tags);
        return array_values($uniqueTags);
    }
}
Which is still less readable and less self-evident than the explicit pipe version.
Shallow calls
The use of a pipe for function composition also helps to separate closely related tasks so they can be developed and tested in isolation. For a (slightly) contrived and simple example, consider:
function loadWidget($id): Widget
{
    $record = DB::query("something");
    return makeWidget($record);
}

function loadMany(array $ids): array
{
    $data = DB::query("something");
    $ret = [];
    foreach ($data as $record) {
        $ret[] = makeWidget($record);
    }
    return $ret;
}

function makeWidget(array $record): Widget
{
    // Assume this is more complicated.
    return new Widget(...$record);
}
In this code, it is impossible to test loadWidget() or loadMany() without also executing makeWidget(). While in this trivial example that's not a huge problem, in a more complex example it often is, especially if several functions/methods are nested more deeply. Dependency injection cannot fully solve this problem, unless each step is in a separate class.
By making it easy to chain functions together, however, that can be rebuilt like this:
function loadWidget($id): array
{
    return DB::query("something");
}

function loadMany(array $ids): array
{
    return DB::query("something");
}

function makeWidget(array $record): Widget
{
    // Assume this is more complicated.
    return new Widget(...$record);
}

$widget = loadWidget(5) |> makeWidget(...);
$widgets = loadMany([1, 4, 5]) |> (fn(array $records) => array_map(makeWidget(...), $records));
And the latter could be further simplified with either a higher-order function (like amap() seen in the Serde example above) or partial function application. Those chains could also be wrapped up into their own functions/methods for trivial reuse. They can also be extended. For instance, the result of loadMany() is most likely going to be used in a foreach() loop. That's a simple further step in the chain.
$profit = loadMany([1, 4, 5])
    |> (fn(array $records) => array_map(makeWidget(...), $records))
    |> (fn(array $ws) => array_filter($ws, isOnSale(...)))
    |> (fn(array $ws) => array_map(sellWidget(...), $ws))
    |> array_sum(...);
And again, a few simple higher-order utility functions would eliminate the need for the wrapping closures.
$profit = loadMany([1, 4, 5])
    |> amap(makeWidget(...))
    |> afilter(isOnSale(...))
    |> amap(sellWidget(...))
    |> array_sum(...);
That neatly encapsulates the entire logic flow of a process in a clear, compact, highly-testable set of operations.
Pseudo-extension functions
“Extension functions” are a feature of Kotlin and C# (and possibly other languages) that allow for a function to act as though it is a method of another object. It has only public-read access, but has the ergonomics of a method. While not a perfect substitute, pipes do offer similar capability with a little more work.
For instance, the above examples included utility functions amap() and afilter(). Trivial implementations of those functions are as follows. (More robust versions that handle any iterable are available in the Crell/fp library.)
function amap(callable $c): \Closure
{
    return fn(array $a) => array_map($c, $a);
}

function afilter(callable $c): \Closure
{
    return fn(array $a) => array_filter($a, $c);
}
That allows them to be used, via pipes, in a manner similar to “scalar methods.”
$result = $array
    |> afilter(is_even(...))
    |> amap(some_transformation(...))
    |> afilter(a_filter(...));
Which is not far off from what it would look like with scalar methods:
$result = $array
    ->filter(is_even(...))
    ->map(some_transformation(...))
    ->filter(a_filter(...));
But pipes can work with any value type, object or scalar. While I do not believe pipes can completely replace extension functions or scalar methods, they provide a reasonable emulation and most of the benefits, for trivial cost.
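The array-only amap() and afilter() shown earlier in this section can be generalized to any iterable by returning generators. The following is a rough sketch under stated assumptions: `itmap()`, `itfilter()`, and `is_even()` are hypothetical names chosen for this example and are not the Crell/fp API. The stages are applied as plain calls so the code runs on PHP 8.1 or later without the operator.

```php
<?php
declare(strict_types=1);

// Hypothetical iterable-aware counterpart of amap(): lazily maps each value.
function itmap(callable $c): \Closure
{
    return static function (iterable $it) use ($c): \Generator {
        foreach ($it as $key => $value) {
            yield $key => $c($value);
        }
    };
}

// Hypothetical iterable-aware counterpart of afilter(): lazily drops
// values for which the predicate returns false. Keys are preserved.
function itfilter(callable $c): \Closure
{
    return static function (iterable $it) use ($c): \Generator {
        foreach ($it as $key => $value) {
            if ($c($value)) {
                yield $key => $value;
            }
        }
    };
}

function is_even(int $n): bool
{
    return $n % 2 === 0;
}

$numbers = new \ArrayIterator([1, 2, 3, 4, 5, 6]);

$evens = itfilter(is_even(...));
$squares = itmap(fn(int $n): int => $n * $n);

// Stands in for `$numbers |> $evens |> $squares |> iterator_to_array(...)`.
$result = iterator_to_array($squares($evens($numbers)));

var_dump($result); // keys are preserved: [1 => 4, 3 => 16, 5 => 36]
```

Because each stage is still a single-parameter callable, the same functions would slot directly into a pipe chain.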
Existing implementations
Multiple user-space libraries exist in PHP that attempt to replicate pipe-like or compose-like behavior. All are clunky and complex by necessity compared to a native solution. There is clear demand for this functionality, but user-space's ability to provide it is currently limited. This list has only grown since the Pipes v2 RFC, indicating an even stronger benefit to the PHP ecosystem with a solid built-in composition syntax.
- The PHP League has a Pipeline library that encourages wrapping all functions into classes with an __invoke() method to allow them to be referenced, and using a ->pipe() call for each step.
- Laravel includes an Illuminate/Pipeline package that has an even more cumbersome syntax.
- The PHP Standard Library (PSL) library includes a pipe function, though it is more of a function concatenation operation.
- Sebastiaan Luca has a pipe library that works through abuse of the __call method. It only works for named functions, I believe, not for arbitrary callables.
- PipePie is another very similar implementation to the previous ones.
- ZenPipe is a newcomer that also uses a method named pipe() for what is actually a composition operation.
- Crell/fp provides pipe() and compose() functions that take an array of callables. While the lightest-weight option on this list, that makes dynamically-built pipelines or compositions more cumbersome than the syntax proposed here.
- Various blogs speak of “the Pipeline Pattern” (for example), or more recently, Creating a type-safe pipe() in PHP.
Those libraries would be mostly obsoleted by this RFC (in combination with the compose follow-on, as noted in Future Scope), with a more compact, more universal, better-performing syntax.
Why in the engine?
The biggest limitation of any user-space implementation is performance. Even the most minimal implementation (Crell/fp) requires adding 2-3 function calls to every operation, which is relatively expensive in PHP. A native implementation would not have that additional overhead. Crell/fp also results in somewhat awkward function nesting, like this:
pipe($someVal,
    htmlentities(...),
    str_split(...),
    fn($x) => array_map(strtoupper(...), $x),
    fn($x) => array_filter($x, fn($v) => $v != 'O'),
);
// (Or worse if you need conditional stages.)
More elaborate implementations tend to involve magic methods (which are substantially slower than normal function/method calls) or multi-layer middlewares, which are severe overkill for sticking two functions together.
A native implementation eliminates all of those challenges.
Additionally, a native operator would make it much easier for static analysis tools to ensure compatible types. The SA tools would know the input value's type, in most cases the callable type on the RHS, and could compare them directly without several layers of obfuscated user-space function calls between them.
Future Scope
There are a number of potential improvements to this feature that have been left for later, as their implementation would be notably more involved than this RFC. The author believes they would be of benefit in their own RFCs.
A compose operator for closures (likely +). Where pipe executes immediately, compose creates a new callable (Closure) that composes two or more other Closures. That allows a new operation to be defined simply and easily and then saved for later in a variable. Because it is “just” an operator, it is compatible with all other language features. That means, for example, conditionally building up a pipeline is just a matter of throwing if statements around as appropriate. The author firmly believes that a compose operator is a necessary companion to pipe, and the functionality will be incomplete without it. However, while pipe can be implemented trivially in the compile step, a compose operator will require non-trivial runtime work. For that reason it has been split out to its own RFC.
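To illustrate the difference between immediate and delayed execution, here is a minimal user-space sketch of composition. It is only an approximation of what a compose operator might do: `compose_all()` is a hypothetical helper name, and the eventual operator's syntax and semantics are left to the follow-up RFC. The example also shows a pipeline being built up conditionally with an ordinary if statement.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns a Closure that runs the given callables
 * left to right, feeding each result into the next one.
 * Nothing executes until the returned Closure is called.
 */
function compose_all(callable ...$fns): \Closure
{
    return static function (mixed $value) use ($fns): mixed {
        foreach ($fns as $fn) {
            $value = $fn($value);
        }
        return $value;
    };
}

// Build the list of stages conditionally before composing them.
$lowercase = true;

$stages = [trim(...)];
if ($lowercase) {
    $stages[] = strtolower(...);
}
$stages[] = fn(string $s): string => str_replace(' ', '-', $s);

// The composed function can be stored and reused later.
$slugify = compose_all(...$stages);

$slug = $slugify("  Hello World ");

var_dump($slug); // string(11) "hello-world"
```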
Generic partial function application. While the prior RFC was declined due to its perceived use cases being insufficient to justify its complexity, there was clear interest in it, and it would vastly improve the usability of function composition. If a less complex implementation can be found, it would most likely pass and complement this RFC well.
A __bind method or similar on objects, possibly with a dedicated operator of its own (such as >>=). If implemented by an object on the left-hand side, the right-hand side would be passed to that method to invoke as it sees fit. Such a feature would be sufficient to support arbitrary monadic behavior in PHP in a type-friendly way.
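As a rough idea of what such a hook could enable, the sketch below models an optional value whose `bind()` method decides whether to invoke the next step. Everything here is an assumption made for illustration: the `Maybe` class, the `bind()` and `getOrElse()` method names, and the helper functions are hypothetical, and no `__bind` signature or `>>=` operator has been specified.

```php
<?php
declare(strict_types=1);

// Hypothetical optional-value container for illustration only.
final class Maybe
{
    private function __construct(
        private readonly mixed $value,
        private readonly bool $present,
    ) {}

    public static function some(mixed $value): self
    {
        return new self($value, true);
    }

    public static function none(): self
    {
        return new self(null, false);
    }

    // Stand-in for the idea of a bind hook: the object on the left
    // receives the callable and chooses whether to invoke it.
    public function bind(callable $next): self
    {
        return $this->present ? $next($this->value) : $this;
    }

    public function getOrElse(mixed $default): mixed
    {
        return $this->present ? $this->value : $default;
    }
}

function parsePositiveInt(string $s): Maybe
{
    return ctype_digit($s) && (int) $s > 0 ? Maybe::some((int) $s) : Maybe::none();
}

function halveIfEven(int $n): Maybe
{
    return $n % 2 === 0 ? Maybe::some(intdiv($n, 2)) : Maybe::none();
}

// With a dedicated operator this might read as a chain; here it is a method call.
$ok = parsePositiveInt('42')->bind(halveIfEven(...));
$bad = parsePositiveInt('abc')->bind(halveIfEven(...));

var_dump($ok->getOrElse(null)); // int(21)
var_dump($bad->getOrElse(-1));  // int(-1), the second step never ran
```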
Backward Incompatible Changes
None
Proposed PHP Version(s)
8.5
Open Issues
None
Proposed Voting Choices
Yes or no vote. 2/3 required to pass.
Patches and Tests
* PR is available here: https://github.com/php/php-src/pull/17118
Implementation
After the project is implemented, this section should contain
- the version(s) it was merged into
- a link to the git commit(s)
- a link to the PHP manual entry for the feature
- a link to the language specification section (if any)
References
Links to external references, discussions or RFCs