PHP RFC: Function composition operators

Introduction

In object-oriented code, “composition” generally means “one object having a reference to another.” In functional programming, “composition” generally means “sticking two functions together end-to-end to make a new function.” Both are valid and useful techniques, especially in a multi-paradigm language like PHP. However, the latter is currently not directly supported by the language. This RFC aims to correct that gap, with two new operators that work on callables.

This RFC cleanly enables “point-free style,” an approach to programming that limits the use of unnecessary intermediary variables. Point-free style has been gaining popularity in JavaScript circles, so it will already be familiar to many developers who also work in that ecosystem.

Proposal

This RFC proposes two new operators:

mixed |> callable;
 
callable + callable;

The |> operator, or “pipe,” accepts a single-parameter callable on the right and passes the left-side value to it, evaluating to the callable's result.

The + operator, or “compose,” when applied to two single-parameter callables, evaluates to a new Closure object that contains the two referenced callables. When invoked with a single argument, it will pass that argument to the first callable, then the result of that will be passed to the second callable, and return the result. If either callable is already such a composed Closure, it will be “flattened” into just being a longer array of callables.

Pipe

Pipe (|>) evaluates left to right by passing the value (or expression result) on the left as the first and only parameter to the callable on the right. That is, the following two code fragments are exactly equivalent:

$result = "Hello World" |> strlen(...);
$result = strlen("Hello World");

For a single call that is not especially useful. It becomes useful when multiple calls are chained together. That is, the following two code fragments are also exactly equivalent:

$result = "Hello World"
    |> 'htmlentities'
    |> str_split(...)
    |> fn($x) => array_map(strtoupper(...), $x)
    |> fn($x) => array_filter($x, fn($v) => $v != 'O');
 
$temp = "Hello World";
$temp = htmlentities($temp);
$temp = str_split($temp);
$temp = array_map(strtoupper(...), $temp);
$temp = array_filter($temp, fn($v) => $v != 'O');
$result = $temp;
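
Until the operator exists, the equivalence can be checked in current PHP (8.1 or later) with a small stand-in. The sketch below assumes a hypothetical helper named pipe_through(), which is not part of this RFC or of PHP. It applies each callable in turn, which is what a chain of |> is described as doing:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a chain of |> operators: pass $value through
// each single-parameter callable, left to right.
function pipe_through(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        $value = $stage($value);
    }
    return $value;
}

$piped = pipe_through(
    "Hello World",
    'htmlentities',
    str_split(...),
    fn($x) => array_map(strtoupper(...), $x),
    fn($x) => array_filter($x, fn($v) => $v != 'O'),
);

// The step-by-step version from the text.
$temp = "Hello World";
$temp = htmlentities($temp);
$temp = str_split($temp);
$temp = array_map(strtoupper(...), $temp);
$temp = array_filter($temp, fn($v) => $v != 'O');

// Both hold the same array; array_filter() preserves the original keys.
var_dump($piped === $temp);
```

Note that array_filter() keeps the original keys, so the result has gaps where the 'O' entries were removed.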

The left-hand side of the pipe may be any value or expression. The right-hand side may be any valid PHP callable that takes a single parameter, or any expression that evaluates to such a callable. Functions with more than one required parameter are not allowed and will fail as if the function were called normally with insufficient arguments. If the right-hand side does not evaluate to a valid callable it will throw an Error.
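
These failure modes are the ones PHP already has for ordinary calls. A minimal sketch, assuming the operator behaves like a direct call of the right-hand side with one argument; the helper name call_stage() and the function name no_such_function_here are hypothetical:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for "$value |> $stage": call the stage with one argument.
function call_stage(mixed $value, mixed $stage): mixed
{
    return $stage($value);
}

// Two required parameters: fails like any call with too few arguments.
$needsTwo = fn($a, $b) => $a + $b;
try {
    call_stage(1, $needsTwo);
} catch (ArgumentCountError $e) {
    $tooFew = $e::class;
}

// Assumption drawn from the wording above: only required parameters count,
// so a callable with extra optional parameters is accepted.
$optional = fn($a, $b = 10) => $a + $b;
$sum = call_stage(1, $optional);

// Not a callable at all: the call throws an Error.
try {
    call_stage(1, 'no_such_function_here');
} catch (Error $e) {
    $notCallable = $e::class;
}

var_dump($tooFew, $sum, $notCallable);
```

ArgumentCountError is itself a subclass of Error, so code that catches Error will see both failures.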

A pipe chain is an expression, and therefore may be used anywhere an expression is valid.

Compose

The compose operator (+) is very similar conceptually to pipe, but does not invoke immediately. Rather, it creates a new callable (Closure) that composes two or more other callables. That allows a new operation to be defined simply and easily and then saved for later in a variable. Because it is “just” an operator, it is compatible with all other language features. That means, for example, conditionally building up a pipeline is just a matter of throwing if statements around as appropriate.

Additionally, += will work as well, as it is just a shorthand for adding two values and assigning the result to the first variable.

To reuse the example from pipes:

$processor = htmlentities(...)
    + str_split(...)
    + fn($x) => array_map(strtoupper(...), $x);
 
if ($some_flag) {
    $processor += fn($x) => array_filter($x, fn($v) => $v != 'O');
}
 
$result1 = $processor('some string');
$result2 = $processor('other string');

This is far more flexible than the method chaining and embedded-conditional methods often used in user-space implementations today (see below).
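
The conditional build-up can be tried in current PHP with a user-space stand-in. This sketch assumes a hypothetical compose() helper, not part of the proposal, playing the role of +, with plain reassignment playing the role of +=. The build_processor() wrapper is also only for illustration:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the proposed + operator: returns a Closure that
// feeds its argument through each callable, left to right.
function compose(callable ...$fns): Closure
{
    return function (mixed $arg) use ($fns): mixed {
        foreach ($fns as $fn) {
            $arg = $fn($arg);
        }
        return $arg;
    };
}

function build_processor(bool $dropO): Closure
{
    $processor = compose(
        htmlentities(...),
        str_split(...),
        fn($x) => array_map(strtoupper(...), $x),
    );

    if ($dropO) {
        // Stand-in for: $processor += fn($x) => array_filter(...);
        $processor = compose($processor, fn($x) => array_filter($x, fn($v) => $v != 'O'));
    }

    return $processor;
}

$withFilter = build_processor(true);
$withoutFilter = build_processor(false);

$result1 = $withFilter('foo');    // [0 => 'F']
$result2 = $withoutFilter('foo'); // ['F', 'O', 'O']
```

Unlike the proposed operator, this stand-in nests the earlier Closure instead of flattening it into one list of callables. The observable result is the same, at the cost of one extra call per nesting level.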

Both the left and right side of the compose operator may be any valid PHP callable that takes a single parameter, or any expression that evaluates to such a callable. Functions with more than one required parameter are not allowed and will fail as if the function were called normally with insufficient arguments. If either side does not evaluate to a valid callable it will throw an Error.

As compose is an expression, it may be used anywhere an expression is valid.

The result of adding two callables is an object approximately equivalent to (though this isn't quite the implementation):

readonly class ComposedClosure
{
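    // The callables to run, in order. A readonly property cannot have a
    // default value, so it is set once when the object is built.
    private array $callables;
 
    // This is called when the left-side variable is not a ComposedClosure yet.
    public static function make(callable $left, callable $right): self
    {
        $new = new self();
        $new->callables = [$left, $right];
        return $new;
    }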
 
    public function __invoke(mixed $arg): mixed
    {
        foreach ($this->callables as $fn) {
            $arg = $fn($arg);
        }
        return $arg;
    }
 
    // This is an operator.
    public function add(callable $next): self
    {
        $new = new self();
        $new->callables = [...$this->callables, $next];
        return $new;
    }
}

Callable styles

Both pipe and compose support any callable syntax supported by PHP. At present, the most common form is first-class-callables (eg, strlen(...)), which dovetails with this syntax very cleanly. Should further improvements be made in the future, such as a revised Partial Function Application RFC, they would be supported naturally.
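
To make “any callable syntax” concrete, the sketch below runs one value through each callable form PHP currently offers. The Text class and its methods are hypothetical, and a plain loop stands in for a chain of |> operators:

```php
<?php
declare(strict_types=1);

// Hypothetical class, used only to provide method-based callables.
final class Text
{
    public static function shout(string $s): string
    {
        return strtoupper($s) . '!';
    }

    public function wrap(string $s): string
    {
        return "[$s]";
    }

    public function __invoke(string $s): string
    {
        return strrev($s);
    }
}

$text = new Text();

// Each entry is a different kind of PHP callable taking one parameter.
$stages = [
    'trim',                   // function name as a string
    strtolower(...),          // first-class callable
    'Text::shout',            // static method as a string
    [$text, 'wrap'],          // array callable: object and method name
    $text->wrap(...),         // first-class callable from a method
    $text,                    // invokable object
    fn(string $s) => "<$s>",  // arrow function (Closure)
];

// Stand-in for: "  Hello " |> 'trim' |> strtolower(...) |> ...
$value = "  Hello ";
foreach ($stages as $stage) {
    $value = $stage($value);
}

echo $value, "\n"; // <]]!OLLEH[[>
```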

Syntax choice

The use of |> for pipe is obvious. F#, Elixir, and OCaml all use that operator already for the exact same behavior. There has been a long-standing discussion in JavaScript about adding a |> operator as described here. It is the standard operator for this task.

For composition, there is a bit less standardization. Haskell uses ., but it reads “backwards” from what is proposed here. F# uses >>, as does Ruby.

The main limitation is that both strings and arrays may be valid callables in PHP, so the operator chosen must be distinguishable from those use cases.

. is the string concatenation operator, meaning 'htmlentities' . 'strlen' would be ambiguous. Does that concatenate two strings, or compose two functions? It's unclear, and therefore not a viable option.

>> is a bit-wise shift operator, which is currently useless except on numeric values. It provides no behavior for strings, arrays, or objects (eg, Closures), making it a safe addition. However, it has no natural “append” behavior to update a ComposedClosure in place. The natural first choice would be >>=, but that is well-known as the “bind” operator in Haskell, which does something more advanced. We're hesitant to claim that operator for a different behavior, especially when it may be useful for a bind operator in the future.

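+ naturally provides += as an append, and is fairly self-evident to read. On non-numeric strings such as function names it has no useful behavior: in PHP 8 the addition throws a TypeError. Arrays are the complication. + is already the array union operator, so [$obj1, 'foo'] + [$obj2, 'bar'] currently evaluates to [$obj1, 'foo'], because keys 0 and 1 already exist on the left and the right-hand elements are discarded. (The “Array to string conversion” warning and the string “ArrayArray” come from the . operator, not from +.) The union of two callable arrays is functionally useless, but it is defined behavior rather than an error condition, so redefining it for callables has to be weighed as a change to existing semantics.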
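
The behavior of the candidate operators on callable-shaped operands can be checked directly in current PHP. The sketch below records what each combination does in PHP 8.1 or later. The attempt() helper is hypothetical, and the ArrayObject instances are placeholders for any object with public methods:

```php
<?php
declare(strict_types=1);

// Returns the result of $op(), or the class name of the Throwable it raised.
function attempt(callable $op): mixed
{
    try {
        return $op();
    } catch (Throwable $e) {
        return $e::class;
    }
}

$s1 = 'htmlentities';
$s2 = 'strlen';
$a = [new ArrayObject(), 'count'];       // array callable
$b = [new ArrayObject(), 'getIterator']; // another array callable
$f = strlen(...);
$g = trim(...);

$results = [
    'string . string'    => attempt(fn() => $s1 . $s2),           // 'htmlentitiesstrlen'
    'string >> string'   => attempt(fn() => $s1 >> $s2),          // 'TypeError'
    'string + string'    => attempt(fn() => $s1 + $s2),           // 'TypeError'
    'array >> array'     => attempt(fn() => $a >> $b),            // 'TypeError'
    'array + array'      => attempt(fn() => ($a + $b) === $a),    // true: union keeps the left side
    'Closure >> Closure' => attempt(fn() => $f >> $g),            // 'TypeError'
    'Closure + Closure'  => attempt(fn() => $f + $g),             // 'TypeError'
];

var_export($results);
```

Of the combinations shown, only string concatenation and array union currently produce a value; every other pairing is an error today.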

Existing implementations

Multiple user-space libraries exist in PHP that attempt to replicate pipe-like or compose-like behavior. All are clunky and complex by necessity compared to a native solution. There is clear demand for this functionality, but user-space's ability to provide it is currently limited. This list has only grown since the Pipes v2 RFC, indicating an even stronger benefit to the PHP ecosystem with a solid built-in composition syntax.

  • The PHP League has a Pipeline library that encourages wrapping all functions into classes with an __invoke() method to allow them to be referenced, and using a ->pipe() call for each step.
  • Laravel includes an Illuminate/Pipeline package that has an even more cumbersome syntax.
  • The PHP Standard Library (PSL) includes a pipe function, though it is more of a function concatenation operation.
  • Sebastiaan Luca has a pipe library that works through abuse of the __call method. It appears to work only for named functions, not for arbitrary callables.
  • PipePie is another very similar implementation to the previous ones.
  • ZenPipe is a newcomer that also uses a method named pipe() for what is actually a composition operation.
  • Crell/fp provides pipe() and compose() functions that take an array of callables. While the lightest-weight option on this list, that makes dynamically-built pipelines or compositions more cumbersome than the syntax proposed here.
  • Various blogs speak of “the Pipeline Pattern” (for example), or more recently, “Creating a type-safe pipe() in PHP”.

Those libraries would be mostly obsoleted by this RFC, with a more compact, more universal, better-performing syntax.

Why in the engine?

The biggest limitation of any user-space implementation is performance. Even the most minimal implementation (Crell/fp) requires adding 2-3 function calls to every operation, which is relatively expensive in PHP. A native implementation would not have that additional overhead. Crell/fp also results in somewhat awkward function nesting, like this:

pipe($someVal,
    htmlentities(...),
    str_split(...),
    fn($x) => array_map(strtoupper(...), $x),
    fn($x) => array_filter($x, fn($v) => $v != 'O'),
);
 
// (Or worse if you need conditional stages.)

More elaborate implementations tend to involve magic methods (which are substantially slower than normal function/method calls) or multi-layer middlewares, which are severe overkill for sticking two functions together.

A native implementation eliminates all of those challenges.

Additionally, a native operator would make it much easier for static analysis tools to ensure compatible types. The SA tools would know the input value's type, in most cases the callable type on the RHS, and could compare them directly without several layers of obfuscated user-space function calls between them.

Future Scope

There are a number of potential improvements to this feature that have been left for later, as their implementation would be notably more involved than this RFC. The authors believe they would be better handled in their own RFCs.

  • Generic partial function application. While the prior RFC was declined due to its perceived use cases being insufficient to justify its complexity, there was clear interest in it, and it would vastly improve the usability of function composition. If a less complex implementation can be found, it would most likely pass and complement this RFC well.
  • A __bind method or similar on objects, possibly with a dedicated operator of its own (such as >>=). If implemented by an object on the left-hand side, the right-hand side would be passed to that method to invoke as it sees fit. Such a feature would be sufficient to support arbitrary monadic behavior in PHP in a type-friendly way.
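
As a rough illustration of the second item, the sketch below shows the kind of object such a hook might dispatch to. Everything in it is hypothetical: the Maybe class, the bind() method name, and the sample data are assumptions for illustration, and an ordinary method call stands in for any future operator:

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper that stops invoking callables once a value is null.
final class Maybe
{
    private function __construct(
        private readonly mixed $value,
        private readonly bool $present,
    ) {}

    public static function of(mixed $value): self
    {
        return new self($value, $value !== null);
    }

    // The object decides whether and how to invoke the right-hand callable.
    public function bind(callable $next): self
    {
        return $this->present ? self::of($next($this->value)) : $this;
    }

    public function getOr(mixed $default): mixed
    {
        return $this->present ? $this->value : $default;
    }
}

$users = ['alice' => ['email' => 'Alice@Example.com']];

$lookup = fn(string $name) => Maybe::of($name)
    ->bind(fn($n) => $users[$n] ?? null)
    ->bind(fn($u) => $u['email'] ?? null)
    ->bind(strtolower(...))
    ->getOr('unknown');

$found = $lookup('alice');  // 'alice@example.com'
$missing = $lookup('bob');  // 'unknown'
```

For simplicity this wrapper re-wraps whatever the callable returns, so it is closer to a null-safe map than to a strict monadic bind. A real design would need to settle that contract.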

Backward Incompatible Changes

None

Proposed PHP Version(s)

8.5

Open Issues

Still deciding between + and >> for composition.

Proposed Voting Choices

Yes or no vote. 2/3 required to pass.

Patches and Tests

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

Links to external references, discussions or RFCs

Rejected Features

rfc/function-composition.txt · Last modified: 2024/12/30 04:41 by crell