
PHP RFC: Introduce array_group and array_group_pair grouping functions

Introduction

We consider the grouping of elements within an array a very basic and common functionality.

(The idea behind this RFC has already been discussed on the mailing list; see References.)

Almost all mainstream programming languages provide built-in grouping functionality.

Until now, the usual approach in PHP has been to write a custom grouping function in userland. This has several drawbacks:

  • It hurts the developer experience, because the same helper code ends up duplicated across codebases
  • It is slower: one benchmark shows about a 25% improvement for a C implementation over equivalent PHP code (more benchmarks can/will be added later)

Proposal

There are many ways to perform a grouping of an array. As discussed on the mailing list, two of the most common ones are:

  • The JavaScript/Scala/etc. approach - A function accepting a callback that takes one argument (the element) and returns a string key naming the group in which that element is stored. The final result is a hash map from string keys to arrays of elements.
  • The Haskell approach - A function accepting a callback that takes two arguments (previous element, current element) and returns a boolean indicating whether the two elements belong in the same group. The final result is an array of arrays.

To cover both approaches, we propose implementing two functions: array_group for the first case and array_group_pair for the second.

function array_group(array $array, callable $callback): array {}
 
function array_group_pair(array $array, callable $callback): array {}

An example of calling array_group:

$arr1 = ['one', 'two', 'three'];

$groups = array_group( $arr1, function( $x ) {
  return (string) strlen( $x );
} );
// Producing ['3' => ['one', 'two'], '5' => ['three']]
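
For comparison, the key-based grouping described above can be approximated in userland. The following is only a minimal sketch, not the proposed C implementation: the name userland_array_group is hypothetical, and it assumes that the original keys of the input array are not preserved inside each group, which the RFC text does not specify.

```php
<?php
// Hypothetical userland sketch; userland_array_group is not part of the RFC.
function userland_array_group(array $array, callable $callback): array
{
    $groups = [];
    foreach ($array as $element) {
        // The callback's return value selects the group for this element.
        $key = $callback($element);
        // Assumption: elements are appended in input order and re-indexed from 0.
        $groups[$key][] = $element;
    }
    return $groups;
}

$groups = userland_array_group(['one', 'two', 'three'], function ($x) {
    return (string) strlen($x);
});

// PHP stores numeric string keys such as '3' as the integer 3.
var_dump($groups === [3 => ['one', 'two'], 5 => ['three']]); // bool(true)
```

Because PHP casts numeric string keys to integers, the keys of the result here are integers even though the callback returns strings.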

An example of calling array_group_pair:

$arr = [-1,2,-3,-4,2,1,2,-3,1,1,2];
 
$groups = array_group_pair( $arr, function( $p1, $p2 ) {
  return ($p1 > 0) == ($p2 > 0);
} );
// Producing [[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]]
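
The pairwise grouping can likewise be sketched in userland. This is an illustrative approximation under stated assumptions, not the proposed implementation: the name userland_array_group_pair is hypothetical, the callback is assumed to receive the immediately preceding element and the current one (as described in the Proposal), and an empty input is assumed to produce an empty array.

```php
<?php
// Hypothetical userland sketch; userland_array_group_pair is not part of the RFC.
function userland_array_group_pair(array $array, callable $callback): array
{
    $groups = [];
    $current = [];
    $previous = null;
    $hasPrevious = false;

    foreach ($array as $element) {
        // Start a new group when the callback says the pair does not belong together.
        if ($hasPrevious && !$callback($previous, $element)) {
            $groups[] = $current;
            $current = [];
        }
        $current[] = $element;
        $previous = $element;
        $hasPrevious = true;
    }

    // Flush the last open group (assumption: empty input yields []).
    if ($hasPrevious) {
        $groups[] = $current;
    }
    return $groups;
}

$groups = userland_array_group_pair([-1, 2, -3, -4, 2, 1, 2, -3, 1, 1, 2], function ($p1, $p2) {
    return ($p1 > 0) == ($p2 > 0);
});

var_dump($groups === [[-1], [2], [-3, -4], [2, 1, 2], [-3], [1, 1, 2]]); // bool(true)
```

Only adjacent elements are compared, so equal-signed values that are separated by a value of the opposite sign end up in different groups.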

Backward Incompatible Changes

As with any new function added to PHP, this could break codebases that already define a function named array_group or array_group_pair in the global namespace, in which case the user would receive the “Cannot redeclare function” error.

A quick GitHub search shows about 1,000 existing functions named array_group and none named array_group_pair.
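
Codebases that already ship their own array_group could avoid the redeclaration error by guarding the definition with function_exists. The sketch below shows that general pattern; the function body is a hypothetical userland version, and its behavior is assumed (not guaranteed) to match whatever a native implementation would do.

```php
<?php
// Define the userland helper only when no function of that name exists yet.
if (!function_exists('array_group')) {
    // Hypothetical userland fallback; semantics assumed from the RFC description.
    function array_group(array $array, callable $callback): array
    {
        $groups = [];
        foreach ($array as $element) {
            $groups[$callback($element)][] = $element;
        }
        return $groups;
    }
}

var_dump(array_group(['a', 'bb', 'c'], 'strlen'));
```

A guard like this only prevents the fatal error. If the userland function behaves differently from the native one, callers would still need to be reviewed.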

Proposed PHP Version(s)

Next PHP 8.x (current version is 8.2.5).

Proposed Voting Choices

Include these so readers know where you are heading and can discuss the proposed voting options.

Patches and Tests

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

RFC discussion on the mailing list: https://externals.io/message/120451

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/array_group.txt · Last modified: 2023/06/01 00:16 by bor0