PHP RFC: PHP\iterable\any() and all() on iterables

Introduction

The primitives any() and all() are a common part of many programming languages and help in avoiding verbosity or unnecessary abstractions.

For example, the following code could be shortened significantly:

// The old version
$satisfies_predicate = false;
foreach ($item_list as $item) {
    // Performs DB operations or external service requests, stops on first match by design.
    if (API::satisfiesCondition($item)) {
        $satisfies_predicate = true;
        break;
    }
}
if (!$satisfies_predicate) {
    throw new APIException("No matches found");
}
// more code....
// The new version is much shorter, more readable, and easier to review,
// without creating temporary variables or helper functions that are used in only one place.
 
// Performs DB operations or external service requests, stops on first match by design.
if (!\PHP\iterable\any($item_list, fn($item) => API::satisfiesCondition($item))) {
    throw new APIException("No matches found");
}

Proposal

Add the functions PHP\iterable\any(iterable $input, ?callable $callback = null): bool and all(...) to PHP's standard function set. (The namespace PHP\iterable was preferred in a straw poll that was sent out previously.)

namespace PHP\iterable;
 
/** Determines whether any element of the iterable satisfies the predicate. */
function any(iterable $input, ?callable $callback = null): bool {
    foreach ($input as $v) {
        if ($callback !== null ? $callback($v) : $v) {
            return true;
        }
    }
    return false;
}
/** Determines whether all elements of the iterable satisfy the predicate. */
function all(iterable $input, ?callable $callback = null): bool {
    foreach ($input as $v) {
        if (!($callback !== null ? $callback($v) : $v)) {
            return false;
        }
    }
    return true;
}
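
The behaviour described above can be checked with a short script. This is a minimal sketch rather than part of the proposal: it repeats the reference implementation so the file runs on its own, keeps the calling code in the same namespace purely for brevity, and uses a hypothetical generator counting_numbers() to make the early exit visible.

```php
<?php
namespace PHP\iterable;

// Userland copies of the reference implementation, repeated so this file is self-contained.
function any(iterable $input, ?callable $callback = null): bool {
    foreach ($input as $v) {
        if ($callback !== null ? $callback($v) : $v) {
            return true;
        }
    }
    return false;
}

function all(iterable $input, ?callable $callback = null): bool {
    foreach ($input as $v) {
        if (!($callback !== null ? $callback($v) : $v)) {
            return false;
        }
    }
    return true;
}

// Hypothetical generator (not part of the RFC) that counts how many values it has produced.
function counting_numbers(int &$produced): \Generator {
    foreach ([1, 2, 3, 4, 5] as $n) {
        $produced++;
        yield $n;
    }
}

$produced = 0;
$found = any(counting_numbers($produced), fn($n) => $n === 2);
// any() returns on the first match, so the generator only produces 1 and 2.
var_dump($found);    // bool(true)
var_dump($produced); // int(2)

// Without a callback, each value is tested for truthiness.
var_dump(any([0, '', null])); // bool(false)
var_dump(all([1, 'a', [0]])); // bool(true)

// Empty input: any() is false, while all() is vacuously true.
var_dump(any([])); // bool(false)
var_dump(all([])); // bool(true)
```

Because both functions accept any iterable, the same calls work for arrays, generators, and other Traversable objects.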

This proposal recommends adding PHP\iterable\any() and PHP\iterable\all() to the standard library instead of a PECL or composer library for the following reasons:

  1. New contributors to projects wouldn't know about any() and all() if those functions were reimplemented in various composer libraries or util.php files with different semantics/names and only occasionally used.
  2. If this were provided only in userland, there'd be low adoption and code such as the above example (API::satisfiesCondition()) would remain common.
  3. If the standard library provided it, then polyfills for newer PHP functionality could adopt this as well, making cleaner code easier to write.

Secondary Vote: any()/all() or any_value()/all_values()

A secondary vote will be held on whether to name this any()/all() or any_value()/all_values().

PHP is unique in that its primitive array-like type, array, is also a dictionary, which often makes the keys significant (strings, numeric identifiers, etc). Existing function names vary in whether they state explicitly that they act only on values.

Many other programming languages have gone with a short name for the default of checking if a value is in a collection.

Benefits of a shorter name:

  1. Conciseness for the most common use of any* and all*
  2. Consistency with some other functions such as array_reduce(), in_array(), next() that act only on values
  3. Potential to use $flags to extend this to support less common use cases like ARRAY_FILTER_USE_KEY/ARRAY_FILTER_USE_BOTH without adding more global functions

Benefits of a longer name:

  1. Descriptiveness/harder to misread.
  2. More likely to add multiple functions such as any_key()/(any_entry or any_key_value) in the future (which will be simpler to statically analyze or infer types for - in rare cases the argument $flags passed to array_filter($values, $callback, $flags) is an unknown dynamic value).
    Note that adding the name any() for values of iterables doesn't prevent PHP from adding any_key() for checking keys of iterables in the future, either.

Backward Incompatible Changes

Any userland functions declared as PHP\iterable\any() or PHP\iterable\all() without a !function_exists() check would encounter duplicate function errors. Because the PHP namespace is reserved for internal use by PHP, this is unlikely.
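
For illustration, a userland fallback could guard its declaration as in the following sketch. The function body is simply the reference implementation from this RFC; nothing here is an existing library's code.

```php
<?php
namespace PHP\iterable;

// Declare the userland fallback only when no function of that name exists yet,
// so the file keeps loading on a PHP build that already provides any().
if (!\function_exists('PHP\iterable\any')) {
    function any(iterable $input, ?callable $callback = null): bool
    {
        foreach ($input as $value) {
            if ($callback !== null ? $callback($value) : $value) {
                return true;
            }
        }
        return false;
    }
}

var_dump(any([0, 0, 3])); // bool(true)
```

Note that a function declared inside an if block is only defined once that block has executed, so the guard has to run before the first call.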

Proposed PHP Version(s)

8.1

Future Scope

Add int $flag = 0?

Similar to array_filter, int $flag = 0 could be used to control which parameters get passed to the predicate such as ARRAY_FILTER_USE_BOTH and ARRAY_FILTER_USE_KEY.
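
As a rough sketch of what such a flag could look like, the hypothetical helper any_with_flag() below mirrors the argument order array_filter() uses for its callback. The helper name and its exact behaviour are assumptions made for illustration only; they are not part of this RFC.

```php
<?php
// Hypothetical helper, not part of this RFC: any() with an array_filter()-style $flag.
function any_with_flag(iterable $input, callable $callback, int $flag = 0): bool
{
    foreach ($input as $key => $value) {
        if ($flag === ARRAY_FILTER_USE_KEY) {
            $matched = $callback($key);          // the predicate sees only the key
        } elseif ($flag === ARRAY_FILTER_USE_BOTH) {
            $matched = $callback($value, $key);  // value first, then key, as in array_filter()
        } else {
            $matched = $callback($value);        // default: the predicate sees only the value
        }
        if ($matched) {
            return true;
        }
    }
    return false;
}

$headers = ['Content-Type' => 'text/html', 'X-Debug' => '1'];

var_dump(any_with_flag($headers, fn($k) => str_starts_with($k, 'X-'), ARRAY_FILTER_USE_KEY)); // bool(true)
var_dump(any_with_flag($headers, fn($v, $k) => $k === 'X-Debug' && $v === '0', ARRAY_FILTER_USE_BOTH)); // bool(false)
var_dump(any_with_flag($headers, fn($v) => $v === 'text/html')); // bool(true)
```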

Because there was discussion of whether the ability to pass keys was widely useful and multiple approaches that could be used to pass the iterable key, this functionality was left out of this RFC. See https://externals.io/message/111711#111721

I like this, but I do not like the flags. I don't think they're at all useful. A lot of the other discussion in the thread seems to be needlessly complicating it, too.

all() and any() only need return booleans. Their callbacks only need return booleans. That's the point. first() makes sense to add, and it would return the first value that matches.

For the callback itself, there is work to, hopefully, add partial function application to 8.1. (No idea if it will be successful, but the effort is in progress.) If so, the upshot is that turning an arbitrary function into a single-parameter function becomes silly easy, which means functions like this can just expect a single parameter callback and be done with it. No need for extra-args or flags or whatnot.

If you want to check the keys of an array, call array_keys() first and use that.

if (any(array_keys($foo), fn($k) => $k %2)) { ... }

all(), any(), and first() all sound like good things to include, but let's not over-complicate them. We can do better today than we could in 1999...

--Larry Garfield

Add first($iterable, $callback = null, $default = null): mixed as well?

https://externals.io/message/111711#111732

If it took the default value as well it could return that (to distinguish the absence of a result from null matching the predicate). While it's useful in itself it also would enable you to pass a marker object and check the identity of that to know if no matches have been found:

$none = new stdClass;
$element = first($collection, fn($elt) => ..., $none);
if ($element === $none) {
    // nothing found
}

Calling it [iterable_]search_callback() or first_match[ing]() or find() might help distinguish this from the reset()/end()/next()/prev() family of global functions - there's more than one possible name.
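
A minimal userland sketch of such a function, under the name first() used in the heading above, might look as follows. Treating a null callback as "return the first truthy value" is an assumption borrowed from any() and all(); the discussion does not specify it.

```php
<?php
// Sketch only: returns the first value matching the predicate, or $default if none does.
// Assumption: with no callback, the first truthy value is returned.
function first(iterable $iterable, ?callable $callback = null, mixed $default = null): mixed
{
    foreach ($iterable as $value) {
        if ($callback !== null ? $callback($value) : $value) {
            return $value;
        }
    }
    return $default;
}

$none = new stdClass;

var_dump(first([1, 4, 9], fn($n) => $n > 3)); // int(4)

// A marker object distinguishes "null matched the predicate" from "nothing matched".
var_dump(first([null, 0], fn($v) => $v === null, $none) === $none); // bool(false)
var_dump(first([1, 2], fn($n) => $n > 5, $none) === $none);         // bool(true)
```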

Discussion

Alternative names

any_value() or all_values() have been suggested as alternative names: https://github.com/php/php-src/pull/6053#issuecomment-684164832

I suggest slightly different signatures, assuming we stay value-oriented:

// ...omitted
 
// with named parameters
all_values(of: [1, 3, 5, 7], satisfy: 'is_odd');
any_value(of: [0, 2, 4, 6], satisfies: 'is_prime');
 
// without named parameters
all_values([1, 3, 5, 7], 'is_odd');
any_value([0, 2, 4, 6], 'is_prime');

The naming clarifies what any and all are about--the values--and leaves room for naming functions that are key or key/value oriented.

iter_any() or iterable_any() have also been suggested as alternative names.

The main thing I'm concerned about is that once we start extending this area (I assume that any & all are not going to be the last additions in this space) we will quickly run into function names that are either too generic or outright collide. For example, what if we want to add an iterator-based version of range()? Do we really want to be forced to pull a Python and call it xrange()? That's about as good as real_range()...

As such, I think it's important to prefix these somehow, though I don't care strongly how. Could be iter_all() or iterable_all(). We might even make it iterator_all() if we also adjust other existing iterator_* functions to accept iterables. I'd also be happy with iter\all() or iterable\all(), but that gets us back into namespacing discussions :)

Because any() and all() are potentially commonly used functions in the same way as count(Countable|array) and always return booleans, I preferred a short name over longer names. This also allows potentially supporting int $flags = 0 in the future, similar to what was done for array_filter().

Initially, the proposal was to add this in the global scope as iterable_all() and iterable_any().

Add find_first() instead?

I was actually working on this sort of thing recently. Technically, you can support all, any, and first by using a single function:

function find_first(iterable $of, callable($value, $key): bool $thatSatistifes): Iterator

It converts the $iterable into an Iterator, then calls the callback for each key/value pair until one returns true, and then always returns the iterator at the current position.

This allows you to know both key and value when making a decision. By returning an iterator the caller can get both key and value. By returning an iterator it can handle both the empty case and not found cases with $result->valid() === false. By returning an iterator it might be useful for processing the remainder of the list somehow. I'm not sure that in practice it would be that friendly, but it's worth pointing out for discussion at least.
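
The idea can be sketched in current PHP roughly as below. This is an interpretation of the description above, not the quoted author's code: the callable(...) parameter type from the quoted signature is not valid PHP, so a plain callable is used, and the way the iterable is normalized to an Iterator is an assumption.

```php
<?php
// Sketch: advance an iterator to the first key/value pair satisfying the predicate.
function find_first(iterable $of, callable $thatSatisfies): Iterator
{
    // Normalize arrays and other Traversable objects to a plain Iterator.
    if (is_array($of)) {
        $iterator = new ArrayIterator($of);
    } elseif ($of instanceof Iterator) {
        $iterator = $of;
    } else {
        $iterator = new IteratorIterator($of); // e.g. an IteratorAggregate
    }

    for ($iterator->rewind(); $iterator->valid(); $iterator->next()) {
        if ($thatSatisfies($iterator->current(), $iterator->key())) {
            break; // leave the iterator positioned on the match
        }
    }
    return $iterator; // valid() is false when the input was empty or nothing matched
}

$result = find_first(['a' => 1, 'b' => 20, 'c' => 3], fn($value, $key) => $value > 10);
if ($result->valid()) {
    echo $result->key(), ' => ', $result->current(), "\n"; // b => 20
}

// any() falls out of valid(); all() is "no element fails the predicate".
$anyLarge = find_first([1, 2], fn($value, $key) => $value > 5)->valid();
$allSmall = !find_first([1, 2], fn($value, $key) => !($value < 5))->valid();
var_dump($anyLarge, $allSmall); // bool(false) bool(true)
```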

Proposed Voting Choices

Add PHP\iterable\any(iterable $input, ?callable $callback = null) and PHP\iterable\all(...) (yes/no, requiring 2/3 majority)

A secondary vote requiring a simple majority will be used to decide between any()/all() and any_value()/all_values() as the name within the PHP\iterable namespace. See the discussion section for the benefits/drawbacks of those names.

References

  1. https://externals.io/message/111711 “Proposal: Adding functions any(iterable $input, ?callable $cb = null, int $use_flags=0) and all(...)”
  2. https://externals.io/message/103357 “[PATCH] Implementing array_every() and array_any()”
  3. https://externals.io/message/111756 “[RFC] Global functions any() and all() on iterables”

Rejected Features

Adding flags like those of array_filter() was left out of this RFC due to debate over how often they would be used in practice, and was moved to future scope.

Changelog

0.3: Add more quotes
0.4: Change name to PHP\iterable\all and PHP\iterable\any

rfc/any_all_on_iterable.1612622236.txt.gz · Last modified: 2021/02/06 14:37 by tandre