Request for Comments: Iteration tools in PHP

This RFC proposes a series of functions or classes to facilitate easy processing of data sets represented as either arrays or Traversables. The assumed PHP version is 5.3 because of the new lambda structures that inspired this proposal.

Introduction

Most of the programs we write, regardless of the programming language we use, exist to process data. That data can be represented in various formats, but quite often it comes grouped into sets. For example, we commonly issue a database query that returns a result set, read the contents of a directory, or walk the structure of an XML document. Regardless of the data source, these result sets are represented in PHP in three ways:

In order to read these data structures PHP offers us 3 looping structures:

Depending on the task at hand, the processing inside these loops may be ridiculously easy or painfully hard. Over time, the more you do this, the more you notice a pattern emerging, and you realize there should be some “tools” to ease the job.

Why do we need tools for iteration

Iteration recurs so often that the DRY principle, and plain common sense, call for an abstraction. Thankfully, patterns in iteration were observed almost forty years ago by some very smart people, who found that iterative processes can be abstracted away into a handful of functions. For example:

The list may go on with a few other abstracted use cases.
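
As an illustration of the idea, PHP's built-in array functions already separate the iteration from the per-element work, although they only accept arrays. The sketch below uses array_map(), array_filter() and array_reduce(); the sample data is made up for the example.

```php
<?php
// Illustration only: built-in array functions that hide the loop.
// They work on arrays, not on Traversable objects.
$numbers = array(1, 2, 3, 4, 5);

// Transform every element.
$squares = array_map(function ($n) { return $n * $n; }, $numbers);

// Keep only the elements that pass a test (original keys are preserved).
$evens = array_filter($numbers, function ($n) { return $n % 2 === 0; });

// Fold the whole set into a single value.
$sum = array_reduce($numbers, function ($carry, $n) { return $carry + $n; }, 0);
```

In each call the loop itself is hidden; only the per-element logic is supplied by the caller.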

It turns out that separating the iteration from the inner data calculations is a good thing, so people came up with higher-order functions that take at least two parameters: the data set to be processed and the *function* that does the processing (which in some cases is an “unnamed” function, or lambda). Languages without support for higher-order functions made use of their best features and provided alternatives, if any. In PHP, for example, we have at our disposal the following SPL classes, which revolve around the same idea:

Shortcomings of current tools

While these classes do their job they have some shortcomings:

What I'm proposing is to introduce the following functions into the language; they are similar to the Array methods that exist in JavaScript 1.8:

Common Misconceptions

None that I know of yet.

Proposal and Patch

Pages from the Mozilla Developer Center wiki documenting these kinds of functions can be found here:

Some PHP function signatures, mostly identical to the JavaScript versions and modified where necessary because of PHP-specific aspects:

map()

walk()

walkRecursive()

reduce()

reduceRight()

filter()

some()

every()

Although the tools above are listed as functions, they could just as well be class constructors, since they don't do that much (honestly, I don't like that approach). I represented them as functions because I think functions will do just fine now that PHP has namespace support.
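
To make the function-based approach concrete, here is a minimal userland sketch of map(), filter() and reduce() that accepts either an array or a Traversable. The parameter order and the arguments handed to the callback are assumptions made for illustration, not the signatures proposed in this RFC.

```php
<?php
// Hypothetical userland sketch; names and signatures are assumptions.
// Keys are reused as array keys, so they must be integers or strings.

// Apply $callback to every element and return the results, keys preserved.
function map($items, $callback) {
    $result = array();
    foreach ($items as $key => $value) {
        $result[$key] = call_user_func($callback, $value, $key, $items);
    }
    return $result;
}

// Keep only the elements for which $callback returns a truthy value.
function filter($items, $callback) {
    $result = array();
    foreach ($items as $key => $value) {
        if (call_user_func($callback, $value, $key, $items)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// Fold the elements, first to last, into a single value.
function reduce($items, $callback, $initial = null) {
    $carry = $initial;
    foreach ($items as $key => $value) {
        $carry = call_user_func($callback, $carry, $value, $key, $items);
    }
    return $carry;
}

// The same functions accept an array or any Traversable.
$prices = new ArrayIterator(array('tea' => 3, 'coffee' => 5, 'cake' => 8));

$doubled   = map($prices, function ($price) { return $price * 2; });
$expensive = filter($prices, function ($price) { return $price > 4; });
$total     = reduce($prices, function ($sum, $price) { return $sum + $price; }, 0);
```

This sketch returns plain arrays for simplicity; an implementation could equally return iterators so that results are computed lazily.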

Additionally, one thing not shown in the signatures above is that an extra argument, representing an iteration counter, may be passed to the callback function.
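
A possible shape for the iteration counter is sketched below with some() and every(). The position of the counter in the callback's argument list, and the choice of a zero-based count, are assumptions for illustration only.

```php
<?php
// Hypothetical sketch: the callback receives the value, the key and a
// zero-based iteration counter. The argument order is an assumption.

// True as soon as one element satisfies $callback.
function some($items, $callback) {
    $i = 0;
    foreach ($items as $key => $value) {
        if (call_user_func($callback, $value, $key, $i++)) {
            return true;
        }
    }
    return false;
}

// True only if every element satisfies $callback (true for an empty set).
function every($items, $callback) {
    $i = 0;
    foreach ($items as $key => $value) {
        if (!call_user_func($callback, $value, $key, $i++)) {
            return false;
        }
    }
    return true;
}

$scores = array('ann' => 7, 'bob' => 4, 'cy' => 9);

$anyFailed = some($scores, function ($score) { return $score < 5; });
$allPassed = every($scores, function ($score) { return $score >= 5; });

// The counter is useful when keys are not sequential integers.
$positions = array();
some($scores, function ($score, $name, $i) use (&$positions) {
    $positions[$name] = $i;
    return false; // never short-circuit, so every element is visited
});
```

Callbacks that do not care about the counter can simply declare fewer parameters, as the first two calls do.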

Use cases

This is an example PHP script for reading files with the .php extension from a given directory.

<?php
 
// 1.1 How it could be done right now -------------------------------------------------
class OnlyPHPFiles extends FilterIterator {
    public function accept() {
        $ext = strtolower(pathinfo($this->current()->getRealPath(), PATHINFO_EXTENSION));
        return $ext === 'php';
    }
}
 
$dirs = array();
foreach (new OnlyPHPFiles(new DirectoryIterator(__DIR__)) as $file) {
    $dirs[] = $file;
}
 
// 1.2 Using CallbackFilterIterator (I don't know which PHP version
// it will ship with)
$dirs = new CallbackFilterIterator(new DirectoryIterator(__DIR__), function($value) {
    // A closure has no $this here, so read the path from the element it receives
    $ext = strtolower(pathinfo($value->getRealPath(), PATHINFO_EXTENSION));
    return $ext === 'php';
});
 
 
// 2.1 How it could be done with my proposal --------------------------------------
$dirs = filter(new DirectoryIterator(__DIR__), function($current, $key, $iterator) {
    $ext = strtolower(pathinfo($current->getRealPath(), PATHINFO_EXTENSION));
    return $ext === 'php';
});

While example 1.2 is very similar to 2.1, it differs in that it does not pass the iterator to the callback function and, of course, in that 2.1 uses a function instead of an object. Another difference is that the present implementation of CallbackFilterIterator (as documented on http://www.php.net/~helly/php/ext/spl/) may also be used as a virtual CallbackMapIterator: it not only filters the elements of the iterator into a new iterator, but it MAY also change their values. In my proposal, the function that changes values is map(), which translates each value into another according to the callback function, while filter() only keeps the items that satisfy the criteria in the callback function. I believe a clear distinction between these features must be reflected in the API, hence two different functions in my proposal.
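
A runnable version of the CallbackFilterIterator variant might look like the sketch below. It assumes PHP 5.4 or later, where CallbackFilterIterator shipped and where the callback receives the current element, its key, and the iterator. The temporary directory and the file names are invented for the example.

```php
<?php
// Build a throwaway directory so the example does not depend on real files.
$dir = sys_get_temp_dir() . '/iter_demo_' . uniqid();
mkdir($dir);
$created = array('index.php', 'README.txt', 'Upper.PHP');
foreach ($created as $name) {
    touch($dir . '/' . $name);
}

// The callback receives the current element, its key and the iterator.
$phpFiles = new CallbackFilterIterator(
    new DirectoryIterator($dir),
    function ($current, $key, $iterator) {
        $ext = strtolower(pathinfo($current->getFilename(), PATHINFO_EXTENSION));
        return $current->isFile() && $ext === 'php';
    }
);

// DirectoryIterator reuses one object for every entry, so copy out the
// file names instead of storing the objects themselves.
$names = array();
foreach ($phpFiles as $file) {
    $names[] = $file->getFilename();
}
sort($names);

// Clean up the temporary files.
foreach ($created as $name) {
    unlink($dir . '/' . $name);
}
rmdir($dir);
```

Because the extension is lowercased before the comparison, Upper.PHP is accepted along with index.php, while README.txt and the "." and ".." entries are rejected.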

Some additional benefits

As you have seen, my proposal includes a function called walk(), which does exactly the same thing as a foreach construct. What makes it really useful is the ability to mimic block scope inside a foreach loop. For example:

foreach ($iter as $elem) {
    // everything inside this foreach block is in the global space
}
 
// whereas
 
$global_var = 'foo';
 
walk($iter, function($elem) use ($global_var) {
    // this is not the global space
    // but we may still use variables from the global space
    // by "use"-ing them
});
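
A minimal sketch of how walk() could behave is given below. The signature and the arguments passed to the callback are assumptions, and the variable names are made up for the example.

```php
<?php
// Hypothetical sketch of walk(); the signature is an assumption.
function walk($items, $callback) {
    foreach ($items as $key => $value) {
        call_user_func($callback, $value, $key, $items);
    }
}

$prefix = 'item: ';
$lines  = array();

walk(array('a', 'b'), function ($elem) use ($prefix, &$lines) {
    $temp = $prefix . $elem;   // $temp exists only inside the closure
    $lines[] = $temp;          // $lines is shared because it is used by reference
});

// $temp never leaked into the enclosing scope.
$leaked = isset($temp);
```

Variables imported with "use" are copied by value unless marked with &, so the callback cannot accidentally overwrite $prefix in the enclosing scope.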

It would have been an advantage if our lambdas had been self-executing, just like in JavaScript, but they aren't:

// as far as I know this doesn't work in PHP 5.3
foreach ($iter as $elem) {
    (function($elem) use ($global_var) {
        // not global scope
    })($elem);
}
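
One workaround that does run on PHP 5.3 is to hand the closure to call_user_func(), which invokes it immediately. (The parenthesized call syntax only became valid in PHP 7.0.) The values below are made up for illustration.

```php
<?php
$global_var = 'foo';
$seen = array();

foreach (array(1, 2) as $elem) {
    // call_user_func() invokes the closure right away, on PHP 5.3 as well.
    $seen[] = call_user_func(function ($elem) use ($global_var) {
        $local = $global_var . $elem; // local to the closure
        return $local;
    }, $elem);
}

// $local was never defined in the enclosing scope.
$localLeaked = isset($local);
```

This gives each iteration its own scope at the cost of a function call, which is essentially what walk() would package up.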

Rejected Features

None for the moment

Similar implementations

JavaScript 1.8:

Python: