


Allow non-scalar keys in foreach

  • Version: 1.0
  • Date: 2013-01-28
  • Authors: Levi Morrison levim@php.net, Nikita Popov nikic@php.net
  • Status: Voting
  • Target version: PHP 5.5

Current situation

The Iterator::key function can currently return a value of any type, but the handling code in foreach and several other places only allows integer and string keys. This limitation makes some use cases unnecessarily complicated. Two examples from the SPL are MultipleIterator and SplObjectStorage.

The MultipleIterator allows you to traverse several Iterators at the same time. Its ::current method returns an array of values and its ::key method returns an array of keys. But due to the foreach key type limitation the keys cannot be fetched directly:

$it = new MultipleIterator;
$it->attachIterator($it1);
$it->attachIterator($it2);
 
// This is NOT possible
foreach ($it as $keys => $values) {
    // ...
}
 
// Instead you have to use this
foreach ($it as $values) {
    $keys = $it->key();
 
    // ...
}

SplObjectStorage is a map/set implementation for object keys. Here the issue is circumvented by returning the keys as values and requiring a manual lookup on the values:

// NOT possible
foreach ($objectStore as $key => $value) {
    // ...
}
 
// Instead you have to use
foreach ($objectStore as $key) {
    $value = $objectStore[$key];
 
    // ...
}

These are just two examples from core classes, but the same limitation applies in many other cases (and now that we have generators, it will probably become an even larger issue).

Another key issue is that you can't really work around this generically. If you want to write code that is also compatible with Iterators that return array/object keys, you can no longer use the foreach ($it as $k => $v) syntax. You are forced to use foreach ($it as $v) { $k = $it->key(); ... }, but this only works with Iterators and not with aggregates, other Traversables or normal arrays. In order to properly support all use cases you'd have to wrap everything in iterators (i.e. make extensive use of IteratorIterator and ArrayIterator), which is an option, but cumbersome to a degree that nobody does it. As a result, iterators like MultipleIterator are largely excluded from use in iterator chaining/pipelines (which is probably the most important thing about using iterators).
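To make the cost of that workaround concrete, here is a minimal sketch of the wrapping it requires. The helper names to_iterator and each_with_key are hypothetical and not part of PHP or of this proposal; only ArrayIterator, IteratorIterator and Iterator::key() are existing API.

```php
<?php
// Hypothetical helper: normalize arrays, aggregates and iterators to an Iterator.
function to_iterator($input) {
    if (is_array($input)) {
        return new ArrayIterator($input);
    }
    if ($input instanceof Iterator) {
        return $input;
    }
    // IteratorAggregate or any other Traversable
    return new IteratorIterator($input);
}

// Hypothetical helper: iterate over any input and pass the key, whatever its
// type, to the callback by asking the iterator for it explicitly.
function each_with_key($input, $callback) {
    $it = to_iterator($input);
    foreach ($it as $value) {
        $callback($it->key(), $value);
    }
}

$seen = array();
$collect = function ($key, $value) use (&$seen) {
    $seen[] = array($key, $value);
};

each_with_key(array('a' => 1), $collect);                  // plain array
each_with_key(new ArrayObject(array('b' => 2)), $collect); // IteratorAggregate

$multi = new MultipleIterator();
$multi->attachIterator(new ArrayIterator(array('x' => 10)));
$multi->attachIterator(new ArrayIterator(array('y' => 20)));
each_with_key($multi, $collect);                           // array keys
```

Every consumer in a pipeline would need to go through helpers like these instead of a plain foreach, which is the overhead described above.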

Suggested fix

This RFC proposes to lift the restriction and allow values of arbitrary types (in particular arrays and objects) to be used as keys in iterators. (Note: This proposal does not suggest allowing those key types in arrays. This is only about Iterators.)
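As an illustration of what lifting the restriction would permit, the following sketch assumes a PHP build that includes the proposed change (the RFC targets PHP 5.5). The sample data and variable names are made up for the example.

```php
<?php
$it = new MultipleIterator();
$it->attachIterator(new ArrayIterator(array('a' => 1, 'b' => 2)));
$it->attachIterator(new ArrayIterator(array('x' => 10, 'y' => 20)));

$pairs = array();
// With the restriction lifted, the array returned by MultipleIterator::key()
// can be received directly as the foreach key.
foreach ($it as $keys => $values) {
    $pairs[] = array($keys, $values);
}
// First iteration: $keys is array('a', 'x') and $values is array(1, 10)
```

Because the key arrives through the ordinary foreach syntax, the same loop body would also work unchanged for arrays, aggregates and generators.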

In order to remove this restriction the internal zend_object_iterator_funcs API has to be changed:

// This entry:
int (*get_current_key)(zend_object_iterator *iter, char **str_key, uint *str_key_len, ulong *int_key TSRMLS_DC);
// Is replaced with this entry:
zval *(*get_current_key)(zend_object_iterator *iter TSRMLS_DC);

The handler will return a zval* whose refcount has already been incremented. The zval* must not be NULL, unless an exception was thrown.

The signature can use zval* instead of zval** because by-ref modification of keys is not possible. The refcount is increased in the handler (instead of the VM) as that turned out to be the more convenient method for this use case.

iterator_to_array()

When using non-string/int keys iterator_to_array with the $preserve_keys option will behave in the same way as PHP would when it does normal array key assignments, i.e. its behavior would be the same as the following PHP snippet:

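// Illustrative only: the real iterator_to_array() is a built-in function
function iterator_to_array($iter) {
    $array = array(); // initialize so that an empty iterator yields an empty array
    foreach ($iter as $k => $v) {
        $array[$k] = $v;
    }
    return $array;
}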

For array and object keys this would give an Illegal offset type warning. For NULL the empty string "" is used as the key; doubles are truncated to their integral part; resources use their resource ID and issue a warning; booleans are cast to integers.
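A small sketch of the NULL and boolean conversions described above, assuming a PHP build with the proposed change and generator support. The generator name mixed_keys and its values are hypothetical.

```php
<?php
// Hypothetical generator that yields keys which are neither integers nor strings.
function mixed_keys() {
    yield null  => 'from null';
    yield true  => 'from true';
    yield false => 'from false';
}

// Second argument: $preserve_keys = true
$result = iterator_to_array(mixed_keys(), true);

// The keys are converted as in a normal array assignment:
// null becomes "", true becomes 1, false becomes 0.
var_dump(array_keys($result));
```

Doubles, resources, arrays and objects are left out here on purpose, since the diagnostics they trigger may differ between PHP versions.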

In order to support this a new function is added in Zend/zend_API.h (which more or less reimplements the internal inline function zend_fetch_dimension_address_inner):

/* The refcount of value is incremented by the function itself */
ZEND_API int array_set_zval_key(HashTable *ht, zval *key, zval *value);

Patch

A preliminary patch implementing the above proposal can be found here: https://github.com/php/php-src/pull/278

The change itself is rather small, but there are quite a few extensions that require minor adjustments to use the new API.

Vote

Voting ends on March 6th. A 50% + 1 majority is required. This RFC targets PHP 5.5.

Remove type-restrictions on foreach keys?
Real name                        Yes   No
ab (ab)                           x
arpad (arpad)                     x
ashnazg (ashnazg)                 x
colder (colder)                   x
datibbaw (datibbaw)               x
dm (dm)                           x
frozenfire (frozenfire)           x
gwynne (gwynne)                   x
indeyets (indeyets)               x
ircmaxell (ircmaxell)             x
jpauli (jpauli)                   x
juliens (juliens)                 x
jwage (jwage)                     x
kalle (kalle)                     x
levim (levim)                     x
nikic (nikic)                     x
pajoye (pajoye)                   x
patrickallaert (patrickallaert)   x
peehaa (peehaa)                   x
treffynnon (treffynnon)           x
willfitch (willfitch)             x
Count:                            21    0
rfc/foreach-non-scalar-keys.1361993484.txt.gz · Last modified: 2013/02/27 20:31 by nikic