rfc:deque

This is an old revision of the document!


PHP RFC: final class Collections\Deque

Introduction

While SplDoublyLinkedList and its subclass SplQueue/SplStack exist in the SPL, they have several drawbacks (that are addressed by this RFC to add a Deque class):

  1. SplDoublyLinkedList is internally represented by a doubly linked list, making it use roughly twice as much memory as the proposed Deque (in typical 64-bit php builds)
  2. SplDoublyLinkedList is an Iterator, not an IteratorAggregate, so different parts of the code can't independently iterate over the same SplDoublyLinkedList instance while iteration is in progress.
  3. push/pop/unshift/shift from SplDoublyLinkedList are slower due to needing to allocate or free the linked list nodes.
  4. Reading values in the middle of the SplDoublyLinkedList is proportional to the length of the list, due to needing to traverse the linked list nodes.
  5. foreach iteration behavior cannot be understood without knowing what most recently constructed the SplDoublyLinkedList instance or set the flags.

It would be useful to have an efficient Deque container in the standard library to provide an alternative without those drawbacks, as well as for the following reasons:

  1. To provide a better alternative to SplDoublyLinkedList, SplStack, and SplQueue for use cases that require stacks or queues.
  2. As a more efficient option than array and SplDoublyLinkedList as a queue or Deque, for workflows requiring shift or unshift. Repeatedly shifting/unshifting from the front of an array is slow, because each shift/unshift requires moving every element of the array.
  3. To be able to reclaim memory in applications that run for long periods of time. Notably, PHP's array type will currently never release allocated capacity.

A Deque is more efficient than an array when used as a queue, more readable, and easier to use correctly. While it is possible to efficiently remove elements from the start of an array (in terms of insertion order), it is very inefficient to prepend elements to the start of a large array due to needing to either copy the array or move all elements in the internal array representation, and an array would use much more memory than a Deque when used that way (and be slower).

There are also several pitfalls to using an array as a queue for larger queue sizes, some of which are not obvious and discovered while writing the benchmarks. (Having a better (double-ended) queue datastructure (Collections\Deque) than the SplDoublyLinkedList would save users from needing to write code with these pitfalls):

  1. array_key_first() takes time proportional to the number of elements unset from the start of an array, causing it to unexpectedly be extremely slow (linear to remove an element, and quadratic time to remove all elements) after unsetting many elements at the start of the queue. (when the array infrequently runs out of capacity, buckets are moved to the beginning of the array)
  2. reset() also takes time proportional to the number of elements unset from the start of an array, taking longer if many elements are removed from the beginning of the array.
  3. reset() or end() will convert a variable to a reference, adding some performance overhead and making opcache less efficient at optimizing uses of variables using references.
  4. More obviously, array_unshift and array_shift will take time proportional to the number of elements in the array (to reindex and move existing/remaining elements) (linear time), compared to amortized constant time for Collections\Deque.

Proposal

This proposes adding the class final class Collections\Deque to PHP. From the Wikipedia article for the Double-Ended Queue:

In computer science, a double-ended queue (abbreviated to deque, pronounced deck, like “cheque”) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail).

(omitted sections, see article)

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

  1. Storing deque contents in a circular buffer, and only resizing when the buffer becomes full. This decreases the frequency of resizings.
  2. (omitted, see article)

Similarly to deques in other languages (such as Python collections.deque, C++ std::deque, Rust std::collections::VecDeque, etc.) this is backed by a memory-efficient representation with good cache locality and provides constant amortized-time push/pop/shift/unshift operations. (this RFC's implementation is a circular buffer represented as a raw C array of values with a size, offset, and capacity)

This reclaims memory that is no longer used (when less than a quarter of the capacity is used), unlike array

Similarly to ArrayObject and SplDoublyLinkedList, the $x[] = $value shorthand for appending to the Deque is supported in addition to ArrayAccess functionality.

The proposed API is:

namespace Collections;
 
/**
 * A double-ended queue (Typically abbreviated as Deque, pronounced "deck", like "cheque")
 * represented internally as a circular buffer.
 *
 * This has much lower memory usage than SplDoublyLinkedList or its subclasses (SplStack, SplStack),
 * and operations are significantly faster than SplDoublyLinkedList.
 *
 * See https://en.wikipedia.org/wiki/Double-ended_queue
 *
 * This supports amortized constant time pushing and popping onto the start (i.e. start, first)
 * or back (i.e. end, last) of the Deque.
 *
 * Method naming is based on https://www.php.net/spldoublylinkedlist
 * and on array_push/pop/unshift/shift/ and array_key_first/array_key_last.
 */
final class Deque implements IteratorAggregate, Countable, JsonSerializable, ArrayAccess
{
    /** Construct the Deque from the values of the Traversable/array, ignoring keys */
    public function __construct(iterable $iterator = []) {}
    /**
     * Returns an iterator that accounts for calls to shift/unshift tracking the position of the start of the Deque.
     * Calls to shift/unshift will do the following:
     * - Increase/Decrease the value returned by the iterator's key()
     *   by the number of elements added/removed to/from the start of the Deque.
     *   (`$deque[$iteratorKey] === $iteratorValue` at the time the key and value are returned).
     * - Repeated calls to shift will cause valid() to return false if the iterator's
     *   position ends up before the start of the Deque at the time iteration resumes.
     * - They will not cause the remaining values to be iterated over more than once or skipped.
     */
    public function getIterator(): \InternalIterator {}
    /** Returns the number of elements in the Deque. */
    public function count(): int {}
    /** Returns true if there are 0 elements in the Deque. */
    public function isEmpty(): bool {}
    /** Removes all elements from the Deque. */
    public function clear(): void {}
 
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
    /** Construct the Deque from the values of the array, ignoring keys */
    public static function __set_state(array $array): Deque {}
 
    /** Appends value(s) to the end of the Deque. */
    public function push(mixed ...$values): void {}
    /** Prepends value(s) to the start of the Deque. */
    public function unshift(mixed ...$values): void {}
    /**
     * Pops a value from the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function pop(): mixed {}
    /**
     * Shifts a value from the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function shift(): mixed {}
 
    /**
     * Peeks at the value at the start of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function first(): mixed {}
    /**
     * Peeks at the value at the end of the Deque.
     * @throws \UnderflowException if the Deque is empty
     */
    public function last(): mixed {}
 
    /**
     * Returns a list of the elements from the start to the end.
     */
    public function toArray(): array {}
 
    // Must be mixed for compatibility with ArrayAccess
    /**
     * Returns the value at offset (int)$offset (relative to the start of the Deque)
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this vector
     */
    public function offsetGet(mixed $offset): mixed {}
    /**
     * Returns true if `0 <= (int)$offset && (int)$offset < $this->count().
     */
    public function offsetExists(mixed $offset): bool {}
    /**
     * Sets the value at offset $offset (relative to the start of the Deque) to $value
     * @throws \OutOfBoundsException if the value of (int)$offset is not within the bounds of this vector
     */
    public function offsetSet(mixed $offset, mixed $value): void {}
    /**
     * @throws \RuntimeException unconditionally because unset and null are different things, unlike SplFixedArray
     */
    public function offsetUnset(mixed $offset): void {}
 
    /**
     * This is JSON serialized as a JSON array with elements from the start to the end.
     */
    public function jsonSerialize(): array {}
}

Iteration behavior

Collections\Deque is an IteratorAggregate where getIterator returns an InternalIterator. Different parts of the code can independently iterate through instances of the same Deque (e.g. nested for loops).

  • The value is the value found in the Deque at the time InternalIterator::current() is called. (or null if the iterator is invalid). (E.g. if you mutate the deque, you will see the new value when iterations such as foreach reach that value)
  • The key is the offset in the Deque at which the value would be found at the time InternalIterator::key() is called. (or null if the iterator is invalid)
  • shift or unshift changing the position of the start of the deque are accounted for - offsets in the deque won't be skipped or processed multiple times.
  • InternalIterator::valid() returns true if the InternalIterator's offset (computed from relative position to the start of the Deque) is within the Deque.
  • InternalIterator::next() unconditionally advances the InternalIterator's position.

The Deque WebAssembly demo can be used to preview the behavior of iteration and other functions

<?php
$deque = new Collections\Deque(['first', 'second', 'third']);
foreach ($deque as $offset => $value) {
    printf("value: %6s deque[offset=%s]=%s\n", $value, $offset, $deque[$offset]);
    if ($value === 'second') {
        printf("=> Shifting while iterating: removed %s\n", $deque->shift());
    } elseif ($value === 'third') {
        printf("=> Push while iterating\n");
        $deque->push('fourth');
    }
}
echo "\nAfter iteration\n";
var_dump($deque);
/*
value:  first deque[offset=0]=first
value: second deque[offset=1]=second
=> Shifting while iterating: removed first
value:  third deque[offset=1]=third
=> Push while iterating
value: fourth deque[offset=2]=fourth
 
After iteration
object(Collections\Deque)#1 (3) {
  [0]=>
  string(6) "second"
  [1]=>
  string(5) "third"
  [2]=>
  string(6) "fourth"
}
*/

Arguments for adding this

What applications would benefit from Deque?

As mentioned in https://en.wikipedia.org/wiki/Queue_(abstract_data_type)

Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer. Another usage of queues is in the implementation of breadth-first search.

Some examples of how queues are used in php applications and libraries are below:

  1. Composer: https://github.com/composer/composer/blob/2.1.8/src/Composer/DependencyResolver/RuleSetGenerator.php#L157-L203 uses a SplQueue internally as a work queue in its dependency resolution logic.
  2. Event-Driven frameworks: https://github.com/reactphp/event-loop/blob/v1.2.0/src/Tick/FutureTickQueue.php uses SplQueue internally to hold callback functions to call later in the order they were added.
  3. Networking code: Predis, a userland client for the Redis database, currently uses SplQueue to build a pipeline of commands and associate pipelined outgoing requests with their incoming responses in the order the requests were sent. https://github.com/predis/predis/blob/v1.1.7/src/Pipeline/Pipeline.php#L121-L150

Once applications and libraries such as those were able to use Collections\Deque (e.g. when the minimum PHP version requirement was raised), they would benefit from the speedup and reduced memory that Collections\Deque has over SplDoublyLinkedList/SplQueue.

Looking at the benchmarks section, SplQueue(and the parent class SplDoublyLinkedList) is typically slower than array (though array has higher reported memory usage). By introducing a data structure (Deque) that's even faster and uses less memory than an array for use as a double-ended queue, even more applications would benefit from it.

  • Additionally, Deque would allow some code to be refactored or written in a more readable way in cases where php developers would previously be avoiding using SplDoublyLinkedList (or SplStack/SplQueue) due to older datastructures having time and memory usage inefficiencies.

https://www.php.net/manual/en/class.splqueue.php#116422 is another example

You can use shift/unshift and push/pop to dequeue/undequeue and queue/unqueue respectively. Really handy for those applications that use sockets where you might not know you can't send data until you attempt to.

for example, this is a function for an application that will un-dequeue the remainder of the data if socket_write indicated it did not write the entire contents of the provided data.

(omitted code snippet from https://www.php.net/manual/en/class.splqueue.php#116422)

Even users that don't write code using Deque may still benefit from its inclusion, due to improved performance and reduced memory usage in the applications and libraries that do adopt it.

Why use this instead of SplDoublyLinkedList?

  1. Pushing/popping/shifting/unshifting takes roughly 30% less time.
  2. Deque typically uses less memory than SplDoublyLinkedList.
  3. The iteration order is predictable, unlike SplDoublyLinkedList, where it can be modified with flags
  4. Constant-time access for reading or modifying elements at any position in the Deque, unlike SplDoublyLinkedList where that would require walking the linked list.

Why use this instead of array?

  1. Faster than using array for queue-like workloads (e.g. around 13%-37% less time to add to the end and remove from the start depending on Deque size and access pattern in an array implementation that correctly avoids linear-time operations).
  2. It is impossible to prepend to an array (i.e. to be first in insertion order) in constant time. array_unshift takes time proportional to the length of an array. See Benchmarks: Unshifting n values to the start then shifting n values from the start
  3. Deque uses much less memory than an array when used as a queue (by unsetting), especially since that will eventually convert an array to an associative array. See https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html and benchmarks. Note that starting in php 8.2, arrays that are lists (with no/few gaps) are represented in a more memory efficient way than associative arrays.
  4. Modifying an object by value can be faster than modifying an array by reference. Sometime, it is necessary for a library to modify a passed in collection/array in place (e.g. as an internal implementation detail) (e.g. appending to a list of errors and returning a boolean, converting a binary tree to a list of nodes where some predicate applies). The only way for a method to modify a passed in array parameter is by reference, and opcache is not able to optimize reference parameters/variables because their type at runtime cannot be guaranteed, passing in an object by value can be much faster than passing an array by reference.

    (Other approaches may end up copying arrays repeatedly, and take quadratic time instead of linear time, causing performance issues for large inputs).

    (Right now, some libraries do pass arrays by reference, because of how inefficient SplDoublyLinkedList is expected to be to create, modify, and read)

Also see the RFC introduction for pitfalls with trying to use a PHP array like a queue.

Caveats

For stack-like workflows that only require pushing and popping from the end of a data structure, array uses roughly the same amount of memory as Deque and is faster than Deque due to Use more compact representation for packed arrays. being merged into PHP 8.2

Usage

Examples of how each of these methods in Deque can be used can be found at https://www.php.net/spldoublylinkedlist - in most cases this can be used as a more efficient substitute for SplDoublyLinkedList (excluding relatively rare use cases such as removing from the middle of a SplDoublyLinkedList during iteration).

See Implementation for the public API of Deque.

Implementation Choices

"Collections\" namespace

This name was chosen in https://wiki.php.net/rfc/deque_straw_poll#arguments_foragainst_collections_deque - this has a different enough implementation from SplQueue/SplDoublyLinkedList and other data structures in the spl that a different naming pattern or namespace made sense for future additions.

There has also been widespread interest in adopting namespaces for new modules/categories of functionality. https://wiki.php.net/rfc/namespaces_in_bundled_extensions recently passed and allows for using namespaces in new categories of functionality.

Accepting an iterable

This accepts the values of the iterable in the order of iteration. Keys of the iterable are ignored (Because this is meant to be a double-ended queue, adding placeholders would not make sense if there were gaps in the array.)

Final Class

If this were overridable, this would have the following drawbacks:

  1. There would not be as strong guarantees to readers of PHP code using Deque (or even opcache, if optimizations were added targeting objects) that elements were actually a vector or that certain methods would/wouldn't throw certain exceptions, or that iteration would be possible. (if getIterator, ArrayAccess methods, etc. were overridable)
  2. More memory and runtime checks would be required to check if this was the original class or a subclass when fetching a value
  3. More memory would be required to mitigate any unexpected issues seen in the way subclasses overrid or used the base class.
  4. There would be a larger chance the implementation would have discovered or undiscovered bugs due to userland subclasses of Deque, in serialization/unserialization, reads or writes, resizing, future functionality added to PHP, opcache (if opcache optimizations were added), or future methods added to Deque, or causes that were not even thought of yet.

Making all functionality final turns out to be the same approach that the PECL project https://github.com/php-ds/ext-ds used for its datastructures. The php-ds's maintainers reasons are mentioned in https://medium.com/@rtheunissen/efficient-data-structures-for-php-7-9dda7af674cd and summarized or quoted below

  • “Prefer composition over inheritance”
  • “Inheritance would also introduce unnecessary internal complexity”
  • Avoid multiple methods doing the same thing (e.g. SplDeque has a redundant enqueue methods to add to the end of an array because the base class already had push())
  • “They are designed to be self-contained, much like an array. You can't extend an array, so we design our own APIs around it by using an internal array to store the actual data.”

push/pop/shift/unshift (and first()/last())

This is consistent with the name used for array_push()/array_pop()/array_shift()/array_unshift()/array_key_first()/array_key_last(), as well as names used for SplDoublyLinkedList

Backward Incompatible Changes

The class name \Collections\Deque is now used by PHP, and it will be a compilation error to declare classlikes of the same name in the namespace Collections since the class already exists.

Proposed PHP Version(s)

8.2

RFC Impact

To Opcache

None

Unaffected PHP Functionality

PHP's type system remains unchanged (e.g. array) - final class Collections\Deque is a class, and instances of Collections\Deque are ordinary objects.

Benchmarks

Unshifting n values to the start then shifting n values from the start

This benchmark uses array_unshift/array_shift to test the most natural way of removing from the start of an array, not the most optimized. Subsequent benchmarks use a less obvious but more time-optimized way to remove elements from the start of an array. key() and do not renumber for removing from the start of an array.

This takes quadratic time for an array - at n=1000, array takes 15.030 seconds, and Collections\Deque takes 0.126 seconds for the 2000 iterations (Deque is 120 times faster for n=1000).

Test creating a collection, then two cycles of push+shift (unshifting n values to the start of collections then shifting all of them from front of collection)
Moving the front of an array requires moving all elements to their new position, so the total time for array is quadratic
Results for php 8.2.0-dev debug=false with opcache enabled=true

2x Unshift then shift from start of array:               n=       1 iterations= 8000000
=> max memory=     248 bytes, create+destroy time=1.560 shift time = 0.916 total time = 2.475 result=0
2x Unshift then shift from start of Collections\Deque:   n=       1 iterations= 8000000
=> max memory=     144 bytes, create+destroy time=1.516 shift time = 0.801 total time = 2.316 result=0
2x Unshift then shift from start of Ds\Deque:            n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=2.245 shift time = 0.916 total time = 3.161 result=0
2x Unshift then shift from start of SplDoublyLinkedList: n=       1 iterations= 8000000
=> max memory=     184 bytes, create+destroy time=1.893 shift time = 0.905 total time = 2.798 result=0

2x Unshift then shift from start of array:               n=       4 iterations= 2000000
=> max memory=     248 bytes, create+destroy time=0.938 shift time = 0.631 total time = 1.569 result=24000000
2x Unshift then shift from start of Collections\Deque:   n=       4 iterations= 2000000
=> max memory=     144 bytes, create+destroy time=0.578 shift time = 0.500 total time = 1.078 result=24000000
2x Unshift then shift from start of Ds\Deque:            n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=1.069 shift time = 0.528 total time = 1.598 result=24000000
2x Unshift then shift from start of SplDoublyLinkedList: n=       4 iterations= 2000000
=> max memory=     280 bytes, create+destroy time=0.927 shift time = 0.513 total time = 1.440 result=24000000

2x Unshift then shift from start of array:               n=       8 iterations= 1000000
=> max memory=     248 bytes, create+destroy time=0.942 shift time = 0.595 total time = 1.537 result=56000000
2x Unshift then shift from start of Collections\Deque:   n=       8 iterations= 1000000
=> max memory=     208 bytes, create+destroy time=0.376 shift time = 0.444 total time = 0.820 result=56000000
2x Unshift then shift from start of Ds\Deque:            n=       8 iterations= 1000000
=> max memory=     216 bytes, create+destroy time=0.797 shift time = 0.469 total time = 1.266 result=56000000
2x Unshift then shift from start of SplDoublyLinkedList: n=       8 iterations= 1000000
=> max memory=     408 bytes, create+destroy time=0.754 shift time = 0.472 total time = 1.226 result=56000000

2x Unshift then shift from start of array:               n=      64 iterations=  100000
=> max memory=    1368 bytes, create+destroy time=2.628 shift time = 1.093 total time = 3.721 result=403200000
2x Unshift then shift from start of Collections\Deque:   n=      64 iterations=  100000
=> max memory=    1104 bytes, create+destroy time=0.167 shift time = 0.292 total time = 0.459 result=403200000
2x Unshift then shift from start of Ds\Deque:            n=      64 iterations=  100000
=> max memory=    1112 bytes, create+destroy time=0.507 shift time = 0.334 total time = 0.841 result=403200000
2x Unshift then shift from start of SplDoublyLinkedList: n=      64 iterations=  100000
=> max memory=    2200 bytes, create+destroy time=0.448 shift time = 0.307 total time = 0.755 result=403200000

2x Unshift then shift from start of array:               n=     128 iterations=   50000
=> max memory=    2648 bytes, create+destroy time=4.772 shift time = 1.790 total time = 6.562 result=812800000
2x Unshift then shift from start of Collections\Deque:   n=     128 iterations=   50000
=> max memory=    2128 bytes, create+destroy time=0.161 shift time = 0.290 total time = 0.451 result=812800000
2x Unshift then shift from start of Ds\Deque:            n=     128 iterations=   50000
=> max memory=    2136 bytes, create+destroy time=0.497 shift time = 0.325 total time = 0.822 result=812800000
2x Unshift then shift from start of SplDoublyLinkedList: n=     128 iterations=   50000
=> max memory=    4248 bytes, create+destroy time=0.440 shift time = 0.302 total time = 0.743 result=812800000

2x Unshift then shift from start of array:               n=    1024 iterations=    2000
=> max memory=   20568 bytes, create+destroy time=11.368 shift time = 3.662 total time = 15.030 result=2095104000
2x Unshift then shift from start of Collections\Deque:   n=    1024 iterations=    2000
=> max memory=   16464 bytes, create+destroy time=0.042 shift time = 0.084 total time = 0.126 result=2095104000
2x Unshift then shift from start of Ds\Deque:            n=    1024 iterations=    2000
=> max memory=   16472 bytes, create+destroy time=0.145 shift time = 0.095 total time = 0.240 result=2095104000
2x Unshift then shift from start of SplDoublyLinkedList: n=    1024 iterations=    2000
=> max memory=   32920 bytes, create+destroy time=0.137 shift time = 0.093 total time = 0.230 result=2095104000

****** Skip array n=1048576 iterations=20 quadratic time too slow ******
2x Unshift then shift from start of Collections\Deque:   n= 1048576 iterations=      20
=> max memory=16777320 bytes, create+destroy time=0.835 shift time = 0.954 total time = 1.789 result=21990211584000
2x Unshift then shift from start of Ds\Deque:            n= 1048576 iterations=      20
=> max memory=16777328 bytes, create+destroy time=1.891 shift time = 0.990 total time = 2.881 result=21990211584000
2x Unshift then shift from start of SplDoublyLinkedList: n= 1048576 iterations=      20
=> max memory=33554584 bytes, create+destroy time=1.465 shift time = 0.978 total time = 2.443 result=21990211584000

Two cycles of appending n values then shifting them from the start

Note that it is possible to have constant time removal from the start of a PHP array efficiently (as long as key stays at the start of the array), but it is not possible to have constant time prepending (`unshift`) to the start of an array. array_unshift is a linear time operation (takes time proportional to the current array size). So unshift is not included in these benchmarks.

Because there's a second cycle in this benchmark, array becomes an associative array and uses more memory than a packed array (https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html). (At the time of writing, a packed array uses double the memory of a Deque, though there is an unrelated change in review proposing reducing the memory usage of packed arrays).

memory_get_usage is not counting the memory overhead of tracking the allocations of a lot of small objects, so the memory usage of SplDoublyLinkedList is under-reported. SplQueue is a subclass of SplDoublyLinkedList and I expect it would have the same performance.

Deque and array both always capacities that are powers of 2. This benchmark tests the best-case memory usage for Deque and array

Deque is faster than SplDoublyLinkedList at all sizes, and faster than array at medium and large sizes. The maximum memory usage is also noticeably lower than both array and SplDoublyLinkedList

Caveats for comparison with [https://github.com/php-ds/ext-ds](Ds\Deque PECL):

  • Ds\Deque was compiled and installed as a shared extension with phpize; ./configure; make install (shared reflects the default way to install PECL modules, e.g. what pecl install or an OS package manager would do or what copying the windows DLL would do), not statically compiled into php
  • Both Deque and Ds\Deque use circular buffers, so the performance is expected to be about the same.
  • This php-src PR has micro-optimizations such as storing the mask (instead of capacity) for faster bitwise ands, as well as the use of the EXPECTED/UNEXPECTED macros to hint the fast path to the compiler.
  • Having the extra step of installing a PECL may discourage application/library authors from writing code using a PECL or discourage some users from using those applications/libraries. While php-ds has a polyfill, the polyfill would be slower than native solutions - https://github.com/php-ds/ext-ds#compatibility
Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from the start of the collection)
Results for php 8.2.0-dev debug=false with opcache enabled=true
 
2x Push then shift from array:               n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=1.125 shift time = 1.017 total time = 2.142 result=0
2x Push then shift from Collections\Deque:   n=       1 iterations= 8000000
=> max memory=     144 bytes, create+destroy time=1.567 shift time = 0.867 total time = 2.434 result=0
2x Push then shift from Ds\Deque:            n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=1.904 shift time = 0.969 total time = 2.873 result=0
2x Push then shift from SplDoublyLinkedList: n=       1 iterations= 8000000
=> max memory=     184 bytes, create+destroy time=1.988 shift time = 0.941 total time = 2.929 result=0
 
2x Push then shift from array:               n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.394 shift time = 0.671 total time = 1.065 result=24000000
2x Push then shift from Collections\Deque:   n=       4 iterations= 2000000
=> max memory=     144 bytes, create+destroy time=0.509 shift time = 0.458 total time = 0.967 result=24000000
2x Push then shift from Ds\Deque:            n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.611 shift time = 0.528 total time = 1.139 result=24000000
2x Push then shift from SplDoublyLinkedList: n=       4 iterations= 2000000
=> max memory=     280 bytes, create+destroy time=0.993 shift time = 0.510 total time = 1.503 result=24000000
 
2x Push then shift from array:               n=       8 iterations= 1000000
=> max memory=     376 bytes, create+destroy time=0.293 shift time = 0.611 total time = 0.904 result=56000000
2x Push then shift from Collections\Deque:   n=       8 iterations= 1000000
=> max memory=     208 bytes, create+destroy time=0.387 shift time = 0.471 total time = 0.859 result=56000000
2x Push then shift from Ds\Deque:            n=       8 iterations= 1000000
=> max memory=     216 bytes, create+destroy time=0.410 shift time = 0.486 total time = 0.896 result=56000000
2x Push then shift from SplDoublyLinkedList: n=       8 iterations= 1000000
=> max memory=     408 bytes, create+destroy time=0.784 shift time = 0.470 total time = 1.253 result=56000000
 
2x Push then shift from array:               n=    1024 iterations=   20000
=> max memory=   41016 bytes, create+destroy time=0.430 shift time = 1.380 total time = 1.810 result=20951040000
2x Push then shift from Collections\Deque:   n=    1024 iterations=   20000
=> max memory=   16464 bytes, create+destroy time=0.428 shift time = 0.902 total time = 1.329 result=20951040000
2x Push then shift from Ds\Deque:            n=    1024 iterations=   20000
=> max memory=   16472 bytes, create+destroy time=0.494 shift time = 1.102 total time = 1.596 result=20951040000
2x Push then shift from SplDoublyLinkedList: n=    1024 iterations=   20000
=> max memory=   32920 bytes, create+destroy time=1.504 shift time = 0.959 total time = 2.463 result=20951040000
 
2x Push then shift from array:               n= 1048576 iterations=      20
=> max memory=41943120 bytes, create+destroy time=0.889 shift time = 1.420 total time = 2.309 result=21990211584000
2x Push then shift from Collections\Deque:   n= 1048576 iterations=      20
=> max memory=16777320 bytes, create+destroy time=0.856 shift time = 1.017 total time = 1.873 result=21990211584000
2x Push then shift from Ds\Deque:            n= 1048576 iterations=      20
=> max memory=16777328 bytes, create+destroy time=0.889 shift time = 1.023 total time = 1.913 result=21990211584000
2x Push then shift from SplDoublyLinkedList: n= 1048576 iterations=      20
=> max memory=33554584 bytes, create+destroy time=1.572 shift time = 1.006 total time = 2.578 result=21990211584000

Only appending to a Deque and reading elements without removal

Note that the proposed Deque as well as the existing SplDoublyLinkedList/SplStack are expected to perform equally well at shifting to (adding to) or unshifting from(removing from) the start(first elements) of an array. (Shifting or unshifting (or unsetting) the start is inefficient for arrays)

For this benchmark, Deque can be created and read from faster than the fastest methods of reading SplStack or SplFixedArray.

  • Note that a foreach is used instead of random access for SplStack, since SplStack random access time is proportional to the number of linked list nodes that need to be iterated over to read a value.
  • SplFixedArray would be faster to append to if it had a push() method, but there isn't one.

Major notes:

  • Collections\Deque and SplFixedArray use less memory than other available options.
  • Collections\Deque is faster than object data structures currently available in the SPL.
  • In php 8.2+, array is faster than all object data structures, and memory usage of large packed arrays is half of what would be used in 8.1 due to https://github.com/php/php-src/pull/7491
Results for php 8.2.0-dev debug=false with opcache enabled=true
 
Appending to array:         n=       1 iterations= 8000000 memory=     216 bytes, create+destroy time=0.612 read time = 0.303 result=0
Appending to Deque:         n=       1 iterations= 8000000 memory=     144 bytes, create+destroy time=1.009 read time = 0.365 result=0
Appending to SplStack:      n=       1 iterations= 8000000 memory=     184 bytes, create+destroy time=1.735 read time = 0.725 result=0
Appending to SplFixedArray: n=       1 iterations= 8000000 memory=      80 bytes, create+destroy time=1.816 read time = 0.438 result=0
 
 
Appending to array:         n=       4 iterations= 2000000 memory=     216 bytes, create+destroy time=0.219 read time = 0.119 result=12000000
Appending to Deque:         n=       4 iterations= 2000000 memory=     144 bytes, create+destroy time=0.340 read time = 0.176 result=12000000
Appending to SplStack:      n=       4 iterations= 2000000 memory=     280 bytes, create+destroy time=0.732 read time = 0.295 result=12000000
Appending to SplFixedArray: n=       4 iterations= 2000000 memory=     128 bytes, create+destroy time=1.186 read time = 0.246 result=12000000
 
 
Appending to array:         n=       8 iterations= 1000000 memory=     216 bytes, create+destroy time=0.147 read time = 0.086 result=28000000
Appending to Deque:         n=       8 iterations= 1000000 memory=     208 bytes, create+destroy time=0.244 read time = 0.163 result=28000000
Appending to SplStack:      n=       8 iterations= 1000000 memory=     408 bytes, create+destroy time=0.529 read time = 0.235 result=28000000
Appending to SplFixedArray: n=       8 iterations= 1000000 memory=     192 bytes, create+destroy time=1.057 read time = 0.222 result=28000000
 
 
Appending to array:         n= 1048576 iterations=      20 memory=16781392 bytes, create+destroy time=0.437 read time = 0.157 result=10995105792000
Appending to Deque:         n= 1048576 iterations=      20 memory=16777320 bytes, create+destroy time=0.528 read time = 0.305 result=10995105792000
Appending to SplStack:      n= 1048576 iterations=      20 memory=33554584 bytes, create+destroy time=0.884 read time = 0.401 result=10995105792000
Appending to SplFixedArray: n= 1048576 iterations=      20 memory=16777304 bytes, create+destroy time=2.575 read time = 0.428 result=10995105792000

Future Scope

If Collections\Deque is added, there would be plenty of time for myself or others to propose and discuss additional methods before PHP 8.2's feature freeze (probably in July 2022) (e.g. insert(int $offset, mixed ...$values), remove(int $offset))

Future additions to https://github.com/TysonAndre/pecl-teds that are general purpose enough such as hash sets/maps and sorted sets/maps and vectors may be possible as well.

The php-ds maintainers are also open to borrowing from the implementation of php-ds to implement new datastructures.

My position currently is that I think we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core. In the 4 or 5 years since I wrote this extension I've been studying persistent data structures in-depth and there are a lot of decisions that I made then that I would do differently now. I would like to be a part of the design and implementation of the data structures themselves, but I do not have the understanding or capacity to be involved in work relating to the engine or its integration. It seems unlikely that a 2.0 of this extension will come about, I'm not convinced that a complete rework would be a good investment of our time.

Proposed Voting Choices

Yes/No vote, requiring a 2/3 majority

References

Rejected Features

Why not use the php-ds/ext-ds PECL instead (as an end user)?

  1. No matter how useful or popular a PECL is, datastructures available in PHP's core will have much, much wider adoption in applications and libraries that are available in PECLs, allowing those applications and libraries to write faster and/or more memory efficient code.
  2. End users can make much stronger assumptions about the backwards compatibility and long-term availability of data structures that are included in core.
  3. The php-ds maintainers do not plan to merge the extension into php-src, and believe php-ds should coexist with new functionality being added in a separate namespace instead (see quote and later clarifications for full context)
  4. Opcache may be able to make stronger optimizations of internal classes found in php-src than any third party PECL. (e.g. because Collections\Deque::push() or Collections\Vector::push() would never throw or emit notices, it may be possible to optimize it to be even faster than appending to an array in the Opcache JIT compiler)

php-ds maintainer response

This has been asked about multiple times in threads on unrelated proposals (https://externals.io/message/112639#112641 and https://externals.io/message/93301#93301 years ago) throughout the years.

The php-ds's maintainer's position is that “we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core.”

On September 24, 2021, https://github.com/php-ds/ext-ds/issues/156#issuecomment-926353779 clarified their position

Hi everyone, I am happy to see this discussion and I thank you all for taking part. My reservation to merge ds into core has always been because I wanted to make sure we get it right before we do that and the intention behind the mythical v2 was to achieve that, based on learnings from v1 and feedback from the community. I have no personal attachment to this project, I only want what is best for PHP and the community.

I would love to see a dedicated, super-lean vec data structure in core that has native iteration and all the other same internal benefits as arrays. In my opinion, the API should be very minimal and potentially compatible with all the non-assoc array functions. An OO interface can easily be designed around that. I'm imagining something similar to Golang's slices.

As for the future of ds itself, I think these can co-exist and ds can remain external. I've been researching and designing immutable data structures over the last 4 years and I still hope to develop a v2 that simplifies the interfaces and introduces immutable structures. Attempting to implement a suite of structures in core or an OO vector would take a lot of work and might be difficult to reach consensus on with the API. I don't think we should attempt to merge ds into core at any time.

I am currently traveling and have not followed this discussion in detail on the mailing list. I'd be happy to assist in any way I can and will catch up as soon as I am home again this week. Feel free to quote this response on the mailing list as well.

On December 13, 2021, https://github.com/php-ds/ext-ds/issues/156#issuecomment-992773289

My position currently is that I think we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core. In the 4 or 5 years since I wrote this extension I've been studying persistent data structures in-depth and there are a lot of decisions that I made then that I would do differently now. I would like to be a part of the design and implementation of the data structures themselves, but I do not have the understanding or capacity to be involved in work relating to the engine or its integration. It seems unlikely that a 2.0 of this extension will come about, I'm not convinced that a complete rework would be a good investment of our time.

Minor differences in API design goals

Traditionally, PHP has been a very batteries included language. Existing functionality such as strings and arrays have very large standard libraries. This makes it easier to write code without depending on too many third party composer libraries, and knowledge of the standard library can transfer to any codebase a developer works on.

My hopes for ease of use, readability, speed, and static analysis in future data structures such as Vector are similar to those mentioned by Benjamin Morel in the GitHub issue:

Functions like map, filter, reduce can be delegated to other libraries that operate on iterable instead of having these as first-class members of the interface.

Again, I understand the rationale behind this decision, like reducing duplication and keeping only the core functionality in DS. However, sometimes you have to take into consideration ease of use vs purity of the code.

Ease of use / DX / readability: it seems more logical to me to do:

$map->filter(fn(...) => ...);

Rather than:

Some\filter($map, fn(...) => ...);

Speed: as you said, internal iteration is faster. And speed is one of the selling points of DS vs arrays.

Static analysis: I love the fact that Map::filter() can be strictly typed as returning Map<TKey, TValue> in Psalm, for example. If you rely on a generic filter() function, I'm not sure such strict typing will be easy or even possible.

Thank you for your work on DS anyway, I already use the extension in my closed-source project, in particular Map. I would love to use data structures in my open-source projects, one day! 🤞

Additionally, it may be inconvenient for end users (e.g. new contributors to projects) to remember specifics of multiple libraries or utility classes when working on different codebases, to deal with dependency conflicts after major version upgrades, or to deal with libraries dropping support for older php versions, getting abandoned, etc.

Discussion

Request: Returning $this instead of void (in php's internal classes, in general)

The Deque::shift() and Deque::push() methods currently return void. Larry Garfield suggests in https://externals.io/message/116100#116967

Request:

push() and unshift() currently return void. That's not helpful. It would be vastly more useful if they both returned $this. Not as much for chaining, but more so that you can add a value to a queue and pass it as an argument to another call (often recursive, but not necessarily) in a single operation.

Example: I was doing last year's Advent of Code in functional PHP, and had a stack walker that looked like this:

function parse(string $line, $pos = 0, array $stack = []): Result|string
{
    $next = $line[$pos] ?? null;
    $head = $stack[0] ?? null;
 
    return match ($next) {
        // Opening brace, push an "expected" onto the stack.
        '{' => parse($line, $pos + 1, ['}', ...$stack]),
        '<' => parse($line, $pos + 1, ['>', ...$stack]),
        '(' => parse($line, $pos + 1, [')', ...$stack]),
        '[' => parse($line, $pos + 1, [']', ...$stack]),
        '}', '>', ')', ']' => $next === $head ? parse($line, $pos + 1, array_slice($stack, 1)) : $next,
        null => count($stack) ? Result::Incomplete : Result::OK,
    };
}

The interesting part is the ['<', ...$stack], to pass on a modified version of an array-as-stack. That's of course annoyingly slow with arrays right now, and a Deque would be better, but only if it could be “modified and passed” like that. If not, it would be incompatible with single-expression usages (match statements, short lambdas, etc.)

Returning $this would resolve that, I think. (Making it return a new, immutable copy of the Deque would be even nicer, but I realize that's probably not an argument I'm going to win at this point on this RFC.)

My personal preference is against making this fluent and continuing to use void like we're already doing for other methods. I'd rather expose an efficient datastructure that's consistent with the rest of PHP's functionality to the extent possible, which userland can use to write their own fluent/non-fluent classes. There's drawbacks to returning $this, including:

Also, this isn't a request specifically about Collections\Deque or collections in general, but rather about the general design of classes in php

1. Inconsistency with existing APIs making remembering what does what harder. Barely anything in php that I remember returns $this.

https://www.php.net/manual/en/arrayobject.append.php returns void.

https://www.php.net/manual/en/function.array-push returns an int.

2. Inconsistency with possible new datastructures/methods

If a Map/Set function were to be added, then methods for add/remove would return booleans (or the old value), not $this

3. Slight additional performance overhead for functionality I assume will be used relatively infrequently

(php has to increment reference counts and opcache can't eliminate the opcode to decrease reference counts and possibly free the return value of $deque->unshift()/push() with the return type info being an object)

4. Returning $this makes code easier to write at some cost to readability (for users new to PHP or this class, though it may help readability to developers deeply familiar with parts of standard library using it) - Developers new to php or using Collections\Deque for the first time would not immediately know what the code they're reading is doing. (less of a problem with a good IDE, typechecker, and a typed codebase, but this isn't universal) Having it return void, return $deque->push() would be less common and this would force the meaning to be clear.

Developers might have several guesses/assumptions based on their experience with other methods in php/elsewhere

  • It returns the new count (JavaScript Array.push, array_push)
  • It returns $this (Ruby)
  • It returns a lazy copy, like you'd wanted, not modifying the original
  • It's returning void and the code in question is shorthand for return null.

(Python, C++ https://www.cplusplus.com/reference/vector/vector/push_back/ , offsetSet and spl push()/unshift() methods)

Adding RETURN_OBJ_COPY(Z_OBJ_P(ZEND_THIS)); and a different method with the : Deque return type, fluent(push+return $this) was around 6-8% slower for me for push operations than push (returning void) (aside: method calls in php have a high overhead, the engine is better at optimizing []= if specialized handlers exist. Benchmarks were run twice to give an better idea of how much of the difference was random.):

5. I won't know if anyone is strongly against fluent interfaces belonging in core, or strongly against adopting fluent interfaces 1)

1)
'return $this) in an ad-hoc way. Changing it here would make it unclear if the vote was for/against the overall functionality or the new design choice of return $this;.
I'd rather not do this without evidence this is widely considered a better design by the community (e.g. the change to a namespace had a policy RFC https://wiki.php.net/rfc/namespaces_in_bundled_extensions). 6. If we decide we want to change the return type of a method (in general, not for push/unshift), then it's less of a breaking change to change it from
void (where less code would use the return value) than from $this (where there may be existing code such as $obj->method()->otherMethod()). <code> Results for php 8.2.0-dev debug=false with opcache enabled=true Appending to Collections\Deque []= : n= 4 iterations=10000000, create+destroy time=1.204 result=0 Appending to Collections\Deque push : n= 4 iterations=10000000, create+destroy time=1.551 result=0 Appending to Collections\Deque fluent: n= 4 iterations=10000000, create+destroy time=1.640 result=0 Appending to Collections\Deque []= : n= 4 iterations=10000000, create+destroy time=1.213 result=0 Appending to Collections\Deque push : n= 4 iterations=10000000, create+destroy time=1.549 result=0 Appending to Collections\Deque fluent: n= 4 iterations=10000000, create+destroy time=1.636 result=0 Appending to Collections\Deque []= : n= 8 iterations= 5000000, create+destroy time=0.923 result=0 Appending to Collections\Deque push : n= 8 iterations= 5000000, create+destroy time=1.263 result=0 Appending to Collections\Deque fluent: n= 8 iterations= 5000000, create+destroy time=1.335 result=0 Appending to Collections\Deque []= : n= 8 iterations= 5000000, create+destroy time=0.921 result=0 Appending to Collections\Deque push : n= 8 iterations= 5000000, create+destroy time=1.261 result=0 Appending to Collections\Deque fluent: n= 8 iterations= 5000000, create+destroy time=1.335 result=0 Appending to Collections\Deque []= : n= 1024 iterations= 100000, create+destroy time=1.246 result=0 Appending to Collections\Deque push : n= 1024 iterations= 100000, create+destroy time=1.972 result=0 Appending to Collections\Deque fluent: n= 1024 iterations= 100000, create+destroy time=2.150 result=0 Appending to Collections\Deque []= : n= 1024 iterations= 100000, create+destroy time=1.251 result=0 Appending to Collections\Deque push : n= 1024 iterations= 100000, create+destroy time=1.969 result=0 Appending to Collections\Deque fluent: n= 1024 iterations= 100000, create+destroy time=2.146 result=0 Appending to Collections\Deque []= : n= 1048576 iterations= 100, create+destroy time=2.450 result=0 Appending to Collections\Deque push : n= 1048576 iterations= 100, create+destroy time=3.172 result=0 Appending to Collections\Deque fluent: n= 1048576 iterations= 100, create+destroy time=3.371 result=0 Appending to Collections\Deque []= : n= 1048576 iterations= 100, create+destroy time=2.455 result=0 Appending to Collections\Deque push : n= 1048576 iterations= 100, create+destroy time=3.183 result=0 Appending to Collections\Deque fluent: n= 1048576 iterations= 100, create+destroy time=3.376 result=0 </code> ===== Appendix ===== ==== Benchmark Source Code ==== === Benchmarking repeated push and shift operations === <code php> <?php const PUSH_SHIFT_CYCLES = 2; function bench_array(int $n, int $iterations) { $totalReadTime = 0.0; $startTime = hrtime(true); $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = []; for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values[] = $i; } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { Pretend we don't know the actual position of the first array key for this simulated benchmark array_shift is not used because it is linear time (to move all other elements) rather than constant time. This approach is simple at the cost of memory - it converts a packed array to a non-packed array NOTE: Adding a call to reset() is not necessary in this case, and would result in worse performance. NOTE: array_key_first results in quadratic performance for this synthetic benchmark. $key = array_key_first($values); $key = key($values); $total += $values[$key]; unset($values[$key]); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Push then shift from array: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } function bench_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Collections\Deque(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values[] = $i; } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Push then shift from Collections\Deque: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } function bench_ds_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Ds\Deque(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values[] = $i; } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Push then shift from Ds\Deque: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } SplDoublyLinkedList is a linked list that takes more memory than needed. Access to values in the middle of the SplDoublyLinkedList is also less efficient. function bench_spl_dll(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new SplDoublyLinkedList(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values->push($i); } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); Random access is linear time in a linked list, so use foreach instead while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Push then shift from SplDoublyLinkedList: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } $n = 220; $iterations = 10; $sizes = [ [1, 8000000], [4, 2000000], [8, 1000000], [1024, 20000], [220, 20], ]; echo “Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from the start of the collection)\n”; printf( “Results for php %s debug=%s with opcache enabled=%s\n\n”, PHP_VERSION, PHP_DEBUG ? 'true' : 'false', json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false)) ); foreach ($sizes as [$n, $iterations]) { bench_array($n, $iterations); bench_deque($n, $iterations); bench_ds_deque($n, $iterations); bench_spl_dll($n, $iterations); echo “\n”; } </code> === Benchmarking only appending to a Deque and reading elements without removal === <code php> <?php function bench_array(int $n, int $iterations) { $totalReadTime = 0.0; $startTime = hrtime(true); $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = []; for ($i = 0; $i < $n; $i++) { $values[] = $i; } $startReadTime = hrtime(true); for ($i = 0; $i < $n; $i++) { $total += $values[$i]; } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; $endMemory = memory_get_usage(); unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“Appending to array: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n”, $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total); } function bench_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Collections\Deque(); for ($i = 0; $i < $n; $i++) { $values[] = $i; } $startReadTime = hrtime(true); for ($i = 0; $i < $n; $i++) { $total += $values[$i]; } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; $endMemory = memory_get_usage(); unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“Appending to Collections\Deque: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n”, $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total); } function bench_ds_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Ds\Deque(); for ($i = 0; $i < $n; $i++) { $values[] = $i; } $startReadTime = hrtime(true); for ($i = 0; $i < $n; $i++) { $total += $values[$i]; } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; $endMemory = memory_get_usage(); unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“Appending to Ds\Deque: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n”, $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total); } SplStack is a subclass of SplDoublyLinkedList, so it is a linked list that takes more memory than needed. Access to values in the middle of the SplStack is also less efficient. function bench_spl_stack(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new SplStack(); for ($i = 0; $i < $n; $i++) { $values->push($i); } $startReadTime = hrtime(true); Random access is linear time in a linked list, so use foreach instead foreach ($values as $value) { $total += $value; } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; $endMemory = memory_get_usage(); unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“Appending to SplStack: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n”, $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total); } function bench_spl_fixed_array(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new SplFixedArray(); for ($i = 0; $i < $n; $i++) { Imitate how push() would be implemented in a situation where the number of elements wasn't actually known ahead of time. erealloc() tends to extend the existing array when possible. $size = $values->getSize(); $values->setSize($size + 1); $values->offsetSet($size, $i); } $startReadTime = hrtime(true); for ($i = 0; $i < $n; $i++) { $total += $values[$i]; } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; $endMemory = memory_get_usage(); unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“Appending to SplFixedArray: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n\n”, $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total); } $n = 220; $iterations = 10; $sizes = [ [1, 8000000], [4, 2000000], [8, 1000000], [220, 20], ]; printf( “Results for php %s debug=%s with opcache enabled=%s\n\n”, PHP_VERSION, PHP_DEBUG ? 'true' : 'false', json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false)) ); foreach ($sizes as [$n, $iterations]) { bench_array($n, $iterations); bench_deque($n, $iterations); bench_ds_deque($n, $iterations); bench_spl_stack($n, $iterations); bench_spl_fixed_array($n, $iterations); echo “\n”; } </code> === Benchmark adding to the front then removing from the front === <code php> <?php const PUSH_SHIFT_CYCLES = 2; function bench_array(int $n, int $iterations) { $totalReadTime = 0.0; $startTime = hrtime(true); $total = 0; if ($n > 1024) { echo “ Skip array n=$n iterations=$iterations quadratic time too slow \n”; return; } for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = []; for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { array_unshift($values, $i); } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { $total += array_shift($values); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Unshift then shift from start of array: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } function bench_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Collections\Deque(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values[] = $i; } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Unshift then shift from start of Collections\Deque: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } function bench_ds_deque(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new Ds\Deque(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values->unshift($i); } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Unshift then shift from start of Ds\Deque: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } SplDoublyLinkedList is a linked list that takes more memory than needed. Access to values in the middle of the SplDoublyLinkedList is also less efficient. function bench_spl_dll(int $n, int $iterations) { $startTime = hrtime(true); $totalReadTime = 0.0; $total = 0; for ($j = 0; $j < $iterations; $j++) { $startMemory = memory_get_usage(); $values = new SplDoublyLinkedList(); for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) { for ($i = 0; $i < $n; $i++) { $values->unshift($i); } $maxMemory = memory_get_usage(); $startReadTime = hrtime(true); Random access is linear time in a linked list, so use foreach instead while (count($values) > 0) { $total += $values->shift(); } $endReadTime = hrtime(true); $totalReadTime += $endReadTime - $startReadTime; } unset($values); } $endTime = hrtime(true); $totalTime = ($endTime - $startTime) / 1000000000; $totalReadTimeSeconds = $totalReadTime / 1000000000; printf(“2x Unshift then shift from start of SplDoublyLinkedList: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n”, $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total); } $n = 220; $iterations = 10; $sizes = [ [1, 8000000], [4, 2000000], [8, 1000000], [64, 100000], [128, 50000], [1024, 2000], [220, 20], ]; echo “Test creating a collection, then two cycles of push+shift (unshifting n values to the start of collections then shifting all of them from front of collection)\n”; echo “Moving the front of an array requires moving all elements to their new position, so the total time for array is quadratic\n”; printf( “Results for php %s debug=%s with opcache enabled=%s\n\n”, PHP_VERSION, PHP_DEBUG ? 'true' : 'false', json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false)) ); foreach ($sizes as [$n, $iterations]) { bench_array($n, $iterations); bench_deque($n, $iterations); bench_ds_deque($n, $iterations); bench_spl_dll($n, $iterations); echo “\n”; } </code> ===== Changelog ===== * 0.2: Link to definition of Deque. * 0.3: Add php-ds maintainer's response clarifications * 0.4: Update benchmarks after https://github.com/php/php-src/pull/7491 memory and time optimizations for packed arrays were merged into PHP 8.2 * 0.5: Change proposed iteration behavior to iterate over the original Deque rather than a copy, having the Deque account for shift/unshift modifying the position of the start of the array (to avoid iterating over elements too many/few times if the front of the internal representation changes position). Update benchmark result. Update responses from php-ds maintainers. * 0.6: Rename to Collections\Deque''
  • 0.6.1: Update section for rationale on namespace choice. Link to WebAssembly demo.
  • 0.6.2: Update benchmarks code/results after iteration implementation change in 0.5.
  • 0.7.0: Remove get()/set() helpers due to overlap with offsetGet/offsetSet functionality - introduction of typed get()/set() was suggested as better suited for a separate proposal for all data structures.
  • 0.8.0: Change bottom()/top() to first()/last()
rfc/deque.1643984508.txt.gz · Last modified: 2022/02/04 14:21 by tandre