rfc:deque

This is an old revision of the document!


PHP RFC: final class Collections\Deque

Introduction

While SplDoublyLinkedList and its subclass SplQueue/SplStack exist in the SPL, they have several drawbacks (that are addressed by this RFC to add a Deque class):

  1. SplDoublyLinkedList is internally represented by a doubly linked list, making it use roughly twice as much memory as the proposed Deque
  2. push/pop/unshift/shift from SplDoublyLinkedList are slower due to needing to allocate or free the linked list nodes.
  3. Reading values in the middle of the SplDoublyLinkedList is proportional to the length of the list, due to needing to traverse the linked list nodes.
  4. foreach iteration behavior cannot be understood without knowing what most recently constructed the SplDoublyLinkedList instance or set the flags.

It would be useful to have an efficient Deque container in the standard library to provide an alternative without those drawbacks, as well as for the following reasons:

  1. To provide a better alternative to SplDoublyLinkedList, SplStack, and SplQueue for use cases that require stacks or queues.
  2. As a more efficient option than array and SplDoublyLinkedList as a queue or Deque, for workflows requiring shift or unshift.
  3. To be able to reclaim memory in applications that run for long periods of time. Notably, PHP's array type will never release allocated capacity.

A Deque is more efficient than an array when used as a queue, more readable, and easier to use correctly. While it is possible to efficiently remove elements from the start of an array (in terms of insertion order), it is very inefficient to prepend elements to the start of a large array due to needing to either copy the array or move all elements in the internal array representation, and an array would use much more memory than a Deque when used that way (and be slower).

There are also several pitfalls to using an array as a queue for larger queue sizes, some of which are not obvious and discovered while writing the benchmarks. (Having a better (double-ended) queue datastructure (Collections\Deque) than the SplDoublyLinkedList would save users from needing to write code with these pitfalls):

  1. array_key_first() takes time proportional to the number of elements unset from the start of an array, causing it to unexpectedly be extremely slow (quadratic time) after unsetting many elements at the start of the queue. (when the array infrequently runs out of capacity, buckets are moved to the front)
  2. reset() or end() will convert a variable to a reference, and php is significantly less efficient at reading or writing to reference. Opcache is also less efficient at optimizing uses of variables using references.
  3. More obviously, array_unshift and array_shift will take time proportional to the number of elements in the array (to reindex and move existing/remaining elements).

Proposal

This proposes adding the class final class Collections\Deque to PHP. From the Wikipedia article for the Double-Ended Queue:

In computer science, a double-ended queue (abbreviated to deque, pronounced deck, like “cheque”) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail).

(omitted sections, see article)

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

  1. Storing deque contents in a circular buffer, and only resizing when the buffer becomes full. This decreases the frequency of resizings.
  2. (omitted, see article)

Similarly to deques in other languages (such as Python collections.deque, C++ std::deque, Rust std::collections::VecDeque, etc.) this is backed by a memory-efficient representation with good cache locality and provides constant amortized-time push/pop/shift/unshift operations. (this RFC's implementation is a circular buffer represented as a raw C array of values with a size, offset, and capacity)

This reclaims memory that is no longer used (when less than a quarter of the capacity is used), unlike array

Similarly to ArrayObject and SplDoublyLinkedList, the $x[] = $value shorthand for appending to the Deque is supported in addition to ArrayAccess functionality.

The proposed API is:

namespace Collections;
 
/**
 * A double-ended queue represented internally as a circular buffer.
 * This has much lower memory usage than SplDoublyLinkedList or its subclasses (SplStack, SplStack),
 * and operations are significantly faster than SplDoublyLinkedList.
 *
 * This supports amortized constant time pushing and popping onto the front or back of the array.
 *
 * Naming is based on https://www.php.net/spldoublylinkedlist
 * and on array_push/pop/unshift/shift.
 */
final class Deque implements IteratorAggregate, Countable, JsonSerializable, ArrayAccess
{
    /** Construct the Deque from the values of the Traversable/array, ignoring keys */
    public function __construct(iterable $iterator = []) {}
    /**
     * Returns an iterator that accounts for calls to shift/unshift tracking the position 
     * of the front of the Deque.
     * Calls to shift/unshift will do the following:
     * - Increase/Decrease the value returned by the iterator's key()
     *   by the number of elements added/removed to/from the front of the Deque.
     *   (`$deque[$iteratorKey] === $iteratorValue` at the time the key and value are returned).
     * - Repeated calls to shift will cause valid() to return false if the iterator's
     *   position ends up before the start of the Deque at the time iteration resumes.
     * - They will not cause the remaining values to be iterated over more than once or skipped.
     */
    public function getIterator(): InternalIterator {}
    /** Returns the number of elements in the Deque. */
    public function count(): int {}
    /** Returns true if there are 0 elements in the Deque. */
    public function isEmpty(): bool {}
    /** Removes all elements from the Deque. */
    public function clear(): void {}
 
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
    /** Construct the Deque from the values of the array, ignoring keys */
    public static function __set_state(array $array): Deque {}
 
    /** Appends value(s) to the end of the Deque. */
    public function push(mixed ...$values): void {}
    /** Prepends value(s) to the start of the Deque. */
    public function unshift(mixed ...$values): void {}
    /** Pops a value from the end of the Deque. */
    public function pop(): mixed {}
    /** Shifts a value from the front of the Deque. */
    public function shift(): mixed {}
 
    /** Peeks at the value at the start of the Deque, throws if empty */
    public function bottom(): mixed {}
    /** Peeks at the value at the end of the Deque, throws if empty */
    public function top(): mixed {}
 
    /** Returns a list of the elements from front to back. */
    public function toArray(): array {}
    /* Get and set are strictly typed, unlike offsetGet/offsetSet. */
    public function get(int $offset): mixed {}
    public function set(int $offset, mixed $value): void {}
    // Must be mixed for compatibility with ArrayAccess
    public function offsetGet(mixed $offset): mixed {}
    public function offsetExists(mixed $offset): bool {}
    public function offsetSet(mixed $offset, mixed $value): void {}
    /** Throws unconditionally */
    public function offsetUnset(mixed $offset): void {}
 
    /** This is JSON serialized as a JSON array with elements from front to back */
    public function jsonSerialize(): array {}
}

Arguments for using this

What applications would benefit from Deque?

As mentioned in https://en.wikipedia.org/wiki/Queue_(abstract_data_type)

Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer. Another usage of queues is in the implementation of breadth-first search.

Some examples of how queues are used in php applications and libraries are below:

  1. Composer: https://github.com/composer/composer/blob/2.1.8/src/Composer/DependencyResolver/RuleSetGenerator.php#L157-L203 uses a SplQueue internally as a work queue in its dependency resolution logic.
  2. Event-Driven frameworks: https://github.com/reactphp/event-loop/blob/v1.2.0/src/Tick/FutureTickQueue.php uses SplQueue internally to hold callback functions to call later in the order they were added.
  3. Networking code: Predis, a userland client for the Redis database, currently uses SplQueue to build a pipeline of commands and associate pipelined outgoing requests with their incoming responses in the order the requests were sent. https://github.com/predis/predis/blob/v1.1.7/src/Pipeline/Pipeline.php#L121-L150

Once applications and libraries such as those were able to use Collections\Deque (e.g. when the minimum PHP version requirement was raised), they would benefit from the speedup and reduced memory that Collections\Deque has over SplDoublyLinkedList/SplQueue.

Looking at the benchmarks section, SplQueue(and the parent class SplDoublyLinkedList) is typically slower than array (though array has higher reported memory usage). By introducing a data structure (Deque) that's even faster and more memory usage than an array for use as a double-ended queue, even more applications would benefit from it.

  • Additionally, Deque would allow some code to be refactored or written in a more readable way in cases where php developers would previously be avoiding using SplDoublyLinkedList (or SplStack/SplQueue) due to older datastructures having time and memory usage inefficiencies.

https://www.php.net/manual/en/class.splqueue.php#116422 is another example

You can use shift/unshift and push/pop to dequeue/undequeue and queue/unqueue respectively. Really handy for those applications that use sockets where you might not know you can't send data until you attempt to.

for example, this is a function for an application that will un-dequeue the remainder of the data if socket_write indicated it did not write the entire contents of the provided data.

(omitted code snippet from https://www.php.net/manual/en/class.splqueue.php#116422)

Even users that don't write code using Deque may still benefit from its inclusion, due to improved performance and reduced memory usage in the applications and libraries that do adopt it.

Why use this instead of SplDoublyLinkedList?

  1. Pushing/popping/shifting/unshifting takes roughly 30% less time.
  2. Deque typically uses less memory than SplDoublyLinkedList.
  3. The iteration order is predictable, unlike SplDoublyLinkedList, where it can be modified with flags
  4. Constant-time access for reading or modifying elements at any position in the Deque, unlike SplDoublyLinkedList where that would require walking the linked list.

Why use this instead of array?

  1. Faster than using array for queue-like workloads (e.g. around 13%-37% less time to add to the end and remove from the front depending on Deque size and access pattern in an array implementation that correctly avoids linear-time operations).
  2. It is impossible to prepend to an array (i.e. to be first in insertion order) in constant time. array_unshift takes time proportional to the length of an array.
  3. Deque uses much less memory than an array when used as a queue, especially since that will eventually convert an array to an associative array. See https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html and benchmarks. Note that starting in php 8.2, array
  4. Modifying an object by value can be faster than modifying an array by reference. Sometime, it is necessary for a library to modify a passed in collection/array in place (e.g. as an internal implementation detail) (e.g. appending to a list of errors and returning a boolean, converting a binary tree to a list of nodes where some predicate applies). The only way for a method to modify a passed in array parameter is by reference, and opcache is not able to optimize reference parameters/variables because their type at runtime cannot be guaranteed, passing in an object by value can be much faster than passing an array by reference.

    (Other approaches may end up copying arrays repeatedly, and take quadratic time instead of linear time, causing performance issues for large inputs).

    (Right now, some libraries do pass arrays by reference, because of how inefficient SplDoublyLinkedList is expected to be to create, modify, and read)

Also see the RFC introduction for pitfalls with trying to use a PHP array like a queue.

Caveats

For stack-like workflows that only require pushing and popping from the end of a data structure, array uses roughly the same amount of memory as Deque and is faster than Deque due to Use more compact representation for packed arrays. being merged into PHP 8.2

Usage

Examples of how each of these methods in Deque can be used can be found at https://www.php.net/spldoublylinkedlist - in most cases this can be used as a substitute for SplDoublyLinkedList.

See Implementation for the public API of Deque.

Implementation Choices

"Collections\" namespace

This name was chosen due to https://wiki.php.net/rfc/deque_straw_poll

https://wiki.php.net/rfc/namespaces_in_bundled_extensions recently passed. It offers guidance in choosing namespace names and allows for using namespaces in new categories of functionality.

Accepting an iterable

This accepts the values of the iterable in the order of iteration. Keys of the iterable are ignored (Because this is meant to be a double-ended queue, adding placeholders would not make sense if there were gaps in the array.)

Final Class

If this were overridable, this would have the following drawbacks:

  1. There would not be as strong guarantees to readers of PHP code using Deque (or even opcache, if optimizations were added targeting objects) that elements were actually a vector or that certain methods would/wouldn't throw certain exceptions, or that iteration would be possible. (if getIterator, ArrayAccess methods, etc. were overridable)
  2. More memory and runtime checks would be required to check if this was the original class or a subclass when fetching a value
  3. More memory would be required to mitigate any unexpected issues seen in the way subclasses overrid or used the base class.
  4. There would be a larger chance the implementation would have discovered or undiscovered bugs due to userland subclasses of Deque, in serialization/unserialization, reads or writes, resizing, future functionality added to PHP, opcache (if opcache optimizations were added), or future methods added to Deque, or causes that were not even thought of yet.

Making all functionality final turns out to be the same approach that the PECL project https://github.com/php-ds/ext-ds used for its datastructures. The php-ds's maintainers reasons are mentioned in https://medium.com/@rtheunissen/efficient-data-structures-for-php-7-9dda7af674cd and summarized or quoted below

  • “Prefer composition over inheritance”
  • “Inheritance would also introduce unnecessary internal complexity”
  • Avoid multiple methods doing the same thing (e.g. SplDeque has a redundant enqueue methods to add to the end of an array because the base class already had push())
  • “They are designed to be self-contained, much like an array. You can't extend an array, so we design our own APIs around it by using an internal array to store the actual data.”

push/pop/shift/unshift (and top()/bottom())

This is consistent with the name used for array_push()/array_pop()/array_shift()/array_unshift(), as well as names used for SplDoublyLinkedList

Backward Incompatible Changes

The class name \Collections\Deque is now used by PHP, and it will be a compilation error to declare classlikes of the same name in the namespace Collections since the class already exists.

Proposed PHP Version(s)

8.2

RFC Impact

To Opcache

None

Unaffected PHP Functionality

PHP's type system remains unchanged (e.g. array) - final class Collections\Deque is a class, and instances of Collections\Deque are ordinary objects.

Benchmarks

Two cycles of appending n values then shifting them from the front

Note that it is possible to have constant time removal from the front of a PHP array efficiently (as long as key stays at the front of the array), but it is not possible to have constant time prepending (`unshift`) to the front of an array. array_unshift is a linear time operation (takes time proportional to the current array size). So unshift is not included in these benchmarks.

Because there's a second cycle in this benchmark, array becomes an associative array and uses more memory than a packed array (https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html). (At the time of writing, a packed array uses double the memory of a Deque, though there is an unrelated change in review proposing reducing the memory usage of packed arrays).

memory_get_usage is not counting the memory overhead of tracking the allocations of a lot of small objects, so the memory usage of SplDoublyLinkedList is under-reported. SplQueue is a subclass of SplDoublyLinkedList and I expect it would have the same performance.

Deque and array both always capacities that are powers of 2. This benchmark tests the best-case memory usage for Deque and array

Deque is faster than SplDoublyLinkedList at all sizes, and faster than array at medium and large sizes. The maximum memory usage is also noticeably lower than both array and SplDoublyLinkedList

Caveats for comparison with [https://github.com/php-ds/ext-ds](Ds\Deque PECL):

  • Ds\Deque was compiled and installed as a shared extension with phpize; ./configure; make install (shared reflects the default way to install PECL modules, e.g. what pecl install or an OS package manager would do or what copying the windows DLL would do), not statically compiled into php
  • Both Deque and Ds\Deque use circular buffers, so the performance is expected to be about the same.
  • This php-src PR has micro-optimizations such as storing the mask (instead of capacity) for faster bitwise ands, as well as the use of the EXPECTED/UNEXPECTED macros to hint the fast path to the compiler.
  • Having the extra step of installing a PECL may discourage application/library authors from writing code using a PECL or discourage some users from using those applications/libraries. While php-ds has a polyfill, the polyfill would be slower than native solutions - https://github.com/php-ds/ext-ds#compatibility
Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from front of collection)
Results for php 8.2.0-dev debug=false with opcache enabled=true
 
2x Push then shift from array:               n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=1.125 shift time = 1.017 total time = 2.142 result=0
2x Push then shift from Collections\Deque:   n=       1 iterations= 8000000
=> max memory=     144 bytes, create+destroy time=1.567 shift time = 0.867 total time = 2.434 result=0
2x Push then shift from Ds\Deque:            n=       1 iterations= 8000000
=> max memory=     216 bytes, create+destroy time=1.904 shift time = 0.969 total time = 2.873 result=0
2x Push then shift from SplDoublyLinkedList: n=       1 iterations= 8000000
=> max memory=     184 bytes, create+destroy time=1.988 shift time = 0.941 total time = 2.929 result=0
 
2x Push then shift from array:               n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.394 shift time = 0.671 total time = 1.065 result=24000000
2x Push then shift from Collections\Deque:   n=       4 iterations= 2000000
=> max memory=     144 bytes, create+destroy time=0.509 shift time = 0.458 total time = 0.967 result=24000000
2x Push then shift from Ds\Deque:            n=       4 iterations= 2000000
=> max memory=     216 bytes, create+destroy time=0.611 shift time = 0.528 total time = 1.139 result=24000000
2x Push then shift from SplDoublyLinkedList: n=       4 iterations= 2000000
=> max memory=     280 bytes, create+destroy time=0.993 shift time = 0.510 total time = 1.503 result=24000000
 
2x Push then shift from array:               n=       8 iterations= 1000000
=> max memory=     376 bytes, create+destroy time=0.293 shift time = 0.611 total time = 0.904 result=56000000
2x Push then shift from Collections\Deque:   n=       8 iterations= 1000000
=> max memory=     208 bytes, create+destroy time=0.387 shift time = 0.471 total time = 0.859 result=56000000
2x Push then shift from Ds\Deque:            n=       8 iterations= 1000000
=> max memory=     216 bytes, create+destroy time=0.410 shift time = 0.486 total time = 0.896 result=56000000
2x Push then shift from SplDoublyLinkedList: n=       8 iterations= 1000000
=> max memory=     408 bytes, create+destroy time=0.784 shift time = 0.470 total time = 1.253 result=56000000
 
2x Push then shift from array:               n=    1024 iterations=   20000
=> max memory=   41016 bytes, create+destroy time=0.430 shift time = 1.380 total time = 1.810 result=20951040000
2x Push then shift from Collections\Deque:   n=    1024 iterations=   20000
=> max memory=   16464 bytes, create+destroy time=0.428 shift time = 0.902 total time = 1.329 result=20951040000
2x Push then shift from Ds\Deque:            n=    1024 iterations=   20000
=> max memory=   16472 bytes, create+destroy time=0.494 shift time = 1.102 total time = 1.596 result=20951040000
2x Push then shift from SplDoublyLinkedList: n=    1024 iterations=   20000
=> max memory=   32920 bytes, create+destroy time=1.504 shift time = 0.959 total time = 2.463 result=20951040000
 
2x Push then shift from array:               n= 1048576 iterations=      20
=> max memory=41943120 bytes, create+destroy time=0.889 shift time = 1.420 total time = 2.309 result=21990211584000
2x Push then shift from Collections\Deque:   n= 1048576 iterations=      20
=> max memory=16777320 bytes, create+destroy time=0.856 shift time = 1.017 total time = 1.873 result=21990211584000
2x Push then shift from Ds\Deque:            n= 1048576 iterations=      20
=> max memory=16777328 bytes, create+destroy time=0.889 shift time = 1.023 total time = 1.913 result=21990211584000
2x Push then shift from SplDoublyLinkedList: n= 1048576 iterations=      20
=> max memory=33554584 bytes, create+destroy time=1.572 shift time = 1.006 total time = 2.578 result=21990211584000

Only appending to a Deque and reading elements without removal

Note that the proposed Deque as well as the existing SplDoublyLinkedList/SplStack are expected to perform equally well at shifting to (adding to) or unshifting from(removing from) the front of an array. (Shifting or unshifting (or unsetting) the front is inefficient for arrays)

For this benchmark, Deque can be created and read from faster than the fastest methods of reading SplStack or SplFixedArray.

  • Note that a foreach is used instead of random access for SplStack, since SplStack random access time is proportional to the number of linked list nodes that need to be iterated over to read a value.
  • SplFixedArray would be faster to append to if it had a push() method, but there isn't one.

Major notes:

  • Collections\Deque and SplFixedArray use less memory than other available options.
  • Collections\Deque is faster than object data structures currently available in the SPL.
  • In php 8.2+, array is faster than all object data structures, and memory usage of large packed arrays is half of what would be used in 8.1 due to https://github.com/php/php-src/pull/7491
Results for php 8.2.0-dev debug=false with opcache enabled=true
 
Appending to array:         n=       1 iterations= 8000000 memory=     216 bytes, create+destroy time=0.612 read time = 0.303 result=0
Appending to Deque:         n=       1 iterations= 8000000 memory=     144 bytes, create+destroy time=1.009 read time = 0.365 result=0
Appending to SplStack:      n=       1 iterations= 8000000 memory=     184 bytes, create+destroy time=1.735 read time = 0.725 result=0
Appending to SplFixedArray: n=       1 iterations= 8000000 memory=      80 bytes, create+destroy time=1.816 read time = 0.438 result=0
 
 
Appending to array:         n=       4 iterations= 2000000 memory=     216 bytes, create+destroy time=0.219 read time = 0.119 result=12000000
Appending to Deque:         n=       4 iterations= 2000000 memory=     144 bytes, create+destroy time=0.340 read time = 0.176 result=12000000
Appending to SplStack:      n=       4 iterations= 2000000 memory=     280 bytes, create+destroy time=0.732 read time = 0.295 result=12000000
Appending to SplFixedArray: n=       4 iterations= 2000000 memory=     128 bytes, create+destroy time=1.186 read time = 0.246 result=12000000
 
 
Appending to array:         n=       8 iterations= 1000000 memory=     216 bytes, create+destroy time=0.147 read time = 0.086 result=28000000
Appending to Deque:         n=       8 iterations= 1000000 memory=     208 bytes, create+destroy time=0.244 read time = 0.163 result=28000000
Appending to SplStack:      n=       8 iterations= 1000000 memory=     408 bytes, create+destroy time=0.529 read time = 0.235 result=28000000
Appending to SplFixedArray: n=       8 iterations= 1000000 memory=     192 bytes, create+destroy time=1.057 read time = 0.222 result=28000000
 
 
Appending to array:         n= 1048576 iterations=      20 memory=16781392 bytes, create+destroy time=0.437 read time = 0.157 result=10995105792000
Appending to Deque:         n= 1048576 iterations=      20 memory=16777320 bytes, create+destroy time=0.528 read time = 0.305 result=10995105792000
Appending to SplStack:      n= 1048576 iterations=      20 memory=33554584 bytes, create+destroy time=0.884 read time = 0.401 result=10995105792000
Appending to SplFixedArray: n= 1048576 iterations=      20 memory=16777304 bytes, create+destroy time=2.575 read time = 0.428 result=10995105792000

Future Scope

If Collections\Deque is added, there would be plenty of time for myself or others to propose additional methods before PHP 8.2's feature freeze (probably in July 2022)

Future additions to https://github.com/TysonAndre/pecl-teds that are general purpose enough may be possible as well.

The php-ds maintainers are also open to borrowing from the implementation of php-ds to implement new datastructures.

My position currently is that I think we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core. In the 4 or 5 years since I wrote this extension I've been studying persistent data structures in-depth and there are a lot of decisions that I made then that I would do differently now. I would like to be a part of the design and implementation of the data structures themselves, but I do not have the understanding or capacity to be involved in work relating to the engine or its integration. It seems unlikely that a 2.0 of this extension will come about, I'm not convinced that a complete rework would be a good investment of our time.

Proposed Voting Choices

Yes/No vote, requiring a 2/3 majority

References

Rejected Features

Why not use php-ds/ext-ds instead?

  1. No matter how useful or popular a PECL is, datastructures available in PHP's core will have much, much wider adoption in applications and libraries that are available in PECLs, allowing those applications and libraries to write faster and/or more memory efficient code.
  2. End users can make much stronger assumptions about the backwards compatibility and long-term availability of data structures that are included in core.
  3. The php-ds maintainers do not plan to merge the extension into php-src, and believe php-ds should coexist with new functionality being added in a separate namespace instead (see quote and later clarifications for full context)
  4. Opcache may be able to make stronger optimizations of internal classes found in php-src than any third party PECL. (e.g. because Collections\Deque::push() or Collections\Vector::push() would never throw or emit notices, it may be possible to optimize it to be even faster than appending to an array in the Opcache JIT compiler)

php-ds maintainer response

This has been asked about multiple times in threads on unrelated proposals (https://externals.io/message/112639#112641 and https://externals.io/message/93301#93301 years ago) throughout the years.

The php-ds's maintainer's position is that “we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core.”

On September 24, 2021, https://github.com/php-ds/ext-ds/issues/156#issuecomment-926353779

Hi everyone, I am happy to see this discussion and I thank you all for taking part. My reservation to merge ds into core has always been because I wanted to make sure we get it right before we do that and the intention behind the mythical v2 was to achieve that, based on learnings from v1 and feedback from the community. I have no personal attachment to this project, I only want what is best for PHP and the community.

I would love to see a dedicated, super-lean vec data structure in core that has native iteration and all the other same internal benefits as arrays. In my opinion, the API should be very minimal and potentially compatible with all the non-assoc array functions. An OO interface can easily be designed around that. I'm imagining something similar to Golang's slices.

As for the future of ds itself, I think these can co-exist and ds can remain external. I've been researching and designing immutable data structures over the last 4 years and I still hope to develop a v2 that simplifies the interfaces and introduces immutable structures. Attempting to implement a suite of structures in core or an OO vector would take a lot of work and might be difficult to reach consensus on with the API. I don't think we should attempt to merge ds into core at any time.

I am currently traveling and have not followed this discussion in detail on the mailing list. I'd be happy to assist in any way I can and will catch up as soon as I am home again this week. Feel free to quote this response on the mailing list as well.

On December 13, 2021, https://github.com/php-ds/ext-ds/issues/156#issuecomment-992773289

My position currently is that I think we should start from scratch and borrow whatever is good from php-ds that we want to keep to implement something natively in core. In the 4 or 5 years since I wrote this extension I've been studying persistent data structures in-depth and there are a lot of decisions that I made then that I would do differently now. I would like to be a part of the design and implementation of the data structures themselves, but I do not have the understanding or capacity to be involved in work relating to the engine or its integration. It seems unlikely that a 2.0 of this extension will come about, I'm not convinced that a complete rework would be a good investment of our time.

Minor differences in API design goals

Traditionally, PHP has been a very batteries included language. Existing functionality such as strings and arrays have very large standard libraries. This makes it easier to write code without depending on too many third party composer libraries, and knowledge of the standard library can transfer to any codebase a developer works on.

My hopes for ease of use, readability, speed, and static analysis in future data structures such as Vector are similar to those mentioned by Benjamin Morel in the GitHub issue:

Functions like map, filter, reduce can be delegated to other libraries that operate on iterable instead of having these as first-class members of the interface.

Again, I understand the rationale behind this decision, like reducing duplication and keeping only the core functionality in DS. However, sometimes you have to take into consideration ease of use vs purity of the code.

Ease of use / DX / readability: it seems more logical to me to do:

$map->filter(fn(...) => ...);

Rather than:

Some\filter($map, fn(...) => ...);

Speed: as you said, internal iteration is faster. And speed is one of the selling points of DS vs arrays.

Static analysis: I love the fact that Map::filter() can be strictly typed as returning Map<TKey, TValue> in Psalm, for example. If you rely on a generic filter() function, I'm not sure such strict typing will be easy or even possible.

Thank you for your work on DS anyway, I already use the extension in my closed-source project, in particular Map. I would love to use data structures in my open-source projects, one day! 🤞

Additionally, it may be inconvenient for end users (e.g. new contributors to projects) to remember specifics of multiple libraries or utility classes when working on different codebases, to deal with dependency conflicts after major version upgrades, or to deal with libraries dropping support for older php versions, getting abandoned, etc.

Update: php-ds maintainer response clarifications

On September 24, 2021, the maintainer responded after being asked about current plans for php-ds

Hi everyone, I am happy to see this discussion and I thank you all for taking part. My reservation to merge ds into core has always been because I wanted to make sure we get it right before we do that and the intention behind the mythical v2 was to achieve that, based on learnings from v1 and feedback from the community. I have no personal attachment to this project, I only want what is best for PHP and the community.

I would love to see a dedicated, super-lean vec data structure in core that has native iteration and all the other same internal benefits as arrays. In my opinion, the API should be very minimal and potentially compatible with all the non-assoc array functions. An OO interface can easily be designed around that. I'm imagining something similar to Golang's slices.

As for the future of ds itself, I think these can co-exist and ds can remain external. I've been researching and designing immutable data structures over the last 4 years and I still hope to develop a v2 that simplifies the interfaces and introduces immutable structures. Attempting to implement a suite of structures in core or an OO vector would take a lot of work and might be difficult to reach consensus on with the API. I don't think we should attempt to merge ds into core at any time.

I am currently traveling and have not followed this discussion in detail on the mailing list. I'd be happy to assist in any way I can and will catch up as soon as I am home again this week. Feel free to quote this response on the mailing list as well.

I'm still awaiting some clarifications on how they they were willing to assist before updating the remainder of this RFC.

Additionally, there may be differences in design goals, as noted in the above section.

Appendix

Benchmark Source Code

Benchmarking repeated push and shift operations

<?php
 
const PUSH_SHIFT_CYCLES = 2;
 
function bench_array(int $n, int $iterations) {
    $totalReadTime = 0.0;
    $startTime = hrtime(true);
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = [];
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();
            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                // Pretend we don't know the actual position of the first array key for this simulated benchmark
                // array_shift is not used because it is linear time (to move all other elements)
                // rather than constant time.
                //
                // This approach is simple at the cost of memory - it converts a packed array to a non-packed array
                // NOTE: Adding a call to reset() is not necessary in this case, and would result in worse performance.
                // NOTE: array_key_first results in quadratic performance for this synthetic benchmark.
                // $key = array_key_first($values);
                $key = key($values);
                $total += $values[$key];
                unset($values[$key]);
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }
 
        unset($values);
    }
    $endTime = hrtime(true);
 
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from array:               n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
function bench_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Collections\Deque();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();
 
            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }
 
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from Collections\Deque:   n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
function bench_ds_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Ds\Deque();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values[] = $i;
            }
            $maxMemory = memory_get_usage();
 
            $startReadTime = hrtime(true);
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }
 
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from Ds\Deque:            n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
// SplDoublyLinkedList is a linked list that takes more memory than needed.
// Access to values in the middle of the SplDoublyLinkedList is also less efficient.
function bench_spl_dll(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplDoublyLinkedList();
        for ($times = 0; $times < PUSH_SHIFT_CYCLES; $times++) {
            for ($i = 0; $i < $n; $i++) {
                $values->push($i);
            }
            $maxMemory = memory_get_usage();
            $startReadTime = hrtime(true);
            // Random access is linear time in a linked list, so use foreach instead
            while (count($values) > 0) {
                $total += $values->shift();
            }
            $endReadTime = hrtime(true);
            $totalReadTime += $endReadTime - $startReadTime;
        }
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("2x Push then shift from SplDoublyLinkedList: n=%8d iterations=%8d\n=> max memory=%8d bytes, create+destroy time=%.3f shift time = %.3f total time = %.3f result=%d\n",
        $n, $iterations, $maxMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $totalTime, $total);
}
$n = 2**20;
$iterations = 10;
$sizes = [
    [1, 8000000],
    [4, 2000000],
    [8, 1000000],
    [1024, 20000],
    [2**20, 20],
];
echo "Test creating a collection, then two cycles of push+shift (adding n values to the end of collections then shifting all of them from front of collection)\n";
printf(
    "Results for php %s debug=%s with opcache enabled=%s\n\n",
    PHP_VERSION,
    PHP_DEBUG ? 'true' : 'false',
    json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false))
);
 
foreach ($sizes as [$n, $iterations]) {
    bench_array($n, $iterations);
    bench_deque($n, $iterations);
    bench_ds_deque($n, $iterations);
    bench_spl_dll($n, $iterations);
    echo "\n";
}

Benchmarking only appending to a Deque and reading elements without removal

<?php
 
function bench_array(int $n, int $iterations) {
    $totalReadTime = 0.0;
    $startTime = hrtime(true);
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = [];
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
 
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
 
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to array:             n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Collections\Deque();
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }
 
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
 
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to Collections\Deque: n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_ds_deque(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new Ds\Deque();
        for ($i = 0; $i < $n; $i++) {
            $values[] = $i;
        }
 
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
 
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to Ds\Deque:          n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
// SplStack is a subclass of SplDoublyLinkedList, so it is a linked list that takes more memory than needed.
// Access to values in the middle of the SplStack is also less efficient.
function bench_spl_stack(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplStack();
        for ($i = 0; $i < $n; $i++) {
            $values->push($i);
        }
        $startReadTime = hrtime(true);
        // Random access is linear time in a linked list, so use foreach instead
        foreach ($values as $value) {
            $total += $value;
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to SplStack:          n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
function bench_spl_fixed_array(int $n, int $iterations) {
    $startTime = hrtime(true);
    $totalReadTime = 0.0;
    $total = 0;
    for ($j = 0; $j < $iterations; $j++) {
        $startMemory = memory_get_usage();
        $values = new SplFixedArray();
        for ($i = 0; $i < $n; $i++) {
            // Imitate how push() would be implemented in a situation
            // where the number of elements wasn't actually known ahead of time.
            // erealloc() tends to extend the existing array when possible.
            $size = $values->getSize();
            $values->setSize($size + 1);
            $values->offsetSet($size, $i);
        }
        $startReadTime = hrtime(true);
        for ($i = 0; $i < $n; $i++) {
            $total += $values[$i];
        }
        $endReadTime = hrtime(true);
        $totalReadTime += $endReadTime - $startReadTime;
        $endMemory = memory_get_usage();
        unset($values);
    }
    $endTime = hrtime(true);
    $totalTime = ($endTime - $startTime) / 1000000000;
    $totalReadTimeSeconds = $totalReadTime / 1000000000;
    printf("Appending to SplFixedArray:     n=%8d iterations=%8d memory=%8d bytes, create+destroy time=%.3f read time = %.3f result=%d\n\n",
        $n, $iterations, $endMemory - $startMemory, $totalTime - $totalReadTimeSeconds, $totalReadTimeSeconds, $total);
}
$n = 2**20;
$iterations = 10;
$sizes = [
    [1, 8000000],
    [4, 2000000],
    [8, 1000000],
    [2**20, 20],
];
printf(
    "Results for php %s debug=%s with opcache enabled=%s\n\n",
    PHP_VERSION,
    PHP_DEBUG ? 'true' : 'false',
    json_encode(function_exists('opcache_get_status') && (opcache_get_status(false)['opcache_enabled'] ?? false))
);
 
foreach ($sizes as [$n, $iterations]) {
    bench_array($n, $iterations);
    bench_deque($n, $iterations);
    bench_ds_deque($n, $iterations);
    bench_spl_stack($n, $iterations);
    bench_spl_fixed_array($n, $iterations);
    echo "\n";
}

Changelog

  • 0.2: Link to definition of Deque.
  • 0.3: Add php-ds maintainer's response clarifications
  • 0.4: Update benchmarks after https://github.com/php/php-src/pull/7491 memory and time optimizations for packed arrays were merged into PHP 8.2
  • 0.5: Change proposed iteration behavior to iterate over the original Deque rather than a copy, having the Deque account for shift/unshift modifying the position of the start of the array (to avoid iterating over elements too many/few times if the front of the internal representation changes position). Update benchmark result. Update responses from php-ds maintainers.
  • 0.6: Rename to Collections\Deque
  • 0.6.1: Update section for rationale on namespace choice. Link to WebAssembly demo.
  • 0.6.2: Update benchmarks code/results after iteration implementation change in 0.5.
rfc/deque.1643682586.txt.gz · Last modified: 2022/02/01 02:29 by tandre