rfc:deque_straw_poll

Straw poll: Naming pattern to use for Deque

Introduction

Because the naming choice for new datastructures is a question that has been asked about many times, this straw poll has been created to gather feedback on the naming pattern to use for future additions of datastructures to PHP, with the arguments for and against the naming pattern.

https://wiki.php.net/rfc/namespaces_in_bundled_extensions recently passed. It permits using the same namespace that is already used in an extension, but offers guidance in choosing namespace names and allows for using namespaces in new categories of functionality.

The options in this straw poll are:

  • \Collections\Deque - seems like a reasonable choice of name for collections (Deque, and possible future additions such as Vector, Set, Map, Sorted Sets/Maps, etc. https://wiki.php.net/rfc/namespaces_in_bundled_extensions also allows using sub-namespaces and that may be used for things that aren't strictly collections, e.g. Collections\Builder\SomethingBuilder or functions operating on collections.
  • \SplDeque, similar to datastructures added to the Spl in PHP 5.3.
    (I don't prefer that name because SplDoublyLinkedList, SplStack, and SplQueue are subclasses of a doubly linked list with poor performance (accessing the offset in the middle of a linked list requires traversing half the linked list, for example), and this name would easily get confused with them. Also, historically, none of the functionality with that naming pattern has been final.
    However, good documentation (e.g. suggesting *Deque instead where possible in the manual) would make that less of an issue.)

    See https://wiki.php.net/rfc/deque#lack_of_name_prefix (and arguments for https://externals.io/message/116100#116111)

While there is considerable division in whether or not members of internals want to adopt namespaces, I hope that the final outcome of the poll will be accepted by members of internals as what the representative of the majority of the members of internals (from diverse backgrounds such as contributors/leaders of userland applications/frameworks/composer libraries written in PHP, documentation contributors, PECL authors, php-src maintainers, etc. (all of which I expect are also end users of php)) want to use as a naming choice in future datastructure additions to PHP. (and I hope there is a clear majority)

What is the Deque RFC?

The Deque RFC would add final class *Deque to PHP. From the Wikipedia article for the Double-Ended Queue:

In computer science, a double-ended queue (abbreviated to deque, pronounced deck, like “cheque”) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail).

(omitted sections, see article)

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

  • Storing deque contents in a circular buffer, and only resizing when the buffer becomes full. This decreases the frequency of resizings.
  • (omitted, see article)

Similarly to deques in other languages (such as Python collections.deque, C++ std::deque, Rust std::collections::VecDeque, etc.) this is backed by a memory-efficient representation with good cache locality and provides constant amortized-time push/pop/shift/unshift operations. (this RFC's implementation is a circular buffer represented as a raw C array of values with a size, offset, and capacity)

What is the planned future scope?

The planned future scope is to add new more efficient datastructures such as Deque, Vector, Set, Map, SortedSet/SortedMap, and potentially immutable data structures to php. Those would differ from existing classes in the spl in several ways:

  • Be final classes (or have final methods when extensibility is needed for core functionality) and avoid unpredictable behaviors. Extensibility has been a cause of performance issues and unexpected bugs historically, and makes code harder to reason about.
    For example, Deque would be memory and time efficient compared to SplDoublyLinkedList
  • Forbid dynamic properties.

Discussion

Arguments for/against Deque

The global namespace was my preference because it was short, making it easy to remember and convenient to use. The majority of existing internal classes use the global namespace.

https://wiki.php.net/rfc/namespaces_in_bundled_extensions recently passed. It permits using the same namespace that is already used in an extension (in this case, the global namespace), but offers guidance in choosing namespace names and allows for using namespaces in new categories of functionality.

My reason for proposing Deque in the global namespace is:

This maintains consistency with the namespace used for general-purpose collections already in the SPL (as well as relatively recent additions such as WeakReference (PHP 7.4) and WeakMap (PHP 8.0)). Other recent additions to PHP such as ReflectionIntersectionType in PHP 8.1 have also continued to use the global namespace when adding classes with functionality related to other classes.

Additionally, prior polls for namespacing choices of other datastructure functionality showed preferences for namespacing and not namespacing were evenly split in a straw poll for a new iterable type.

Introducing a new namespace for data structures would also raise the question of whether existing datastructures should be moved to that new namespace (for consistency), and that process of moving namespaces would (if started):

  1. Raise the amount of work needed for end users or library/framework/application authors to migrate to new PHP versions.
  2. Cause confusion and inconvenience for years about which namespace can or should be used in an application (SplObjectStorage vs Xyz\SplObjectStorage), especially for developers working on projects supporting different php version ranges.
  3. Prevent applications/libraries from easily supporting as wide of a range of php versions.
  4. Cause serialization/unserialization issues when migrating to different php versions, if the old or new class name in the serialized data did not exist in the other php version and was not aliased. For example, if the older PHP version could not unserialize() Xyz\SplObjectStorage and would silently create a __PHP_Incomplete_Class_Name without any warnings or notices.

https://externals.io/message/116100#116111

The choice of global namespace maintains consistency with the namespace used for general-purpose collections already in the SPL

I find this argument unconvincing. If the intention is for this to fit with existing classes in the SPL, it should be called “SplDeque”, or more consistently “SplDoubleEndedQueue”, and the RFC should talk about how the design aligns with those existing classes.

If it is intended to be the first of a new set of data structures which are not aligned with the existing SPL types, then putting it in a new namespace would make most sense.

In the RFC and the list you've mentioned a few comparisons, but I don't think any of them hold:

  • ArrayObject, WeakReference, and WeakMap are all classes for binding to specific engine behaviour, not generic data structures
  • Iterators all have an “Iterator” suffix (leading to some quite awkward names)
  • Reflection classes all have a “Reflection” prefix
  • Having both “Queue” and “SplQueue”, or both “Stack” and “SplStack” would be a terrible idea, and is a pretty strong argument not to add data structures with such plain names

Arguments for/against Collections\Deque

This seems like a reasonable choice of name for collections (Deque, and possible future additions such as Vector, Set, Map, Sorted Sets/Maps, etc. https://wiki.php.net/rfc/namespaces_in_bundled_extensions also allows using sub-namespaces and that may be used for things that aren't strictly collections, e.g. Collections\Builder\SomethingBuilder.

Levi Morrison mentions in https://externals.io/message/116100#116119

Given that I've suggested the most common options for these across many languages, it (Collections\) shouldn't be very controversial. The worst bit seems like picking the namespace “Collections” as this will break at least one package: https://github.com/danielgsims/php-collections. We should suggest that they vendor it anyway, as “collections” is common e.g. “Ramsey\Collections”, “Doctrine\Common\Collections”. I don't see a good alternative here to “Collections”, as we haven't agreed on very much on past namespace proposals.

(note that https://github.com/danielgsims/php-collections does not declare Collections\Deque anywhere: https://github.com/danielgsims/php-collections/tree/2.X/src)

Why plural?

This proposed namespace is plural because:

Arguments for/against SplDeque

This would follow the naming convention used by other collections already existing in the SPL. This is included in the straw poll just in case the lack of this option is mentioned as a last-minute objection to the Deque RFC itself.

I believe the name SplDeque would be a source of frustration and performance issues for users, though. a SplDeque would be confused way too easily with SplDoublyLinkedList and it subclasses SplQueue/SplStack - switching from a SplDeque to a SplStack (linked list) would suddenly make operations such as ArrayAccess much slower and use more memory due to needing to traverse a linked list. See https://wiki.php.net/rfc/deque#benchmarks

  • Also, people may not realize there's a difference in performance that would be a reason to migrate existing Spl data structures to *Deque or to adopt *Deque in the first place.

Vote

Voting started on 2022-01-12 and ends on 2022-01-26.

This vote will influence the name choice for the Deque RFC

This is a ranked-choice poll (following STV) between the naming alternatives.

With STV you SHOULD rank all the choices in order (but are not required to). Don't pick the same option more than once, as that invalidates your vote.

First choice:

Straw poll: Favorite choice of naming pattern
Real name ''Deque'' ''Collections\Deque'' ''SplDeque''
ashnazg (ashnazg)   
crell (crell)   
derick (derick)   
dharman (dharman)   
kguest (kguest)   
kocsismate (kocsismate)   
levim (levim)   
marandall (marandall)   
mauricio (mauricio)   
mcmic (mcmic)   
ocramius (ocramius)   
santiagolizardo (santiagolizardo)   
sergey (sergey)   
svpernova09 (svpernova09)   
tandre (tandre)   
theodorejb (theodorejb)   
twosee (twosee)   
Count: 2 15 0

Second choice:

Straw poll: Second favorite choice of naming pattern
Real name ''Deque'' ''Collections\Deque'' ''SplDeque''
ashnazg (ashnazg)   
crell (crell)   
derick (derick)   
dharman (dharman)   
kguest (kguest)   
levim (levim)   
mauricio (mauricio)   
mcmic (mcmic)   
ocramius (ocramius)   
santiagolizardo (santiagolizardo)   
sergey (sergey)   
svpernova09 (svpernova09)   
tandre (tandre)   
theodorejb (theodorejb)   
twosee (twosee)   
Count: 12 2 1

Third choice:

Straw poll: Third favorite choice of naming pattern
Real name ''Deque'' ''Collections\Deque'' ''SplDeque''
ashnazg (ashnazg)   
derick (derick)   
dharman (dharman)   
kguest (kguest)   
levim (levim)   
mauricio (mauricio)   
mcmic (mcmic)   
ocramius (ocramius)   
santiagolizardo (santiagolizardo)   
sergey (sergey)   
svpernova09 (svpernova09)   
theodorejb (theodorejb)   
twosee (twosee)   
Count: 1 0 12

Rejected options

Several suggestions that have been brought up in the past are forbidden by the accepted policy RFC (https://wiki.php.net/rfc/namespaces_in_bundled_extensions) and can't be used in an RFC.

  • Spl\, Core\, and Standard\ are forbidden: “Because these extensions combine a lot of unrelated or only tangentially related functionality, symbols should not be namespaced under the Core, Standard or Spl namespaces.
    Instead, these extensions should be considered as a collection of different components, and should be namespaced according to these.”
  • Namespace names should generally follow CamelCase.

Some other options were also not considered:

References

  1. https://externals.io/message/116100 Adding `final class Deque` to PHP
  2. https://externals.io/message/116112 “(Planned) Straw poll: Naming pattern for *Deque

the plural form is proposed because this might grow long-term to contain not just collections, but also functionality related to collections in the future(e.g. helper classes for building classes (e.g. ImmutableSequenceBuilder for building an ImmutableSequence), global functions, traits/interfaces, collections of static methods, etc.
(especially since https://wiki.php.net/rfc/namespaces_in_bundled_extensions prevents more than one level of namespaces)

Changelog

0.2: Fix description of Collections\Deque naming choice in the introduction. 0.3: Change introduction sections

rfc/deque_straw_poll.txt · Last modified: 2022/01/12 13:52 by tandre