rfc:vector
Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
rfc:vector [2021/09/16 14:04] – created tandre | rfc:vector [2021/09/26 16:46] (current) – php-ds maintainer response tandre | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: final class Vector ====== | ====== PHP RFC: final class Vector ====== | ||
- | * Version: 0.1 | + | * Version: 0.2 |
* Date: 2021-09-16 | * Date: 2021-09-16 | ||
* Author: Tyson Andre, tandre@php.net | * Author: Tyson Andre, tandre@php.net | ||
- | * Status: | + | * Status: |
* Implementation: | * Implementation: | ||
* First Published at: http:// | * First Published at: http:// | ||
Line 10: | Line 10: | ||
PHP's native '' | PHP's native '' | ||
- | In order to support both use cases, additional memory is needed to track keys (around twice as much as is needed to just store the values, for non-reference counted values) (https:// | + | In order to support both use cases, additional memory is needed to track keys ([[https:// |
- | It would be useful to have a variable-length container in the standard library | + | |
+ | |||
+ | It would be useful to have an efficient | ||
+ | |||
+ | - To save memory in applications or libraries that may need to store many lists of values | ||
+ | - To provide | ||
+ | - To give users the option of stronger runtime guarantees that property, parameter, or return values really contain a list of values without gaps, that array modifications don't introduce gaps or invalid keys, that values in the collection aren't references, etc. | ||
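The guarantees listed above can be contrasted with what a plain ''array'' allows. The following is a minimal sketch using only existing built-in functions (''array_is_list()'' requires PHP 8.1+); the variable names are illustrative and not part of this RFC.

<code php>
<?php
// A plain array can silently stop being a list.
$values = ['a', 'b', 'c'];
unset($values[1]);                   // remaining keys are 0 and 2: there is a gap

$isList    = array_is_list($values); // false, because the keys are no longer 0..n-1
$reindexed = array_values($values);  // ['a', 'c'], re-keyed as 0 and 1

// An array element can also be a reference to another variable.
$x = 1;
$withRef = [&$x];
$x = 2;                              // $withRef[0] now reads 2 as well
</code>

Code that receives an ''array'' has to check or normalize these cases itself if it relies on list semantics.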
===== Proposal ===== | ===== Proposal ===== | ||
- | This proposes to add the class '' | + | This proposes to add the class '' |
Similarly to vectors in other languages, this is backed by a memory-efficient representation (raw C array of values with a size and capacity) and provides constant amortized-time push/pop operations. | Similarly to vectors in other languages, this is backed by a memory-efficient representation (raw C array of values with a size and capacity) and provides constant amortized-time push/pop operations. | ||
+ | |||
+ | Similarly to '' | ||
<code php> | <code php> | ||
Line 72: | Line 81: | ||
} | } | ||
</code> | </code> |
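The "size and capacity" representation mentioned in the proposal can be sketched in userland to show why push/pop are amortized constant time. This is only an illustration under stated assumptions: ''GrowableList'' is a hypothetical class name, it is backed by the existing ''SplFixedArray'' rather than a raw C array, and the doubling growth factor is an assumption, not necessarily what the implementation uses.

<code php>
<?php
// Hypothetical userland class for illustration only; not the proposed Vector.
final class GrowableList
{
    private SplFixedArray $storage; // fixed-capacity backing store
    private int $size = 0;          // number of slots actually in use

    public function __construct(int $initialCapacity = 4)
    {
        $this->storage = new SplFixedArray(max(1, $initialCapacity));
    }

    public function push(mixed $value): void
    {
        $capacity = $this->storage->getSize();
        if ($this->size === $capacity) {
            // Doubling the capacity keeps the total copying cost proportional
            // to the number of pushes, so each push is amortized O(1).
            $this->storage->setSize($capacity * 2);
        }
        $this->storage[$this->size++] = $value;
    }

    public function pop(): mixed
    {
        if ($this->size === 0) {
            throw new UnderflowException('Cannot pop from an empty list');
        }
        $value = $this->storage[--$this->size];
        $this->storage[$this->size] = null; // release the stored value
        return $value;
    }

    public function count(): int
    {
        return $this->size;
    }

    public function capacity(): int
    {
        return $this->storage->getSize();
    }
}

$list = new GrowableList(2);
foreach ([10, 20, 30] as $v) {
    $list->push($v); // the third push grows the capacity from 2 to 4
}
$last = $list->pop();
</code>

A native implementation avoids the per-call overhead of this userland wrapper, but the bookkeeping (a size, a capacity, and occasional reallocation) follows the same idea.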
+ | |||
+ | ===== Implementation Choices ===== | ||
+ | |||
+ | ==== Global Namespace ==== | ||
+ | |||
+ | This maintains consistency with the namespace used for general-purpose collections already in the SPL (as well as relatively recent additions such as '' | ||
+ | |||
+ | ==== Lack of Name Prefix ==== | ||
+ | |||
+ | - Short names are more convenient to remember/ | ||
+ | - Possible future additions such as a Deque/Queue based on an efficient C array representation rather than a linked list would conflict with existing Spl names such as '' |
+ | - There is already an addition to the spl without a prefix - '' | ||
+ | |||
+ | ==== Accepting an iterable ==== | ||
+ | |||
+ | This is similar to the way the existing classes '' | ||
+ | |||
+ | End users may be surprised if integer keys are not the same as the ones passed in by default (e.g. if keys were unset or inserted out of order), which is why '' | ||
+ | |||
+ | Unlike '' | ||
+ | |||
+ | '' | ||
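To illustrate why key handling matters for any container built from an iterable, the sketch below uses only the existing ''iterator_to_array()'' function. It does not show the behavior of the proposed class; ''pairs()'' is a hypothetical generator written for this example.

<code php>
<?php
// A Traversable may yield duplicate keys, non-integer keys, or keys with gaps.
function pairs(): Generator
{
    yield 'a' => 1;
    yield 'a' => 2; // duplicate key
    yield 5 => 3;   // integer key that leaves a gap
}

// Preserving keys: the later value overwrites the earlier one for key 'a'.
$keysKept = iterator_to_array(pairs(), true);     // ['a' => 2, 5 => 3]

// Discarding keys: every value is kept, in iteration order.
$keysDropped = iterator_to_array(pairs(), false); // [1, 2, 3]
</code>

Whichever default a list-like container chooses, one of these two outcomes may surprise some callers, so the choice needs to be explicit.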
+ | |||
+ | ==== Final Class ==== | ||
+ | |||
+ | If this class were extensible, it would have the following drawbacks: |
+ | |||
+ | - It would not provide as strong a guarantee to readers of code (or even opcache, if optimizations were added targeting opcache) that elements were actually a vector or that certain methods would/ |
+ | - It would require more memory and runtime checks to determine whether an instance was the original class or a subclass. |
+ | - [[https:// | ||
+ | |||
+ | ==== push/pop ==== | ||
+ | |||
+ | This is consistent with the name used for '' | ||
+ | |||
+ | Other names were chosen to be consistent with existing functionality in '' |
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | The class name '' | + | The class name '' |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
Line 88: | Line 133: | ||
===== Benchmarks ===== | ===== Benchmarks ===== | ||
- | This is a contrived benchmark for estimating the performance of building/ | + | This is a contrived benchmark for estimating the performance of building/ |
Read time is counted separately from create+destroy time. This is a total over all iterations, and the instrumentation adds to the time needed. | Read time is counted separately from create+destroy time. This is a total over all iterations, and the instrumentation adds to the time needed. | ||
Line 264: | Line 309: | ||
===== Future Scope ===== | ===== Future Scope ===== | ||
- | In the future, additional methods may be added to '' | + | If '' |
+ | |||
+ | Additional data structures from https:// | ||
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
Line 273: | Line 320: | ||
- https:// | - https:// | ||
+ | - https:// | ||
+ | - https:// | ||
===== Rejected Features ===== | ===== Rejected Features ===== | ||
+ | |||
+ | ==== Why not use php-ds/ | ||
+ | |||
+ | - No matter how useful or popular a PECL is, datastructures available in PHP's core will have much, much wider adoption in applications and libraries than those that are only available in PECLs, allowing the authors of those applications and libraries to write faster and/or more memory efficient code. |
+ | - End users can make much stronger assumptions about the backwards compatibility and long-term availability of data structures that are included in core. | ||
+ | - The php-ds maintainers do not plan to merge the extension into php-src, and believe php-ds should coexist with new functionality being added in a separate namespace instead (see quote and [[## | ||
+ | - Opcache may be able to make stronger optimizations of internal classes found in php-src than any third party PECL. (e.g. because '' | ||
+ | |||
+ | === Perceived issues and uncertainties about php-ds distribution plans === | ||
+ | |||
+ | This has been asked about multiple times in threads on unrelated proposals (https:// | ||
+ | but the maintainer of php-ds had a long term goal of developing it separately from php's release cycle (and was still focusing on the PECL when I'd asked on the GitHub issue in the link in September 2020). |
+ | |||
+ | To quote the maintainer on the GitHub [[https:// | ||
+ | |||
+ | < | ||
+ | **//My long-term intention has been to not merge this extension into php-src.// I would like to see it become available as a default extension at the distribution level. Unfortunately I have no influence or understanding of that process.** Having an independent release and development cycle is a good thing, in my opinion. | ||
+ | |||
+ | If those plans change, **I would like to hold off until a 2.0 release** - I've learnt a lot over the last 4 years and would like to revisit some of the design decisions I made then, such as a significant reduction of the interfaces or perhaps more interfaces with greater specificity. Functions like '' | ||
+ | |||
+ | I have been working on a research project to design persistent data structures for immutability, | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | > > Do you mean OS distribution level (Windows, Ubuntu, CentOS, HomeBrew for mac, etc.?) | ||
+ | |||
+ | > He meant distribution with PHP core (on all platforms where PHP is available) | ||
+ | |||
+ | Whichever is more viable - simply not merged into core, but distributed and enabled by default alongside it. |
+ | </ | ||
+ | |||
+ | There have been no proposals from the maintainer themselves so far to add php-ds to core or distribute it alongside core in any form. | ||
+ | That was just what the maintainer mentioned as a long term plan. | ||
+ | |||
+ | Distributing an extension separately from core has never been done before and, even if approved, would raise multiple concerns: |
+ | |||
+ | * I personally doubt having it developed separately from php's release cycle would be accepted by voters (e.g. if unpopular decisions couldn' | ||
+ | * This may limit what features could be added by the community: For example, introducing the '' | ||
+ | * I'm not certain how backwards compatibility would be handled in that model, e.g. if the maintainers of ext-ds wanted to drop support for a method after it was released. | ||
+ | * This may cause delays in publishing php releases, e.g. if the maintainers were unable to quickly review patches for crashes, incompatibilities or compile errors introduced in new php versions, etc. | ||
+ | * and other concerns (e.g. API debates such as https:// | ||
+ | |||
+ | Since it seems unlikely to me that php-ds itself will be merged anytime soon (if the maintainers continue to plan to distribute php-ds that way), I decided to start independently working on efficient data structure implementations. |
+ | I don't see dragging it in (against the maintainer' | ||
+ | But having efficient datastructures in PHP's core is still useful. | ||
+ | |||
+ | The timeline for php-ds 2.0 is also something I am uncertain about. | ||
+ | |||
+ | < | ||
+ | |||
+ | * //EDIT: I misread the maintainer' | ||
+ | |||
+ | While developing a PECL outside of php has benefits, such as the ability to make new features available in older php releases, |
+ | it's less likely that application and |
+ | library authors will start making use of those data structures, because many users won't have any given PECL already installed. |
+ | (Although php-ds also publishes a polyfill, the polyfill would not provide the CPU and memory savings, and would add its own overhead.) |
+ | |||
+ | Additionally, | ||
+ | backwards compatibility and long-term availability of functionality that is merged into PHP's core. | ||
+ | |||
+ | So the choice of feature set, some names, signatures, and internal implementation details are different, because this is reimplementing a common datastructure found in different forms in many languages. | ||
+ | php-ds is definitely a mature project, but I personally feel that reimplementing this (without referring to the php-ds source code and without copying the entire api as-is) is the best choice to add efficient data structures to core while respecting the maintainer' |
+ | |||
+ | As a result, I've been working on implementing data structures such as '' | ||
+ | |||
+ | === Minor differences in API design goals === | ||
+ | |||
+ | Traditionally, | ||
+ | |||
+ | My hopes for ease of use, readability, | ||
+ | |||
+ | < | ||
+ | < | ||
+ | |||
+ | Again, I understand the rationale behind this decision, like reducing duplication and keeping only the core functionality in DS. However, sometimes you have to take into consideration ease of use vs purity of the code. | ||
+ | |||
+ | Ease of use / DX / readability: | ||
+ | |||
+ | '' | ||
+ | |||
+ | Rather than: | ||
+ | |||
+ | '' | ||
+ | |||
+ | Speed: as you said, internal iteration is faster. And speed is one of the selling points of DS vs arrays. | ||
+ | |||
+ | Static analysis: I love the fact that '' | ||
+ | |||
+ | Thank you for your work on DS anyway, I already use the extension in my closed-source project, in particular Map. I would love to use data structures in my open-source projects, one day! 🤞 | ||
+ | </ | ||
+ | |||
+ | Additionally, | ||
+ | |||
+ | ==== Update: php-ds maintainer response clarifications ==== | ||
+ | |||
+ | On September 24, 2021, [[https:// | ||
+ | |||
+ | < | ||
+ | Hi everyone, I am happy to see this discussion and I thank you all for taking part. My reservation to merge ds into core has always been because I wanted to make sure we get it right before we do that and the intention behind the mythical v2 was to achieve that, based on learnings from v1 and feedback from the community. I have no personal attachment to this project, I only want what is best for PHP and the community. | ||
+ | |||
+ | I would love to see a dedicated, super-lean vec data structure in core that has native iteration and all the other same internal benefits as arrays. In my opinion, the API should be very minimal and potentially compatible with all the non-assoc array functions. An OO interface can easily be designed around that. I'm imagining something similar to Golang' | ||
+ | |||
+ | **As for the future of ds itself, I think these can co-exist and ds can remain external. I've been researching and designing immutable data structures over the last 4 years and I still hope to develop a v2 that simplifies the interfaces and introduces immutable structures. Attempting to implement a suite of structures in core or an OO vector would take a lot of work and might be difficult to reach consensus on with the API. I don't think we should attempt to merge ds into core at any time.** | ||
+ | |||
+ | I am currently traveling and have not followed this discussion in detail on the mailing list. I'd be happy to assist in any way I can and will catch up as soon as I am home again this week. Feel free to quote this response on the mailing list as well. | ||
+ | </ | ||
+ | |||
+ | I'm still awaiting some clarifications on how they were willing to assist before updating the remainder of this RFC. |
+ | |||
+ | Additionally, | ||
==== Adding a native type instead (is_vec) ==== | ==== Adding a native type instead (is_vec) ==== | ||
+ | |||
+ | https:// | ||
< | < | ||
Line 295: | Line 456: | ||
See https:// | See https:// | ||
- | That would also require a lot more familiarity than I have with opcache and the JIT assembly compiler, and (I expect it would) be more controversial due to not working with existing code. | + | That would also require a lot more familiarity than I have with opcache and the JIT assembly compiler, and I expect it would be more controversial due to not working with existing code. |
- | For a language such as Hack where feature development is done by one company (Facebook), | + | For a language such as Hack where feature development is done by one company (Facebook), |
Additionally, | Additionally, | ||
- | Also, even if that were done, vec and array would be distinct types - a vec couldn' | + | Also, even if a type '' |
+ | |||
+ | ==== Changelog ==== | ||
+ | |||
+ | 0.2: Add php-ds maintainer response, improve documentation, | ||
rfc/vector.1631801083.txt.gz · Last modified: 2021/09/16 14:04 by tandre