rfc:array_part
no way to compare when less than two revisions

Differences

This shows you the differences between two versions of the page.


Previous revision
Next revision
rfc:array_part [2012/05/28 15:19] – [Specification for the function] cataphract
Line 1: Line 1:
 +====== Request for Comments: array_part ======
 +  * Version: 1.1
 +  * Date: 2012-05-14
 +  * Author: Gustavo Lopes <cataphract@php.net>
 +  * Status: Under Discussion
 +  * First Published at: http://wiki.php.net/rfc/array_part
 +
 +
 +Introduces a new function ''array_part()'' in ext/standard.
 +
 +===== Introduction =====
 +
 +This RFC proposes a new array function that can extract multidimensional slices for arrays.
 +
 +===== Specification for the function =====
 +
 +The function ''array_part()'' shall have the following signature:
 +
 +  array array_part(array $originalArray, array $partSpecification[, bool $indexesAreKeys = false])
 +
 +The parameter ''$originalArray'' represents the original array from which a part is to be extracted.
 +
 +The parameter ''$partSpecification'' is a sequentially indexed numeric array that specifies the slice to extract at each "level" of the original array. A level //n// (with //n// >= 0) of an array ''$arr'' is the set of array elements than can be fetched by "dereferencing" the array //n// times. For instance, the array ''%%[[1],[['a','b']]]%%'' has at level 0 itself, at level 1 the elements ''%%[1]%%'' and ''%%[['a','b']]%%'' and at level 2 the elements ''1'' and ''%%['a','b']%%''. ''%%$partSpecification[0]%%'' shall give the part specification for level 1, and, in, general, ''%%$partSpecification[m]%%'' for level //m + 1//.
 +
 +Each part specification shall have one of the following forms:
 +
 +  * A non-empty sequential numeric array of indexes, which specifies that only the elements existing at those indexes will be kept.
 +  * A single index, which specifies that only the element existing at that index will be kept. In this case the level will be collapsed onto the previous one, meaning all the arrays at that level will be replaced with its element the specified index.
 +  * A span part specification is an associative array. The following keys are allowed //start//, //end// and //step//. At least //start// or //end// must be specified. //start// and //end// are an index or a special value -- ''null''. //step// is a non-zero integer, defaulting to ''1''. If ''start'' or ''end'' are not specified, they default to ''null'', which refer to either the first or last element depending on the sign of //step//. If //step// is +-1 and //start// and //end// are both ''null'' (or, if ''$indexesAreKeys'' is ''false'', //start// and //end// are //0// and //-1// or vice-versa depending on the sign of //step//), then we say the span encompasses all elements on that level (possibly 0).
 +
 +Span parts extract any number of elements (possibly 0) starting at the index specified by //start// and advancing until //end// is reached, advancing in steps of //step//. The element at the index specified at //end// is included.
 +
 +
 +An index //i// has the following meaning:
 +  * If ''$indexesAreKeys'' is ''false'' and //i// is a non-negative integer (possibly after casting) then an element is at index //i// if it //i// is the the number of times one would have to call ''next()'' after calling ''reset()'' for ''current()'' to return that element.
 +  * If ''$indexesAreKeys'' is ''false'' and //i// is a negative integer, then an element is at index //i// if //-i-1// is the number of times one would have to call ''prev()'' after calling ''end()'' for ''current()'' to return that element.
 +  * If ''$indexesAreKeys'' is ''true'' then an element is at //i// if its array key is //i//.
 +
 +The keys in the original array are not preserved in the output, expect at levels that were not visited. References are not preserved when the elements are added to the resulting array.
 +
 +This function shall return ''false'' upon finding error conditions. The following are error conditions:
 +
 +  * Giving arguments with different PHP types from those specified.
 +  * Giving malformed part specifications (where the individual elements do not follow the syntactic rules given here).
 +  * Specifying an part index than does not exist in at least one of the elements at the level the part refers to: (e.g. ''%%array_part([[1],[1,2]],   [['start'=>0, 'end'=>-1], 1])%%'' is an error condition, because while ''1'' is a valid index for ''%%[1,2]%%'', it is not so for ''%%[[1]]%%''). However, span part specifications that comprise all elements are always accepted.
 +  * Giving a part specification with levels that do not exist in the input. This does not apply if, as a result of a span part that comprises all the elements of that level and at the level before existed only empty arrays, no elements were left for the next level. For instance '%%array_part([], [['start'=>0, 'end'=>-1]])%%' is valid, as is '%%array_part([[]], [0, ['start'=>0, 'end'=>-1], 1, 2, 3])%%', which will return ''%%[[]]%%''.
 +
 +===== Proposed implementation =====
 +
 +The proposed implementation is available as [[https://github.com/cataphract/php-src/tree/array_part|git branch on github]]. I'll update the branch it as I improve it. This is not a prototype. If this proposal is accepted, this implementation will be merged.
 +
 +===== Sample PHP implementation and usage =====
 +
 +A sample implementation, with tests exemplifying the use of this function, [[https://gist.github.com/2660601|is available]].
 +
 +This implementation differs in some respects to the internal implementation with respect to behavior. Its purpose is only to exemplify the usage of the function here proposed.
 +
 +[[https://gist.github.com/2660601#file_test.php|Direct link to the tests]]
 +
 +===== Comments =====
 +I would find this far more useful if the keys were preserved, or at least an option to do so.
 +Or the old standby that INT keys get renumbered but non-INT do not.
 +
 +
 +===== Objections =====
 +
 +**I want native array slicing through a new operator**
 +
 +That's fine, but it is not what this proposal is about. A change to the language presents many more complications which I very much want to avoid -- for instance the current array dereferencing syntax most likely would unusable because -1 represents the element with key -1, not the last element. Besides, the introduction of this function does not prevent native array slicing from being added in the future.
 +
 +**You seem to already have an implementation, why are you pushing this to us?**
 +
 +First, this function would benefit greatly from a native implementation. It's impossible to efficiently detect recursion in userland. The sample makes heavy use of references, which introduces a lot of separations. Traversing arrays without changing the internal pointer is very inefficient in userland.
 +
 +Of course, the main reason is that this function is useful. Recently, functions like ''array_column()'', ''array_first()'' and ''array_last()'' have been proposed. This function satisfies all those needs: ''%%array_part($arr, [['start'=>null], 'column'], true)%%'', ''array_part($arr, 0)'' and ''array_part($arr, -1)''.
 +
 +**The $indexesAreKeys = false mode is inefficient because there's no constant time access to the n-th element**
 +
 +This is true. We must traverse the array from the start or the end to get to the n-th element. That's just the way PHP arrays are implemented. However, if you have numeric sequential arrays, you can use ''$indexesAreKeys = true'' to access the n-th element without this penalty (also note that the sample implementation is **not** optimized).
 +
 +===== Vote =====
 +
 +<doodle 
 +title="Should the current array_part() implementation be merged" auth="cataphract" voteType="single" closed="False">
 +   * Yes
 +   * No
 +</doodle>
 +
 +===== Changelog =====
 +
 +  * 2012-05-14 Initial version
 +  * 2012-05-21 Dropped recursion restriction, added note on how references are preserved, added link to native implementation
 +  * 2012-05-21 References are not preserved after all
 +  * 2012-05-28 Vote opened
  
rfc/array_part.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1