rfc:negative-string-offsets
Differences
This shows you the differences between two versions of the page.
rfc:negative-string-offsets [2016/01/23 15:12] – Initial release francois | rfc:negative-string-offsets [2017/09/22 13:28] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Generalize support of negative string offsets ====== | ====== PHP RFC: Generalize support of negative string offsets ====== | ||
- | * Version: 1.0 | + | * Version: 1.3 |
- | * Date: 2016-01-23 | + | * Date: 2016-02-18 |
* Author: François Laupretre < | * Author: François Laupretre < | ||
- | * Status: | + | * Status: |
* First Published at: http:// | * First Published at: http:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | In most PHP functions, providing a negative value as string offset means 'count n positions from the end of the string' | + | In most PHP functions, providing a negative value as string offset means '//n// positions |
- | This mechanism is widely used but, unfortunately, | + | This mechanism is widely used but, unfortunately, |
- | So, developers | + | So, as PHP developers |
- | If it does not, they need to insert a call to substr(), making their code less readable and slower. | + | they regularly need to refer to the documentation. |
+ | If it does not, they need to insert a call to the substr() | ||
An obvious example is strrpos() accepting negative offsets, while strpos() does not. | An obvious example is strrpos() accepting negative offsets, while strpos() does not. | ||
Line 19: | Line 20: | ||
This RFC proposes to generalize support for negative string offsets everywhere it makes sense. | This RFC proposes to generalize support for negative string offsets everywhere it makes sense. | ||
- | In the same spirit, it also adds support for negative lengths where it is missing. | ||
- | The main objective | + | In accordance with the existing behavior, a negative offset |
- | ====== Feature requests solved by this RFC ====== | + | In the same spirit, the RFC also adds support for negative length arguments where it makes sense. |
+ | A negative length (-//x//) means 'up to the position //x// counted backwards from the end of the string' | ||
- | * [[https://bugs.php.net/bug.php? | + | The reference behavior for negative offset and length is the [[http://php.net/manual/en/function.substr.php|substr()]] function. |
- | * [[https://bugs.php.net/bug.php?id=36524|36524]] | + | |
- | ====== Detailed changes ====== | + | The main objective of this RFC is to improve the overall consistency of the language. |
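The negative offset and length rules described above can be checked directly against substr(), which the RFC names as the reference behavior. The following is a minimal sketch; the sample string and variable names are illustrative and do not come from the RFC.

```php
<?php
// Reference semantics taken from substr(); the sample string is illustrative.
$s = 'abcdef';

// Negative offset: start 2 characters before the end of the string.
$tail = substr($s, -2);        // "ef"

// Negative length: stop 2 characters before the end of the string.
$middle = substr($s, 1, -2);   // "bcd"

// Both negative: start 4 from the end, stop 1 from the end.
$inner = substr($s, -4, -1);   // "cde"

var_dump($tail, $middle, $inner);
```

The functions touched by this RFC are meant to interpret their offset and length arguments in the same way.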
- | * Negative string offsets in read mode ($xx = $str[-2] or $str{-2}); | + | ==== Feature requests solved by this RFC ==== |
- | * Negative string offsets in assignments ($str{-2} | + | * https:// |
+ | * https:// | ||
- | * Negative string offsets in isset()/ | + | ===== Detailed changes ===== |
- | * strpos() : negative offset | + | ==== In the language ==== |
- | * stripos() : negative | + | String access to individual characters using a ' |
- | * substr_count(): negative ' | + | Examples |
- | * grapheme_strpos() : negative offset | + | <code php> |
+ | $str=' | ||
+ | var_dump($str[-2]); // => string(1) " | ||
- | * grapheme_stripos(), grapheme_extract() : negative offset | + | $str{-3}=' |
+ | var_dump($str); // => string(6) " | ||
- | * grapheme_extract() : negative offset | + | var_dump(isset($str{-4})); |
- | * iconv_strpos() : negative offset | + | var_dump(isset($str{-10})); |
+ | </ | ||
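As a complement to the example above, here is a small self-contained sketch of the three language-level cases (read, assignment, isset()). It assumes PHP 7.1 or later and uses an illustrative string that is not taken from the RFC. It uses the square-bracket syntax only, since the curly-brace form was removed in PHP 8.0.

```php
<?php
// Assumes PHP 7.1 or later. The string and variable names are illustrative.
$str = 'abcdef';

// Read: -2 is the second character from the end.
$read = $str[-2];               // "e"

// Write: replace the third character from the end.
$str[-3] = 'X';                 // $str is now "abcXef"

// isset(): true only while the offset stays inside the string.
$inside  = isset($str[-6]);     // true: this is the first character
$outside = isset($str[-7]);     // false: before the start of the string

var_dump($read, $str, $inside, $outside);
```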
- | * file_get_contents(): | + | ==== In built-in functions ==== |
- | * mb_strimwidth() : negative | + | ^ Function ^ Add support for ^ |
+ | | strpos | Negative offset | | ||
+ | | stripos | Negative offset | | ||
+ | | substr_count| Negative ' | ||
+ | | grapheme_strpos | Negative offset | | ||
+ | | grapheme_stripos | Negative offset | | ||
+ | | grapheme_extract | Negative offset | | ||
+ | | iconv_strpos | Negative offset | | ||
+ | | file_get_contents| Negative offset (based on seek(SEEK_END), | ||
+ | | mb_strimwidth | Negative | ||
+ | | mb_ereg_search_setpos| Negative ' | ||
+ | | mb_strpos| Negative offset | | ||
+ | | mb_stripos| Negative offset | | ||
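A few of the functions listed in the table can be exercised as follows. This is a hedged sketch that assumes PHP 7.1 or later and restricts itself to core string functions so that no extension is required; the sample strings are illustrative.

```php
<?php
// Assumes PHP 7.1 or later; sample strings are illustrative.
$s = 'abcabc';

// The search starts 3 characters from the end, so the first "b" is skipped.
// The returned position is still counted from the start of the full string.
$pos  = strpos($s, 'b', -3);                // 4
$ipos = stripos('ABCabc', 'B', -3);         // 4

// Negative offset: count only within the last 3 characters ("abc").
$countTail = substr_count($s, 'c', -3);     // 1

// Negative length: ignore the last character, i.e. count within "abcab".
$countHead = substr_count($s, 'c', 0, -1);  // 1

// The older workaround with substr() returns a position relative to the
// extracted substring, not to the original string.
$relative = strpos(substr($s, -3), 'b');    // 1

var_dump($pos, $ipos, $countTail, $countHead, $relative);
```

The last line illustrates why the substr() workaround mentioned in the introduction is less convenient: the caller has to translate the result back to a position in the original string.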
- | * mb_ereg_search_setpos(): | + | ==== Notes ==== |
- | * mb_strpos(): negative | + | * Nothing is done for iconv_strrpos() because the function does not accept an 'offset' argument (this is inconsistent with every other xxx_strrpos() function, but the argument cannot be added without breaking BC). |
- | * mb_stripos(): negative | + | * file_get_contents() : ' |
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
This RFC extends the range of valid values. | This RFC extends the range of valid values. | ||
- | In most cases, negative values raise a warning message and offset | + | In most cases, negative values raise a warning message and offset |
The new behavior considers such values as valid (as long as they don't exceed the string length). | The new behavior considers such values as valid (as long as they don't exceed the string length). | ||
- | I consider these BC breaks as minor because, everywhere the behavior is modified, negative values were considered as invalid. | + | While not negligible, |
- | So, we are replacing | + | So, we are just suppressing |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
Line 88: | Line 105: | ||
===== Open Issues ===== | ===== Open Issues ===== | ||
- | None | + | To do (pending RFC approval): |
+ | |||
+ | * Update documentation | ||
+ | * Update language specifications | ||
===== Unaffected PHP Functionality ===== | ===== Unaffected PHP Functionality ===== | ||
- | mbstring functions remain compatible with ASCII functions, relative to the mbstring.func_overload ini setting. | + | mbstring functions remain compatible with their ASCII counterparts, relative to the mbstring.func_overload ini setting. |
===== Future Scope ===== | ===== Future Scope ===== | ||
- | None | + | ==== Recommend using ' |
+ | |||
+ | It was suggested during the discussion that, since array access and string | ||
+ | offsets are very different operations, the official documentation should | ||
+ | recommend using the ' | ||
+ | ' | ||
+ | |||
+ | Conversely, it was also suggested that array access and string offsets |
+ | are such closely related concepts that we should recommend using ' |
+ | cases and disable the alternate ' | ||
+ | |||
+ | As this question is controversial and only tangential to this RFC, |
+ | it will be left for a future RFC. | ||
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
As this RFC adds support for negative string offsets in the language, it requires a 2/3 majority. | As this RFC adds support for negative string offsets in the language, it requires a 2/3 majority. | ||
+ | |||
+ | <doodle title=" | ||
+ | * Yes | ||
+ | * No | ||
+ | </ | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== |
rfc/negative-string-offsets.1453561941.txt.gz · Last modified: 2017/09/22 13:28 (external edit)