PHP RFC: Add str_starts_with() and str_ends_with() functions
- Version: 0.4
- Date: 2020-03-25 (Updated: 2020-04-04)
- Author: Will Hudgins, will@wkhudgins.info
- Status: Under Discussion
- First Published at: https://wiki.php.net/rfc/add_str_starts_with_and_ends_with_functions
Introduction
str_starts_with checks if a string, s, begins with another string, s′, and returns a boolean value (true/false) indicating whether s starts with s′.
str_ends_with checks if a string, s, ends with another string, s′, and returns a boolean value (true/false) indicating whether s ends with s′.
Typically this functionality is accomplished by repurposing existing string functions such as 'substr', 'strncmp', 'strpos', or 'substr_compare'. These bespoke userland implementations have various downsides, discussed later in this RFC.
The str_starts_with and str_ends_with functionality is so important, and so problematic to leave to userland implementation, that many major PHP frameworks support it, including Symfony, Laravel, Yii, FuelPHP, and Phalcon. Please note, the links are for str_starts_with functionality, but the mentioned frameworks also contain str_ends_with functionality, often visible on the same web page.
Checking the start and end of strings is a very common task which should be easy. Accomplishing this task is not easy now, and that is why many frameworks have chosen to include it. This is also why other programming languages, as diverse as JavaScript, Java, Haskell, and MATLAB, have implemented this functionality. Checking the start and end of a string should not be a task which requires pulling in a PHP framework or developing a potentially suboptimal function in userland.
Downsides of Common Userland Approaches
All userland implementations of this functionality suffer from being less intuitive than the proposed functions. This is especially true for new PHP developers and developers who frequently switch between PHP and other languages, many of which include this functionality. Additionally, there are performance issues, shown below, with many userland implementations.
str_starts_with
A substr-based check is memory inefficient because it requires an unnecessary copy of part of $haystack.
$haystack = "string";
strncmp($haystack, "prefix", 6) === 0; // 6 is the hard-coded length of "prefix"
This is error prone because it has a hard-coded length for the prefix.
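The prefix checks discussed above could be wrapped in userland roughly as follows. This is only a sketch: the helper names `starts_with_substr` and `starts_with_strncmp` are hypothetical and exist neither in PHP nor in this RFC, and the substr variant is an assumed reconstruction of the kind of code the memory-inefficiency remark refers to.

```php
<?php
// Hypothetical helper names, used for illustration only.

// Copies the first strlen($needle) bytes of $haystack, then compares the copy.
function starts_with_substr(string $haystack, string $needle): bool
{
    return substr($haystack, 0, strlen($needle)) === $needle;
}

// Compares in place, but the needle length has to be passed as a third argument.
// Deriving it with strlen() avoids the hard-coded number at the cost of verbosity.
function starts_with_strncmp(string $haystack, string $needle): bool
{
    return strncmp($haystack, $needle, strlen($needle)) === 0;
}

var_dump(starts_with_substr("beginningMiddleEnd", "beg"));  // bool(true)
var_dump(starts_with_strncmp("beginningMiddleEnd", "Beg")); // bool(false)
```

Both helpers compare bytes and are case-sensitive, so they should agree with each other on any input; they differ only in the cost and verbosity trade-offs described above.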
str_ends_with
An approach that reverses both $haystack and $needle and then applies strpos is CPU inefficient because of the two reversals. Also note that if $haystack does not end in $needle, this is extra inefficient because strpos will search the whole string for the position of $needle. Finally, if $needle is empty, a warning will be raised.
$needle === "" || substr_compare($haystack, $needle, -strlen($needle)) === 0
This is more efficient than the prior example but exceptionally verbose and not immediately intuitive. Note that $needle === "" is required in order to ensure that empty $needles are handled appropriately.
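For comparison, the two suffix checks discussed above might be written as userland helpers like this. It is a sketch under assumptions: the names `ends_with_reversed` and `ends_with_substr_compare` are hypothetical, the use of `strrev()` for the reversal is assumed, and the empty-needle guard in the first helper is an addition made here to avoid the warning mentioned above.

```php
<?php
// Hypothetical helper names, used for illustration only.

// Reverses both strings (two extra copies), then searches with strpos().
// When the suffix does not match, strpos() still scans the reversed haystack.
function ends_with_reversed(string $haystack, string $needle): bool
{
    if ($needle === "") {
        return true; // added guard: strpos() warns on an empty needle before PHP 8
    }
    return strpos(strrev($haystack), strrev($needle)) === 0;
}

// Compares only the tail of $haystack, without copying or reversing.
function ends_with_substr_compare(string $haystack, string $needle): bool
{
    return $needle === "" || substr_compare($haystack, $needle, -strlen($needle)) === 0;
}

var_dump(ends_with_reversed("beginningMiddleEnd", "End"));       // bool(true)
var_dump(ends_with_substr_compare("beginningMiddleEnd", "end")); // bool(false)
```

The second helper does strictly less work, which is the efficiency difference the text describes; neither is as readable at the call site as a dedicated function.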
Proposal
Add two new basic functions: str_starts_with and str_ends_with:
str_starts_with ( string $haystack , string $needle ) : bool
str_ends_with ( string $haystack , string $needle ) : bool
str_starts_with() checks if $haystack begins with $needle. If $needle is longer than $haystack, it returns false; else, it compares each character in $needle with the corresponding character in $haystack (aligning both strings at their start), returning false if it encounters a mismatch, and true otherwise.
str_ends_with() does the same thing but aligning both strings at their end.
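The comparison described above can be mirrored step by step in userland PHP. The sketch below is not the proposed C implementation; the function names `sketch_starts_with` and `sketch_ends_with` are hypothetical, and "character" is taken to mean a single byte, which is an assumption consistent with the RFC leaving out multibyte variants.

```php
<?php
// Hypothetical names; a byte-wise mirror of the described semantics.

function sketch_starts_with(string $haystack, string $needle): bool
{
    $needleLength = strlen($needle);
    if ($needleLength > strlen($haystack)) {
        return false; // a longer needle can never be a prefix
    }
    // Align both strings at their start and compare byte by byte.
    for ($i = 0; $i < $needleLength; $i++) {
        if ($haystack[$i] !== $needle[$i]) {
            return false;
        }
    }
    return true; // no mismatch found (always the case for an empty needle)
}

function sketch_ends_with(string $haystack, string $needle): bool
{
    $needleLength = strlen($needle);
    $offset = strlen($haystack) - $needleLength;
    if ($offset < 0) {
        return false; // a longer needle can never be a suffix
    }
    // Align both strings at their end and compare byte by byte.
    for ($i = 0; $i < $needleLength; $i++) {
        if ($haystack[$offset + $i] !== $needle[$i]) {
            return false;
        }
    }
    return true;
}

var_dump(sketch_starts_with("beginningMiddleEnd", "beg")); // bool(true)
var_dump(sketch_ends_with("beginningMiddleEnd", "end"));   // bool(false)
```

Because the loop body never runs for an empty needle, both functions return true in that case without any special handling.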
Examples below:
$str = "beginningMiddleEnd";
if (str_starts_with($str, "beg")) echo "printed\n";
if (str_starts_with($str, "Beg")) echo "not printed\n";
if (str_ends_with($str, "End")) echo "printed\n";
if (str_ends_with($str, "end")) echo "not printed\n";

// empty strings:
if (str_starts_with("a", "")) echo "printed\n";
if (str_starts_with("", "")) echo "printed\n";
if (str_starts_with("", "a")) echo "not printed\n";
if (str_ends_with("a", "")) echo "printed\n";
if (str_ends_with("", "")) echo "printed\n";
if (str_ends_with("", "a")) echo "not printed\n";
Please note, the behavior concerning empty strings is in accordance with the behavior of the accepted str_contains RFC. This behavior is also the same as is common with other languages, including Java and Python.
Backward Incompatible Changes
This could break functions existing in userland with the same names. But see the corresponding section in the str_contains RFC for a discussion illustrating how this concern may be mitigated and why this concern does not justify the rejection of this RFC.
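One common way userland code avoids such a name clash is to define its own version only when no function of that name exists yet. The sketch below illustrates that pattern; it is an assumption about how a project might guard its helpers, not a summary of the discussion in the str_contains RFC.

```php
<?php
// Define userland fallbacks only if the names are not already taken,
// for example by a newer PHP version that ships these functions natively.
if (!function_exists('str_starts_with')) {
    function str_starts_with(string $haystack, string $needle): bool
    {
        return strncmp($haystack, $needle, strlen($needle)) === 0;
    }
}

if (!function_exists('str_ends_with')) {
    function str_ends_with(string $haystack, string $needle): bool
    {
        return $needle === "" || substr_compare($haystack, $needle, -strlen($needle)) === 0;
    }
}

var_dump(str_starts_with("beginningMiddleEnd", "beg")); // bool(true)
var_dump(str_ends_with("beginningMiddleEnd", "end"));   // bool(false)
```

Code guarded this way keeps working whether or not the runtime provides the functions, as long as the fallback and the native behavior agree on the inputs the project uses.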
Proposed PHP Version(s)
PHP 8
RFC Impact
- To SAPIs: Will add the aforementioned functions to all PHP environments.
- To Existing Extensions: None.
- To Opcache: No effect.
- New Constants: No new constants.
- php.ini Defaults: No changed php.ini settings.
Patches and Tests
Implementation
After the project is implemented, this section should contain
- the version(s) it was merged to
- a link to the git commit(s)
- a link to the PHP manual entry for the feature
References
- Implementation of similar methods/functions in other languages:
  - JavaScript: String#startsWith() and String#endsWith()
  - Python: str#startswith() and str#endswith()
  - Java: String#startsWith() and String#endsWith() (and Apache Commons Lang StringUtils.startsWith() and StringUtils.endsWith())
  - Ruby: String#start_with?() and String#end_with?()
  - Haskell: Data.String.Utils.startswith and Data.String.Utils.endswith (aliases of Data.List.isPrefixOf and Data.List.isSuffixOf)
  - MATLAB: startsWith() and endsWith()
- Accepted RFC for related function: PHP RFC: str_contains
- Rejected Prior RFC: PHP RFC: rfc:add_str_begin_and_end_functions
- Discussion on the php.internals mailing list: https://externals.io/message/109318
Rejected Features
- Case-insensitive and multibyte variants were included in the previous version of this RFC, which was declined. See also the related section in the str_contains RFC.