rfc:add_str_starts_with_and_ends_with_functions

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:add_str_starts_with_and_ends_with_functions [2020/04/04 17:19] wkhudgins92rfc:add_str_starts_with_and_ends_with_functions [2020/05/05 14:12] (current) nikic
Line 1: Line 1:
 ====== PHP RFC: Add str_starts_with() and str_ends_with() functions ====== ====== PHP RFC: Add str_starts_with() and str_ends_with() functions ======
-  * Version: 0.4 +  * Version: 0.5 
-  * Date: 2020-03-25 (**Updated**: 2020-04-04)+  * Date: 2020-03-25 (**Updated**: 2020-05-05)
   * Author: Will Hudgins, will@wkhudgins.info   * Author: Will Hudgins, will@wkhudgins.info
-  * Status: Under Discussion+  * Status: Implemented
   * First Published at: https://wiki.php.net/rfc/add_str_starts_with_and_ends_with_functions   * First Published at: https://wiki.php.net/rfc/add_str_starts_with_and_ends_with_functions
  
 ===== Introduction ===== ===== Introduction =====
-''str_starts_with'' checks if a string, //s//, begins with another string, //s′// and returns a boolean value (''true''/''false''if //s// starts with //s′//.  +''str_starts_with'' checks if a string begins with another string and returns a boolean value (''true''/''false''whether it does.\\ 
-''str_ends_with'' checks if a string, //s//, ends with another string, //s′// and returns a boolean value (''true''/''false''if //s// ends with //s′//+''str_ends_with'' checks if a string ends with another string and returns a boolean value (''true''/''false''whether it does.
  
-Typically this functionality is accomplished by repurposing existing string functions such as  '[[http://php.net/manual/en/function.substr.php|substr]]''[[http://php.net/manual/en/function.strncmp.php|strncmp]]''[[http://php.net/manual/en/function.strpos.php|strpos]]', or '[[http://php.net/manual/en/function.substr_compare.php|substr_compare]]'. These bespoke userland implementations have various downsides, discussed later in this RFC.+Typically this functionality is accomplished by using existing string functions such as <php>substr</php><php>strpos</php>/<php>strrpos</php><php>strncmp</php>, or <php>substr_compare</php> (often combined with <php>strlen</php>). These bespoke userland implementations have various downsides, discussed further below.
  
-The ''str_starts_with'' and ''str_ends_with'' functionality is so important, and so problematic to leave to userland implementation, that many major PHP frameworks support it including [[https://symfony.com/doc/current/components/string.html#usage|Symfony]], [[https://laravel.com/docs/7.x/helpers#method-starts-with|Laravel]], [[https://www.yiiframework.com/doc/api/2.0/yii-helpers-basestringhelper#startsWith()-detail|Yii]], [[https://fuelphp.com/docs/classes/str.html#/method_starts_with|FuelPHP]], and [[https://docs.phalcon.io/3.4/en/api/phalcon_text|Phalcon]]. Please note, the links are for ''str_starts_with'' functionality, but the mentioned frameworks also contain ''str_ends_with'' functionality, often visible on the same web page.+The ''str_starts_with'' and ''str_ends_with'' functionality is so commonly needed that many major PHP frameworks support itincluding [[https://symfony.com/doc/5.0/components/string.html#methods-to-search-and-replace|Symfony]], [[https://laravel.com/docs/7.x/helpers#method-starts-with|Laravel]], [[https://www.yiiframework.com/doc/api/2.0/yii-helpers-basestringhelper#startsWith()-detail|Yii]], [[https://fuelphp.com/docs/classes/str.html#/method_starts_with|FuelPHP]], and [[https://docs.phalcon.io/3.4/en/api/phalcon_text|Phalcon]] ((some of those links are for ''str_starts_with'' functionality, but the mentioned frameworks also contain ''str_ends_with'' functionality, often visible on the same web page)).
  
-Checking the start and end of strings is a very common task which should be easy. Accomplishing this task is not easy now and that is why many frameworks have chosen to include it. This is also why other programming languagesdiverse as JavaScript, Java, Haskell, and Matlabhave also implemented this functionality. Checking the start and end of a string should not be a task which requires pulling in a PHP framework or developing a potentially suboptimal function in userland. +Checking the start and end of strings is a very common task which should be easy. Accomplishing this task is not easy now and that is why many frameworks have chosen to include it. This is also why other high-level programming languages---as diverse as JavaScript, Java, Haskell, and Matlab---have implemented this functionality. Checking the start and end of a string should not be a task which requires pulling in a PHP framework or developing a potentially suboptimal (or worse, buggy) function in userland. 
 + 
 +==== Downsides of Common Userland Approaches ===== 
 +Ad hoc userland implementations of this functionality are __less intuitive__ than dedicated functions (this is especially true for new PHP developers and developers who frequently switch between PHP and other languages---many of which include this functionality natively).\\ 
 +The implementation is also __easy to get wrong__ (especially with the ''==='' comparison).\\ 
 +Additionally, there are __performance issues__ with many userland implementations. 
 + 
 +//Note: some implementations add "//<php>$needle === "" || </php>//" and/or "//<php>strlen($needle) <= strlen($haystack) && </php>//" guards to handle empty needle values and/or avoid warnings.//
  
-===== Downsides of Common Userland Approaches ===== 
-All userland implementations of this functionality suffer from being less intuitive than specific standard functions. This especially for new PHP developers and developers who frequently switch between PHP and other languages, many of which include this functionality. Additionally, there are performance, shown below, with many userland implementations. 
-  
 === str_starts_with === === str_starts_with ===
 <PHP> <PHP>
 substr($haystack, 0, strlen($needle)) === $needle substr($haystack, 0, strlen($needle)) === $needle
 </PHP> </PHP>
-This is memory inefficient because it requires an unnecessary copy of part of $haystack.+This is memory inefficient because it requires an unnecessary copy of part of the haystack.
  
 <PHP> <PHP>
-$haystack = "string"; +strpos($haystack$needle) === 0
-$needle = "string2"; +
-strncmp($subject, "prefix", 6) === 0;+
 </PHP> </PHP>
-This is error prone because it has a hard-coded length for the prefix.+This is potentially CPU inefficient because it will unnecessarily search along the whole haystack if it doesn't find the needle. 
 + 
 +<PHP> 
 +strncmp($haystack, $needle, strlen($needle)) === 0 // generic 
 +strncmp($subject, "prefix", 6) === 0 // ad hoc 
 +</PHP> 
 +This is efficient but requires providing the needle length as separate argument, which is either verbose (repeat "''$needle''") or error prone (hard-coded number).
  
 === str_ends_with === === str_ends_with ===
 +<PHP>
 +substr($haystack, -strlen($needle)) === $needle
 +</PHP>
 +This is memory inefficient (see above).
 +
 <PHP> <PHP>
 strpos(strrev($haystack), strrev($needle)) === 0 strpos(strrev($haystack), strrev($needle)) === 0
 </PHP> </PHP>
-This is CPU inefficient because it requires reversing both $haystack and $needle as well as applying ''strpos''. Also, note that if $haystack does not end in $needle, this is extra inefficient because it will search the string for the position of $needle. Also note that if $needle is empty a warning will be raised.+This is doubly inefficient because it requires reversing both the haystack and the needle as well as applying ''strpos'' (see above).
  
 <PHP> <PHP>
-$needle ===  "" || substr_compare($haystack, $needle, -strlen($needle)) === 0)+strrpos($haystack, $needle=== strlen($haystack- strlen($needle)
 </PHP> </PHP>
-This is more efficient than the prior example but exceptionally verbose and not immediately intuitiveNote that ''$needle === ""'' is required in order to ensure that empty $needles are handled appropriately.+This is verbose and also potentially CPU inefficient. 
 + 
 +<PHP> 
 +substr_compare($haystack, $needle, -strlen($needle)) === 0 // generic 
 +substr_compare($subject, "suffix", -6) === 0 // ad hoc 
 +</PHP> 
 +This is efficient but either verbose or error prone (see ''strncmp'' above).
  
 ===== Proposal ===== ===== Proposal =====
Line 70: Line 89:
 if (str_ends_with("", "")) echo "printed\n"; if (str_ends_with("", "")) echo "printed\n";
 if (str_ends_with("", "a")) echo "not printed\n"; if (str_ends_with("", "a")) echo "not printed\n";
- </PHP>+</PHP>
  
-Please note, the behavior concerning empty strings is in accordance with the behavior of the accepted [[rfc:str_contains|str_contains RFC]]. This behavior is also the same as is common with other languages, including Java and Python.+Note: the behavior concerning empty strings is in accordance with what is described in the accepted [[rfc:str_contains#proposal|str_contains RFC]]. This behavior is also the same as is common with other languages, including Java and Python.
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
-This could break functions existing in userland with the same names. But see [[rfc:str_contains#backward_incompatible_changes|the corresponding section in the str_contains RFC]] for a discussion illustrating how this concern may be mitigated and why this concern does not justify the rejection of this RFC.+This could break functions existing in userland with the same names. But see [[rfc:str_contains#backward_incompatible_changes|the corresponding section in the str_contains RFC]] for a discussion illustrating how this concern may be mitigated and why it does not justify the rejection of this RFC.
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 86: Line 105:
   * **New Constants:** No new constants.   * **New Constants:** No new constants.
   * **php.ini Defaults:** No changed php.ini settings.   * **php.ini Defaults:** No changed php.ini settings.
 +
 +===== Votes =====
 +Voting closes 2020-05-04
 +
 +<doodle 
 +title="Add str_starts_with and str_ends_with as described" auth="wkhudgins92" voteType="single" closed="true">
 +   * yes
 +   * no
 +</doodle>
 +
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
Line 91: Line 120:
  
 ===== Implementation ===== ===== Implementation =====
-After the project is implemented, this section should contain +After the project is implemented, this section should contain
   - the version(s) it was merged to   - the version(s) it was merged to
   - a link to the git commit(s)   - a link to the git commit(s)
Line 102: Line 131:
     * Python: [[https://docs.python.org/3/library/stdtypes.html#str.startswith|str#startswith()]] and [[https://docs.python.org/3/library/stdtypes.html#str.endswith|str#endswith()]]     * Python: [[https://docs.python.org/3/library/stdtypes.html#str.startswith|str#startswith()]] and [[https://docs.python.org/3/library/stdtypes.html#str.endswith|str#endswith()]]
     * Java: [[https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#startsWith(java.lang.String)|String#startsWith()]] and [[https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#endsWith(java.lang.String)|String#endsWith()]] (and Apache Commons Lang [[https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#startsWith-java.lang.CharSequence-java.lang.CharSequence-|StringUtils.startsWith()]] and [[https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#endsWith-java.lang.CharSequence-java.lang.CharSequence-|StringUtils.endsWith()]])     * Java: [[https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#startsWith(java.lang.String)|String#startsWith()]] and [[https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#endsWith(java.lang.String)|String#endsWith()]] (and Apache Commons Lang [[https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#startsWith-java.lang.CharSequence-java.lang.CharSequence-|StringUtils.startsWith()]] and [[https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#endsWith-java.lang.CharSequence-java.lang.CharSequence-|StringUtils.endsWith()]])
-    * Ruby: [[http://ruby-doc.org/core-2.1.1/String.html#method-i-start_with-3F|String#start_with?()]] and [[http://ruby-doc.org/core-2.1.1/String.html#method-i-end_with-3F|String#end_with?()]]+    * Ruby: [[https://ruby-doc.org/core-2.1.1/String.html#method-i-start_with-3F|String#start_with?()]] and [[https://ruby-doc.org/core-2.1.1/String.html#method-i-end_with-3F|String#end_with?()]]
     * Go: [[https://golang.org/pkg/strings/#HasPrefix|strings.HasPrefix()]] and [[https://golang.org/pkg/strings/#HasSuffix|strings.HasSuffix()]]     * Go: [[https://golang.org/pkg/strings/#HasPrefix|strings.HasPrefix()]] and [[https://golang.org/pkg/strings/#HasSuffix|strings.HasSuffix()]]
     * Haskell: [[https://hackage.haskell.org/package/MissingH-1.4.0.1/docs/Data-String-Utils.html#v:startswith|Data.String.Utils.startswith]] and [[https://hackage.haskell.org/package/MissingH-1.4.0.1/docs/Data-String-Utils.html#v:endswith|Data.String.Utils.endswith]] (aliases of [[https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:isPrefixOf|Data.List.isPrefixOf]] and [[https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:isSuffixOf|Data.List.isSuffixOf]])     * Haskell: [[https://hackage.haskell.org/package/MissingH-1.4.0.1/docs/Data-String-Utils.html#v:startswith|Data.String.Utils.startswith]] and [[https://hackage.haskell.org/package/MissingH-1.4.0.1/docs/Data-String-Utils.html#v:endswith|Data.String.Utils.endswith]] (aliases of [[https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:isPrefixOf|Data.List.isPrefixOf]] and [[https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:isSuffixOf|Data.List.isSuffixOf]])
Line 113: Line 142:
  
 ===== Rejected Features ===== ===== Rejected Features =====
-  * Case-insensitive and multibyte variants were included in the previous version of this RFC, which was declined. See also [[rfc:str_contains#case-insensitivity_and_multibyte_strings|the related section in the str_contains RFC]].+  * **Case-insensitive** and **multibyte** variants were included in the previous version of this RFC, which was declined. See also [[rfc:str_contains#case-insensitivity_and_multibyte_strings|the related section in the str_contains RFC]].
rfc/add_str_starts_with_and_ends_with_functions.1586020765.txt.gz · Last modified: 2020/04/04 17:19 by wkhudgins92