rfc:mb_trim

Differences

rfc:mb_trim [2023/10/18 06:16] – modify title youkidearitairfc:mb_trim [2024/04/15 08:40] (current) – old revision restored (2023/11/24 06:26) youkidearitai
Line 2: Line 2:
   * Version: 0.1   * Version: 0.1
   * Date: 2023-10-18   * Date: 2023-10-18
-  * Author: Yuya Hamada (youkidearitai), youkidearitai@gmail.com +  * Author: Yuya Hamada (https://github.com/youkidearitai), youkidearitai@gmail.com based on 8ctopus(https://github.com/8ctopus), hello@octopuslabs.io 
-  * Status: Draft+  * Status: Implemented
   * First Published at: http://wiki.php.net/rfc/mb_trim   * First Published at: http://wiki.php.net/rfc/mb_trim
  
-This is a suggested template for PHP Request for Comments (RFCs). Change this template to suit your RFC. Not all RFCs need to be tightly specified. Not all RFCs need all the sections below. +===== Introduction ===== 
-Read https://wiki.php.net/rfc/howto carefully!+PHP does not have a multibyte equivalent of the trim function. Similar behavior can be approximated with preg_replace("/^\s+|\s+$/u", '', $string), but a built-in function would improve the readability and clarity of PHP code. It would also standardize how trimming is done, which can be tricky to get right. This feature would be useful to PHP developers at all levels of experience and would complete the mbstring extension.
  
 +One use case is trimming the Byte Order Mark (BOM). I think mb_ltrim would work for this:
  
-Quoting [[http://news.php.net/php.internals/71525|Rasmus]]:+<code> 
 +mb_ltrim($string, "\u{FEFF}\u{FFFE}"); 
 +</code>
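
As a rough illustration of the regex approach mentioned in the Introduction, the sketch below trims Unicode whitespace and a leading byte order mark in userland. The helper names regex_trim and strip_bom are hypothetical and not part of the RFC; the code assumes UTF-8 input and the PCRE extension.

```php
<?php
// Hypothetical helper (not part of the RFC): approximates trimming with the
// regex quoted in the Introduction. Assumes UTF-8 input.
function regex_trim(string $string): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_replace() returns null on error, so fall back to the input.
    return preg_replace('/^\s+|\s+$/u', '', $string) ?? $string;
}

// Hypothetical helper: strips one or more leading U+FEFF code points.
function strip_bom(string $string): string
{
    return preg_replace('/^\x{FEFF}+/u', '', $string) ?? $string;
}

var_dump(regex_trim("\u{3000}日本語\u{3000}") === "日本語");
var_dump(strip_bom("\u{FEFF}hello") === "hello");
```

Both helpers depend on getting the pattern and the /u modifier right, which is the kind of detail a built-in function would hide.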
  
-> PHP is and should remain: +===== Proposal ===== 
-> 1) a pragmatic web-focused language +Add the mb_trim(), mb_ltrim() and mb_rtrim() functions:
-> 2) a loosely typed language +
-> 3) a language which caters to the skill-levels and platforms of a wide range of users+
  
-Your RFC should move PHP forward following his vision. As [[http://news.php.net/php.internals/66065|said by Zeev Suraski]] "Consider only features which have significant traction to a +<code> 
-large chunk of our userbase, and not something that could be useful in some +function mb_trim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {} 
-extremely specialized edge cases [...] Make sure you think about the full context, the huge audience out there, the consequences of making the learning curve steeper with +</code> 
-every new feature, and the scope of the goodness that those new features bring."+<code> 
 +function mb_ltrim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {} 
 +</code> 
 +<code> 
 +function mb_rtrim(string $string, string $characters = " \f\n\r\t\v\x00\u{00A0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200A}\u{2028}\u{2029}\u{202F}\u{205F}\u{3000}\u{0085}\u{180E}", ?string $encoding = null): string {} 
 +</code>
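
A short usage sketch of the three functions follows. It assumes a PHP build in which the mbstring extension already provides them (PHP 8.4 or later), so the calls are guarded with function_exists(); the sample strings are illustrative only.

```php
<?php
// Assumes a PHP build where mbstring provides these functions (PHP 8.4+).
if (function_exists('mb_trim')) {
    // Ideographic space and no-break space on the left, space and newline on the right.
    $padded = "\u{3000}\u{00A0}日本語 \n";

    var_dump(mb_trim($padded));   // trims both ends
    var_dump(mb_ltrim($padded));  // trims the left side only
    var_dump(mb_rtrim($padded));  // trims the right side only

    // Custom $characters: only the listed code points are removed.
    var_dump(mb_trim("【見出し】", "【】"));
} else {
    echo "mb_trim() is not available in this PHP build\n";
}
```

Passing $characters replaces the default list, so whitespace is no longer trimmed unless it is listed explicitly.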
  
-===== Introduction ===== +Here's the list of characters trimmed:
-The elevator pitch for the RFC. The first paragraph of this section will be slightly larger to give it emphasis; please write a good introduction.+
  
-===== Proposal ===== +Same as trim: 
-All the features and examples of the proposal.+<code> 
 +U+0020 SPACE (also in Separator category) 
 +U+0009 \t 
 +U+000A \n 
 +U+000B \v 
 +U+000D \r 
 +</code>
  
-To [[http://news.php.net/php.internals/66051|paraphrase Zeev Suraski]], explain how the proposal brings substantial value to be considered +Not removed by trim(), probably because it wasn't common enough, but fine to include in mb_trim: 
-for inclusion in one of the world's most popular programming languages.+<code> 
 +U+000C \f 
 +</code> 
 + 
 +Removed by trim(), but not matched by regex \s: 
 +<code> 
 +U+0000 \0 
 +</code> 
 + 
 +Whole Separator (Z) category (19 codepoints), covered by regex \s: 
 +<code> 
 +U+0020 SPACE 
 +U+00A0 NO-BREAK SPACE 
 +U+1680 OGHAM SPACE MARK 
 +U+2000 EN QUAD 
 +U+2001 EM QUAD 
 +U+2002 EN SPACE 
 +U+2003 EM SPACE 
 +U+2004 THREE-PER-EM SPACE 
 +U+2005 FOUR-PER-EM SPACE 
 +U+2006 SIX-PER-EM SPACE 
 +U+2007 FIGURE SPACE 
 +U+2008 PUNCTUATION SPACE 
 +U+2009 THIN SPACE 
 +U+200A HAIR SPACE 
 +U+2028 LINE SEPARATOR 
 +U+2029 PARAGRAPH SEPARATOR 
 +U+202F NARROW NO-BREAK SPACE 
 +U+205F MEDIUM MATHEMATICAL SPACE 
 +U+3000 IDEOGRAPHIC SPACE 
 +</code> 
 + 
 +Other symbols (included in regex \s): 
 +<code> 
 +U+0085 NEXT LINE (NEL) 
 +U+180E MONGOLIAN VOWEL SEPARATOR 
 +</code> 
 + 
 +On the other hand, the ".." range notation that trim() accepts in $characters (e.g. \u{0000}..\u{FFFF}) is not supported, 
 +for the following reasons: 
 + 
 +  * The Unicode character range is very wide 
 +    * Difficult to search 
 +    * Difficult to store in memory 
 +    * Mapping with other character encodings may be incompatible 
 +      * For example, to express Hiragana, UTF-8 uses [あ-ゞ], EUC-JP [あ-ゝゞ], and Shift_JIS [あ-ん].
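
To make the range limitation concrete, the sketch below first shows the byte-range notation that trim() accepts, then one possible userland workaround for code point ranges. The helper trim_codepoint_range is hypothetical and not part of the RFC; it assumes UTF-8 input and builds a PCRE character class from two code points.

```php
<?php
// trim() treats "a..c" as the byte range a, b, c.
var_dump(trim("abcHELLOcba", "a..c")); // string(5) "HELLO"

// Hypothetical helper (not part of the RFC): trims a range of code points
// from both ends using a PCRE character class. Assumes UTF-8 input.
function trim_codepoint_range(string $string, int $from, int $to): string
{
    // Builds e.g. [\x{3042}-\x{309E}] from the two code points.
    $class = sprintf('[\x{%X}-\x{%X}]', $from, $to);
    $pattern = '/^' . $class . '+|' . $class . '+$/u';

    // preg_replace() returns null on error (e.g. an invalid range).
    return preg_replace($pattern, '', $string) ?? $string;
}

// U+3042 (あ) through U+309E (ゞ), the UTF-8 Hiragana range quoted above.
var_dump(trim_codepoint_range("あいうABCえお", 0x3042, 0x309E) === "ABC");
```

The helper works on code points in a single encoding, which sidesteps the cross-encoding mapping problem listed above rather than solving it.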
  
-Remember that the RFC contents should be easily reusable in the PHP Documentation. 
  
-If applicable, you may wish to use the language specification as a reference. 
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
-What breaks, and what is the justification for it?+This could break userland code that already defines a function with the same name.
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 43: Line 97:
 ===== RFC Impact ===== ===== RFC Impact =====
 ==== To SAPIs ==== ==== To SAPIs ====
-Describe the impact to CLI, Development web server, embedded PHP etc.+To SAPIs 
 +Will add the aforementioned functions to all PHP environments.
  
 ==== To Existing Extensions ==== ==== To Existing Extensions ====
-mbstring+Adds mb_trim(), mb_ltrim() and mb_rtrim() to the mbstring extension.
  
 ==== To Opcache ==== ==== To Opcache ====
-It is necessary to develop RFC's with opcache in mind, since opcache is a core extension distributed with PHP. +No effect.
- +
-Please explain how you have verified your RFC's compatibility with opcache.+
  
 ==== New Constants ==== ==== New Constants ====
-Describe any new constants so they can be accurately and comprehensively explained in the PHP documentation.+No new constants.
  
 ==== php.ini Defaults ==== ==== php.ini Defaults ====
-If there are any php.ini settings then list: +No changed php.ini settings.
-  * hardcoded default values +
-  * php.ini-development values +
-  * php.ini-production values+
  
 ===== Open Issues ===== ===== Open Issues =====
-Make sure there are no open issues when the vote starts! 
 https://github.com/php/php-src/issues/9216 https://github.com/php/php-src/issues/9216
- 
-===== Unaffected PHP Functionality ===== 
-List existing areas/features of PHP that will not be changed by the RFC. 
- 
-This helps avoid any ambiguity, shows that you have thought deeply about the RFC's impact, and helps reduce mail list noise.
  
 ===== Future Scope ===== ===== Future Scope =====
Line 77: Line 121:
 Include these so readers know where you are heading and can discuss the proposed voting options. Include these so readers know where you are heading and can discuss the proposed voting options.
  
-===== Patches and Tests ===== +===== Voting =====
-Links to any external patches and tests go here.+
  
-If there is no patch, make it clear who will create a patch, or whether a volunteer to help with implementation is needed. +<doodle title="Multibyte for trim function mb_trim, mb_ltrim and mb_rtrim" auth="Yuya Hamada" voteType="single"  closed="true" closeon="2023-11-17T00:00:00Z"> 
- +   * Yes 
-Make it clear if the patch is intended to be the final patch, or is just a prototype. +   * No 
- +</doodle>
-For changes affecting the core language, you should also provide a patch for the language specification.+
  
 ===== Implementation ===== ===== Implementation =====
 https://github.com/php/php-src/pull/12459 https://github.com/php/php-src/pull/12459
- 
-===== References ===== 
-Links to external references, discussions or RFCs 
  
 ===== Rejected Features ===== ===== Rejected Features =====
 Keep this updated with features that were discussed on the mail lists. Keep this updated with features that were discussed on the mail lists.
rfc/mb_trim.1697609805.txt.gz · Last modified: 2023/10/18 06:16 by youkidearitai