rfc:mb_str_pad

Differences

This shows you the differences between two versions of the page.

rfc:mb_str_pad [2023/05/19 20:07] – wording nielsdos
rfc:mb_str_pad [2023/11/13 19:55] (current) – link to docs nielsdos
Line 1: Line 1:
 ====== PHP RFC: mb_str_pad ======
-  * Version: 0.1
+  * Version: 0.1.2
   * Date: 2023-05-19
   * Author: Niels Dossche (nielsdos), dossche.niels@gmail.com
-  * Status: Draft
+  * Status: [[https://github.com/php/php-src/commit/68591632b22289962127cf777b4c3aeaea768bb6|Implemented]]
+  * Target Version: PHP 8.3
+  * Implementation: https://github.com/php/php-src/pull/11284
   * First Published at: http://wiki.php.net/rfc/mb_str_pad
  
Line 10: Line 12:
  
 ===== Proposal =====
-This proposal aims to introduce a new mbstring function mb_str_pad(). Both the input string and the padding string may be multibyte strings. The function follows the same signature as the str_pad() function, except that it has an additional argument for the string encoding. The encoding argument works analogously to the encoding argument of other mbstring functions. The encoding applies on both $input and $pad_string. If the encoding is null, the default mbstring encoding is used. The $pad_type argument can take three possible values: STR_PAD_LEFT, STR_PAD_RIGHT, STR_PAD_BOTH. str_pad() uses the same constants.
+This proposal aims to introduce a new mbstring function mb_str_pad(). Both the input string and the padding string may be multibyte strings. The function follows the same signature as the str_pad() function, except that it has an additional argument for the string encoding. The encoding argument works analogously to the encoding argument of other mbstring functions. The encoding applies to both $input and $pad_string. If the encoding is null, the default mbstring encoding is used. The $pad_type argument can take three possible values: STR_PAD_LEFT, STR_PAD_RIGHT, STR_PAD_BOTH. str_pad() uses the same constants.

 <code php>
 function mb_str_pad(string $string, int $length, string $pad_string = " ", int $pad_type = STR_PAD_RIGHT, ?string $encoding = null): string {}
 </code>
+
+This proposal defines character as code point, which is how the other mbstring functions define characters as well.
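
To illustrate the difference between byte-based and code-point-based padding, here is a minimal sketch. It assumes PHP 8.3 or later with the mbstring extension loaded; the sample strings and variable names are illustrative and do not come from the RFC.

```php
<?php
// Assumes PHP >= 8.3 with ext/mbstring; all strings are UTF-8 encoded.

// str_pad() counts bytes: U+00E9 (é) is 2 bytes in UTF-8,
// so padding to "length 3" adds only a single "*".
$bytePadded = str_pad("\u{E9}", 3, "*");

// mb_str_pad() counts code points: U+00E9 is 1 code point,
// so padding to length 3 adds two "*".
$codePointPadded = mb_str_pad("\u{E9}", 3, "*", STR_PAD_RIGHT, "UTF-8");

// The padding string may itself be multibyte (U+2192, rightwards arrow).
$multibytePad = mb_str_pad("ab", 5, "\u{2192}", STR_PAD_LEFT, "UTF-8");

// "e" followed by U+0301 (combining acute accent) renders as one
// grapheme cluster but counts as two code points, so only one "*" is added.
$decomposedPadded = mb_str_pad("e\u{301}", 3, "*", STR_PAD_RIGHT, "UTF-8");
```

The last call shows the consequence of defining a character as a code point: a decomposed sequence is padded according to its code point count, not its visible width.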
  
 ==== Error conditions ==== ==== Error conditions ====
Line 69: Line 73:
 Since this is a new function and no existing functions change, there is no behavioural backwards incompatibility. The only backwards compatible break occurs when a userland PHP project declares their own mb_str_pad() function without first checking if PHP doesn't already declare it. In that case, a fatal error "Cannot redeclare mb_str_pad()" will be thrown.

-I did a quick search using GitHub's [[https://github.com/search?q=mb_str_pad+lang%3Aphp&type=code|code search]] on "mb_str_pad" in PHP files, and found 326 matches. This also gives us an opportunity to look at how many correct vs incorrect implementations there are.
+I did a quick search using GitHub's [[https://github.com/search?q=mb_str_pad+lang%3Aphp&type=code|code search]] on "mb_str_pad" in PHP files, and found 326 matches (as of 2023-05-19). This also gives us an opportunity to look at how many correct vs incorrect implementations there are.

-Looking at the function / method declarations for "mb_str_pad":
-  * 47 in class
-  * 12 free function, checked
+Looking at the function / method //declarations// for "mb_str_pad":
+  * 47 in classes
+  * 12 free functions, checked if PHP doesn't already declare it
   * 42 free functions, not checked (correctly)

-This means that for 42 implementations, the introduction of mb_str_pad() will cause a fatal error as described above. Fortunately, it is simply a matter of removing their implementation, or guarding it with a check to resolve the error.
+This means that for 42 implementations, the introduction of mb_str_pad() will cause a fatal error as described above. Fortunately, the users can simply remove their implementation, or guard it with a check to resolve the error.
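
The guard mentioned above can be sketched as follows. This is a hypothetical userland fallback, not the php-src implementation: it assumes PHP 8.0 or later (for ValueError) with mbstring loaded, the error messages are invented for illustration, and the left/right split for STR_PAD_BOTH mirrors what str_pad() does.

```php
<?php
// Declare the fallback only when PHP itself does not provide mb_str_pad()
// (i.e. before PHP 8.3). This avoids "Cannot redeclare mb_str_pad()".
if (!function_exists('mb_str_pad')) {
    function mb_str_pad(
        string $string,
        int $length,
        string $pad_string = " ",
        int $pad_type = STR_PAD_RIGHT,
        ?string $encoding = null
    ): string {
        $encoding ??= mb_internal_encoding();

        // Illustrative error handling; the messages are assumptions.
        $padLength = mb_strlen($pad_string, $encoding);
        if ($padLength === 0) {
            throw new ValueError('$pad_string must be a non-empty string');
        }
        if (!in_array($pad_type, [STR_PAD_LEFT, STR_PAD_RIGHT, STR_PAD_BOTH], true)) {
            throw new ValueError('$pad_type must be STR_PAD_LEFT, STR_PAD_RIGHT or STR_PAD_BOTH');
        }

        // Number of code points still missing to reach $length.
        $missing = $length - mb_strlen($string, $encoding);
        if ($missing <= 0) {
            return $string;
        }

        // Split the missing code points between the left and right side.
        $left = 0;
        $right = 0;
        if ($pad_type === STR_PAD_LEFT) {
            $left = $missing;
        } elseif ($pad_type === STR_PAD_BOTH) {
            $left = intdiv($missing, 2);
            $right = $missing - $left;
        } else {
            $right = $missing;
        }

        // Repeat the padding string often enough, then cut it on a
        // code point boundary instead of a byte boundary.
        $makePad = static function (int $count) use ($pad_string, $padLength, $encoding): string {
            $repeats = intdiv($count + $padLength - 1, $padLength);
            return mb_substr(str_repeat($pad_string, $repeats), 0, $count, $encoding);
        };

        return $makePad($left) . $string . $makePad($right);
    }
}

// Works with either the native function or the fallback.
$padded = mb_str_pad("ab", 5, "\u{2192}", STR_PAD_BOTH, "UTF-8");
```

On PHP 8.3 and later the `function_exists()` check is true, the fallback is skipped, and the native function is used.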
  
 Let's also take a look at correctness:
-  * 36 likely correct implementations.
-  * 65 implementations which break if the padding string is a multibyte string. Almost all these implementations are very very similar to each other.
+  * 36 likely correct implementations. I did not test or read them thoroughly, I just ran some inputs through them automatically
+  * 65 implementations which break if the padding string is a multibyte string. Almost all these implementations are very similar to each other.

 As we can see it appears to be a function that's a little tricky to implement correctly.
-Note that these results don't include numbers for inline implementations or for implementations under a different name, but that doesn't matter for a backwards compatibility check.
+Note that these results don't include numbers for inline implementations or for implementations under a different name. Hence the reported numbers are quite low. It is very likely more implementations exist under different names, but that doesn't matter for a backwards compatibility check.
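
As an illustration of how such a breakage can arise, here is a hypothetical naive implementation. It is written for this example and is not taken from any of the surveyed projects. It assumes UTF-8 input and the mbstring extension. It corrects the target length for multibyte characters in the input string, but still lets str_pad() cut the padding string on a byte boundary.

```php
<?php
// Hypothetical naive approach (the name naive_mb_str_pad is made up):
// convert the target length from code points to bytes for $string only,
// then delegate to the byte-based str_pad().
function naive_mb_str_pad(
    string $string,
    int $length,
    string $pad_string = " ",
    int $pad_type = STR_PAD_RIGHT
): string {
    $byteLength = $length + strlen($string) - mb_strlen($string, "UTF-8");
    return str_pad($string, $byteLength, $pad_string, $pad_type);
}

// Multibyte input with a single-byte padding string works as intended.
$fine = naive_mb_str_pad("\u{E9}", 3, "*");

// A multibyte padding string breaks: str_pad() appends 3 bytes of the
// 2-byte sequence for U+00E9, which truncates the second copy mid-character.
$broken = naive_mb_str_pad("a", 4, "\u{E9}");

// The result is no longer valid UTF-8.
$isValid = mb_check_encoding($broken, "UTF-8");
```

Here `$isValid` is false, because the padded string ends in the lone lead byte `0xC3`.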
  
 ===== Proposed PHP Version(s) =====
Line 111: Line 115:
  
 ===== Future Scope =====
-None.
+In the future we could add a string padding function that works on grapheme clusters instead of code points: grapheme_str_pad(). This should be added to ext/intl. This will of course require another RFC.

 ===== Proposed Voting Choices =====
-One primary yes/no vote to decide if the function may be introduced.
+One primary yes/no vote to decide if the function may be introduced, requires 2/3 majority.
+
+Voting starts on 2023-06-05 20:00 GMT+2, and ends on 2023-06-19 20:00 GMT+2.
+
+<doodle title="mb_str_pad" auth="nielsdos" voteType="single" closed="true" closeon="2023-06-19T20:00:00+02:00">
+   * Yes
+   * No
+</doodle>
  
 ===== Patches and Tests =====
Line 121: Line 132:
 ===== Implementation =====
 After the project is implemented, this section should contain
-  - the version(s) it was merged into
-  - a link to the git commit(s)
-  - a link to the PHP manual entry for the feature
-  - a link to the language specification section (if any)
+  - the version(s) it was merged into: PHP 8.3
+  - a link to the git commit(s): https://github.com/php/php-src/commit/68591632b22289962127cf777b4c3aeaea768bb6
+  - a link to the PHP manual entry for the feature: https://www.php.net/manual/en/function.mb-str-pad
+  - a link to the language specification section (if any): N/A
  
 ===== References =====
Line 131: Line 142:
 ===== Rejected Features =====
 Keep this updated with features that were discussed on the mail lists.
+
+===== Changelog =====
+
+  * 0.1.2: Clarify that we use the mbstring definition of character (i.e. code point) instead of grapheme cluster.
+  * 0.1.1: Initial version placed under discussion
rfc/mb_str_pad.1684526866.txt.gz · Last modified: 2023/05/19 20:07 by nielsdos