====== PHP RFC: Multibyte for ucfirst, lcfirst functions, mb_ucfirst mb_lcfirst ======
* Version: 0.1.2
* Date: 2024-01-16
* Author: Yuya Hamada (https://github.com/youkidearitai), youkidearitai@gmail.com
* Status: Implemented
* First Published at: http://wiki.php.net/rfc/mb_ucfirst
===== Introduction =====
PHP does not have multibyte equivalents of the ucfirst and lcfirst functions. Similar behavior can be approximated in userland as follows:
function mb_ucfirst(string $str, ?string $encoding = null): string
{
    return mb_convert_case(mb_substr($str, 0, 1, $encoding), MB_CASE_TITLE, $encoding) . mb_substr($str, 1, null, $encoding);
}
function mb_lcfirst(string $str, ?string $encoding = null): string
{
    return mb_strtolower(mb_substr($str, 0, 1, $encoding), $encoding) . mb_substr($str, 1, null, $encoding);
}
However, adding built-in functions for this will improve the readability and clarity of PHP code. It will also standardize an operation that can be tricky to get right.
===== Proposal =====
Add the mb_ucfirst() and mb_lcfirst() functions.
function mb_ucfirst(string $string, ?string $encoding = null): string
mb_ucfirst() converts the first character using Unicode title case.
function mb_lcfirst(string $string, ?string $encoding = null): string
mb_lcfirst() converts the first character to lower case.
Based on my research into Unicode, these functions may not behave as expected in some languages. In such cases, please handle it in userland.
For example, in Vietnamese, the first letter is not always capitalized:
* ngày Quốc khánh 2-9 (September 2nd National Day)
* tiếng Nhật (Japanese)
As another example, Georgian text and digraph letters such as ǉ should use title case rather than upper case. Upper-casing produces:
* mb_strtoupper("აბგ") (ani bani gani, U+10D0 U+10D1 U+10D2) -> "ᲐᲑᲒ" (U+1C90 U+1C91 U+1C92)
* mb_strtoupper("ǉ") (U+01C9) -> "Ǉ" (U+01C7)
Title-casing the first character produces the correct result:
* mb_ucfirst("აბგ") -> "აბგ" (U+10D0 U+10D1 U+10D2)
* mb_ucfirst("ǉ") -> "ǈ" (U+01C9 -> U+01C8)
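The difference between upper case and title case for the code points listed above can be checked with a short script. This is a minimal sketch, assuming PHP 8.x with the mbstring extension and UTF-8 input; title_first() and codepoints() are hypothetical helpers introduced only for illustration, with title_first() following the userland approach from the Introduction so the script does not depend on mb_ucfirst() being available. The expected values in the comments are the ones stated in this RFC.

```php
<?php
// Hypothetical helper, not part of the RFC: title-cases only the first
// character, following the userland approach shown in the Introduction.
function title_first(string $string, string $encoding = 'UTF-8'): string
{
    $first = mb_substr($string, 0, 1, $encoding);
    $rest  = mb_substr($string, 1, null, $encoding);
    return mb_convert_case($first, MB_CASE_TITLE, $encoding) . $rest;
}

// Hypothetical helper: renders a UTF-8 string as a list of code points.
function codepoints(string $string): string
{
    $points = [];
    foreach (mb_str_split($string, 1, 'UTF-8') as $char) {
        $points[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }
    return implode(' ', $points);
}

$digraph  = "\u{01C9}";                 // small letter lj
$georgian = "\u{10D0}\u{10D1}\u{10D2}"; // ani bani gani

echo codepoints(mb_strtoupper($digraph, 'UTF-8')), "\n";  // U+01C7
echo codepoints(title_first($digraph)), "\n";             // U+01C8
echo codepoints(mb_strtoupper($georgian, 'UTF-8')), "\n"; // U+1C90 U+1C91 U+1C92
echo codepoints(title_first($georgian)), "\n";            // U+10D0 U+10D1 U+10D2
```

Writing the inputs as "\u{...}" escapes keeps the script independent of the source file's encoding and makes the code points being tested explicit.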
===== Backward Incompatible Changes =====
This could break a function existing in userland with the same name.
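Userland code that already defines a global function with one of these names can avoid a redeclaration error by guarding the definition. The following is a minimal sketch of that pattern, assuming PHP 8.x with mbstring and a project that ships its own global mb_ucfirst(); the fallback body is the userland approximation from the Introduction, not the native implementation.

```php
<?php
// Define the userland fallback only when mbstring does not already
// provide mb_ucfirst(), so the two definitions never collide.
if (!function_exists('mb_ucfirst')) {
    function mb_ucfirst(string $string, ?string $encoding = null): string
    {
        $first = mb_substr($string, 0, 1, $encoding);
        $rest  = mb_substr($string, 1, null, $encoding);
        return mb_convert_case($first, MB_CASE_TITLE, $encoding) . $rest;
    }
}

// Works the same whether the native or the fallback function is used.
echo mb_ucfirst('abc', 'UTF-8'), "\n"; // Abc
```

Note that a userland function with different behavior or a different signature would still need to be renamed or namespaced, since the guard silently defers to the built-in function.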
===== Proposed PHP Version(s) =====
next PHP 8.x
===== RFC Impact =====
==== To SAPIs ====
Adds the aforementioned functions to all PHP environments.
==== To Existing Extensions ====
Adds mb_ucfirst(), mb_lcfirst() to the mbstring extension.
==== To Opcache ====
No effect.
==== New Constants ====
No new constants.
==== php.ini Defaults ====
No changed php.ini settings.
===== Open Issues =====
[[https://github.com/php/php-src/issues/13075|https://github.com/php/php-src/issues/13075]]
===== Future Scope =====
This section details areas where the feature might be improved in future, but that are not currently proposed in this RFC.
===== Voting =====
* Yes
* No
===== Implementation =====
https://github.com/php/php-src/pull/13161
===== Rejected Features =====
Keep this updated with features that were discussed on the mail lists.