PHP RFC: Multibyte for ucfirst, lcfirst functions, mb_ucfirst mb_lcfirst
- Version: 0.1.2
- Date: 2024-01-16
- Author: Yuya Hamada (https://github.com/youkidearitai), youkidearitai@gmail.com
- Status: Implemented
- First Published at: http://wiki.php.net/rfc/mb_ucfirst
Introduction
PHP does not have multibyte equivalents of the ucfirst and lcfirst functions. Close enough behavior can be achieved in userland as follows:
function mb_ucfirst(string $str, ?string $encoding = null): string
{
    // Title-case the first character and append the rest unchanged.
    return mb_convert_case(mb_substr($str, 0, 1, $encoding), MB_CASE_TITLE, $encoding)
        . mb_substr($str, 1, null, $encoding);
}
function mb_lcfirst(string $str, ?string $encoding = null): string
{
    // Lower-case the first character and append the rest unchanged.
    return mb_strtolower(mb_substr($str, 0, 1, $encoding), $encoding)
        . mb_substr($str, 1, null, $encoding);
}
However, adding built-in functions for this will improve the readability and clarity of PHP code. It will also standardize how this is done, which matters because getting it right can be tricky.
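To illustrate why the byte-oriented ucfirst is not enough, the sketch below compares it with the userland approach shown above. The helper name userland_mb_ucfirst is hypothetical, chosen so that it does not collide with a built-in mb_ucfirst on PHP versions that provide one. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function userland_mb_ucfirst(string $str, ?string $encoding = null): string
{
    $first = mb_substr($str, 0, 1, $encoding);   // first character, not first byte
    $rest  = mb_substr($str, 1, null, $encoding); // everything after it
    return mb_convert_case($first, MB_CASE_TITLE, $encoding) . $rest;
}

$word = "éclair";

// ucfirst() looks at a single byte; whether it alters the lead byte of "é"
// depends on the PHP version and locale, so no result is claimed here.
var_dump(ucfirst($word) === $word);

// The multibyte-aware helper title-cases the whole first character.
echo userland_mb_ucfirst($word, 'UTF-8'), "\n"; // Éclair
```

Passing the encoding explicitly keeps the result independent of the mbstring internal encoding setting.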
Proposal
Add two functions, mb_ucfirst and mb_lcfirst.
function mb_ucfirst(string $string, ?string $encoding = null): string
mb_ucfirst converts the first character using Unicode title case.
function mb_lcfirst(string $string, ?string $encoding = null): string
From what I have researched about Unicode, these functions may not behave as expected in some languages. In such cases, please handle the language-specific rules in userland.
For example, in Vietnamese the first letter is not always capitalized:
- ngày Quốc khánh 2-9 (September 2nd National Day)
- tiếng Nhật (Japanese)
Another example: Georgian should use title case rather than upper case. Upper-casing gives:
- mb_strtoupper(“აბგ”) (ani bani gani, U+10D0 U+10D1 U+10D2) -> “ᲐᲑᲒ” (U+1C90 U+1C91 U+1C92)
- mb_strtoupper(“ǉ”) (U+01C9) -> “Ǉ” (U+01C7)
Title-casing the first character gives the correct result:
- mb_ucfirst(“აბგ”) -> “აბგ” (U+10D0 U+10D1 U+10D2)
- mb_ucfirst(“ǉ”) -> “ǈ” (U+01C9 -> U+01C8)
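The difference between upper case and title case for the digraph U+01C9 can be checked with functions that already exist in mbstring. The following is a small sketch, assuming PHP 7.2 or later (for mb_ord) and mbstring case-mapping tables that include these code points:

```php
<?php
$lj = "\u{01C9}"; // LATIN SMALL LETTER LJ

// Full upper case: both halves of the digraph are capitalized.
$upper = mb_strtoupper($lj, 'UTF-8');

// Title case: only the first half of the digraph is capitalized.
$title = mb_convert_case($lj, MB_CASE_TITLE, 'UTF-8');

printf("upper: U+%04X\n", mb_ord($upper, 'UTF-8')); // upper: U+01C7
printf("title: U+%04X\n", mb_ord($title, 'UTF-8')); // title: U+01C8
```

The same comparison can be run on the Georgian string from the list above.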
Backward Incompatible Changes
Userland code that already declares a global function named mb_ucfirst or mb_lcfirst will break, because a function cannot be redeclared.
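One way for existing code to stay compatible across PHP versions is to guard its own definitions with function_exists. This is a minimal sketch, assuming the fallback bodies from the Introduction are acceptable substitutes for the built-ins; it is not the extension's implementation.

```php
<?php
// Define the fallbacks only when the engine does not already provide them.
if (!function_exists('mb_ucfirst')) {
    function mb_ucfirst(string $string, ?string $encoding = null): string
    {
        return mb_convert_case(mb_substr($string, 0, 1, $encoding), MB_CASE_TITLE, $encoding)
            . mb_substr($string, 1, null, $encoding);
    }
}

if (!function_exists('mb_lcfirst')) {
    function mb_lcfirst(string $string, ?string $encoding = null): string
    {
        return mb_strtolower(mb_substr($string, 0, 1, $encoding), $encoding)
            . mb_substr($string, 1, null, $encoding);
    }
}

echo mb_ucfirst("éclair", 'UTF-8'), "\n"; // Éclair
echo mb_lcfirst("Éclair", 'UTF-8'), "\n"; // éclair
```

With the guard in place, the file loads without a redeclaration error whether or not the built-in functions are present.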
Proposed PHP Version(s)
next PHP 8.x
RFC Impact
To SAPIs
Adds the aforementioned functions to all PHP environments.
To Existing Extensions
Adds mb_ucfirst(), mb_lcfirst() to the mbstring extension.
To Opcache
No effect.
New Constants
No new constants.
php.ini Defaults
No changed php.ini settings.
Open Issues
Future Scope
This section details areas where the feature might be improved in future, but that are not currently proposed in this RFC.
Voting
Implementation
Rejected Features
Keep this updated with features that were discussed on the mail lists.