PHP RFC: IntlChar class
- Version: 1.0
- Date: 2014-11-24
- Author: Sara Golemon, pollita@php.net
- Status: Accepted
- First Published at: http://wiki.php.net/rfc/intl.char
Introduction
ICU exposes a great deal of i18n/l10n functionality beyond what is currently exposed by PHP. This RFC seeks to expose just a little bit more...
Proposal
Expose additional ICU functionality from uchar.h as IntlChar::*() following the ICU API as much as possible.
See hphp/runtime/ext/icu/ext_icu_uchar.php in https://reviews.facebook.net/D30573 for a full breakdown of the functions (complete with docblocks). Constants can be found in either PR's *-enum.h files.
Proposed PHP Version(s)
PHP 7 (or 5.next if there is one)
New Constants
Enumerations of UProperty, UCharNameChoice, UPropertyNameChoice, UCharDirection, UBlockCode, etc... For example:
class IntlChar { const PROPERTY_ALPHABETIC = _UCHAR_ALPHABETIC_; const PROPERTY_ASCII_HEX_DIGIT = _UCHAR_ASCII_HEX_DIGIT_; /* etc... */ }
New Static Methods
Mapping of ICU API to PHP. For example:
class IntlChar { static public function hasBinaryProperty(int $codepoint, int $property): bool; static public function isAlphabetic(int $codepoint): bool; /* etc... */ }
Note that properties taking a codepoint will accept either an integer codepoint value (e.g. 0x2603 for U+2603 SNOWMAN), or the character encoded as UTF-8 (e.g. “\xE2\x98\x83”). For methods which return a codepoint, they will return int unless they accepted a codepoint as a utf-8 string, in which case they remain utf-8.
Notes
I also added IntlChar::chr() and IntlChar::ord() which aren't directly part of the API, but they made sense as wrappers for the U8_*() family of macros.
Some methods take a range in the form ($start, $limit) which the range is INclusive of $start, and EXclusive of $limit. i.e. (0x20, 0x30) => 0x20..0x2F. I kept this meaning for $limit to stay consistent with the ICU API, but changing $limit to have the semantics of $end would probably make more sense in PHP.
Vote
- Voting opened: 2014-12-26 02:20 UTC
- Voting closed: 2015-01-16 17:20 UTC (3 week voting period)