====== PHP RFC: IntlChar class ====== * Version: 1.0 * Date: 2014-11-24 * Author: Sara Golemon, pollita@php.net * Status: Accepted * First Published at: http://wiki.php.net/rfc/intl.char ===== Introduction ===== ICU exposes a great deal of i18n/l10n functionality beyond what is currently exposed by PHP. This RFC seeks to expose just a little bit more... ===== Proposal ===== Expose additional ICU functionality from [[http://icu-project.org/apiref/icu4c/uchar_8h.html|uchar.h]] as IntlChar::*() following the ICU API as much as possible. See hphp/runtime/ext/icu/ext_icu_uchar.php in [[https://reviews.facebook.net/D30573]] for a full breakdown of the functions (complete with docblocks). Constants can be found in either PR's *-enum.h files. ===== Proposed PHP Version(s) ===== PHP 7 (or 5.next if there is one) ==== New Constants ==== Enumerations of UProperty, UCharNameChoice, UPropertyNameChoice, UCharDirection, UBlockCode, etc... For example: class IntlChar { const PROPERTY_ALPHABETIC = _UCHAR_ALPHABETIC_; const PROPERTY_ASCII_HEX_DIGIT = _UCHAR_ASCII_HEX_DIGIT_; /* etc... */ } ==== New Static Methods ==== Mapping of ICU API to PHP. For example: class IntlChar { static public function hasBinaryProperty(int $codepoint, int $property): bool; static public function isAlphabetic(int $codepoint): bool; /* etc... */ } Note that properties taking a codepoint will accept either an integer codepoint value (e.g. 0x2603 for U+2603 SNOWMAN), or the character encoded as UTF-8 (e.g. "\xE2\x98\x83"). For methods which return a codepoint, they will return int unless they accepted a codepoint as a utf-8 string, in which case they remain utf-8. ===== Notes ===== I also added IntlChar::chr() and IntlChar::ord() which aren't directly part of the API, but they made sense as wrappers for the U8_*() family of macros. Some methods take a range in the form ($start, $limit) which the range is INclusive of $start, and EXclusive of $limit. i.e. (0x20, 0x30) => 0x20..0x2F. I kept this meaning for $limit to stay consistent with the ICU API, but changing $limit to have the semantics of $end would probably make more sense in PHP. ===== Vote ===== * Yes * No * Voting opened: 2014-12-26 02:20 UTC * Voting closed: 2015-01-16 17:20 UTC (3 week voting period) ===== Implementation ===== * PHP7: [[https://github.com/sgolemon/php-src/compare/intl.uchar|github/sgolemon/php-src/intl.uchar]] * HHVM [[https://reviews.facebook.net/D30573]]