====== Unicode Support ====== 
Author: Pierre Joye 
Status: Under discussion 
Unicode support remains one of the most requested features in PHP.
However, as Rasmus and others have stated before, it is not a trivial job.
Some of the key points we need to take care of are:
* UTF-8 storage 
* UTF-8 support for most (if not all) existing string APIs
* Performance 
So far, I have not found any library that covers at least two of these
key points.
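To illustrate why the existing string APIs need UTF-8 aware counterparts, here is a minimal sketch in C. The helper name utf8_codepoint_count is hypothetical (it is not an existing PHP, ICU or utf8proc function), and it assumes the input is already valid UTF-8. It only shows that a byte length and a code point count are different things.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: counts the code points in a
 * UTF-8 buffer by skipping continuation bytes (bit pattern 10xxxxxx).
 * Assumption: the input is valid UTF-8. No validation is performed.
 */
static size_t utf8_codepoint_count(const char *s, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        /* Every byte that is not a continuation byte starts a code point. */
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For the string "h\xC3\xA9llo" (the letter U+00E9 takes two bytes in UTF-8), a byte-oriented length is 6 while the helper above returns 5. Every string function that works with offsets or lengths has the same mismatch to deal with, and a linear scan like this one hints at the performance concern.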
Please keep in mind that I am by no means a Unicode expert, and this
summary is what I gathered by reading the documentation and discussion
archives of ICU and other projects. Experiments still have to be
done. However, I would rather discuss the options before going wild
with an implementation (a huge task, even for basic feature coverage).
If any of the following statements is wrong or inaccurate, please fix
it. I will keep a dedicated wiki page to summarize the discussions and
options about Unicode support.
====== ICU ====== 
U_CHARSET_IS_UTF8 forces ICU to use UTF-8 as its default charset. It is
an ICU compile-time setting, so it cannot be set at PHP configure
time. This means that users would have to create their own ICU
build. Alternatively, we could bundle ICU, but this would be awkward and
a maintenance nightmare for both PHP and the distros.
Alternatively, UText can be used to create UTF-8 strings. The APIs
accepting UText allow almost everything we need. The drawback, however,
is that a UTF-8 UText is read-only. Any operation altering its content
will require duplication, clones or conversions. That may kill all the
gains we get from using UTF-8 only.
U_CHARSET_IS_UTF8 is very appealing, but having to bundle ICU is
actually a show stopper. Asking users to custom-build ICU is not an
option either. I do not know whether the distros would be ready to
provide two different builds of ICU; doing so may cause a lot of issues
for all projects using ICU.
====== utf8proc ======
utf8proc is very attractive, small and relatively fast. I see it as a
good starting point. However, its features cover only a very small part
of what PHP needs. It is easy to bundle but would require a fork and a
lot of work to add all the missing features.
====== librope ====== 
Same comments as for utf8proc, with even fewer features.
I would like to start discussing our options now. I am not
asking to go into all the implementation details from a userland point
of view (like u"some text", or whether to add new APIs) but only to see
what we can do internally to work with UTF-8 strings.
