rfc:default_encoding

rfc:default_encoding [2013/06/29 11:27]
yohgaki [Proposal]
rfc:default_encoding [2014/02/15 01:50] (current)
yohgaki
Line 1: Line 1:
 ====== Request for Comments: Use default_charset As Default Character Encoding ======
-  * Version: 0.2
+  * Version: 1.0.1
-  * Date: 2013-06-29
+  * Date: 2014-01-01
   * Author: Yasuo Ohgaki <yohgaki@ohgaki.net>
-  * Status: Under Discussion
+  * Status: Implemented
   * First Published at: http://wiki.php.net/rfc/default_encoding
  
Line 13: Line 13:
  
 php.ini and many functions have encoding settings that users simply ignore and leave alone. However, secure programs must handle character encoding properly.
 +
 +The objectives of this proposal are:
 +
 +  - Setting the charset in the HTTP header has been recommended since the first XSS advisory, issued by CERT and Microsoft in February 2000. (Better security)
 +  - There are too many encoding settings, and it is better to consolidate them.
 +  - If yet another multibyte string module is added in the future, it can use the new common ini settings. (No more module-specific INIs)
  
 ===== Proposal =====
Line 25: Line 31:
 Use **default_charset as default** for encoding-related php.ini settings and modules/functions.
  
-Not tuoched
+Not touched
   * zend.script_encoding
  
-PHP 5.5 and master, introduce new php.ini settting. Old iconv.*/mbstring.* php.ini parameters will be removed for master.
+PHP 5.6 and master introduce new php.ini settings. The old iconv.*/mbstring.* php.ini parameters will be removed in <del>master</del> PHP6. Use of the iconv.*/mbstring.* php.ini parameters raises E_DEPRECATED in 5.6 and up.
   * php.input_encoding
   * php.internal_encoding
   * php.output_encoding
-
-PHP 5.5
   * iconv.input_encoding (Default: php.input_encoding)
   * iconv.internal_encoding (Default: php.internal_encoding)
   * iconv.output_encoding (Default: php.output_encoding)
-  * mbstring.http_input (Default: php.output_encoding)
+  * mbstring.http_input (Default: php.input_encoding)
   * mbstring.internal_encoding (Default: php.internal_encoding)
   * mbstring.http_output (Default: php.output_encoding)
-  * all functions that take encoding option use php.internal_encording as default (e.g. htmlentities/mb_strlen/mb_regex/etc)
+  * all functions that take encoding option use php.internal_encoding as default (e.g. htmlentities/mb_strlen/mb_regex/etc)
 
-PHP 5.4
+PHP 5.5
   * leave as it is now
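
The defaults listed above could look like this in php.ini. This is only a sketch: it uses the directive names exactly as this RFC writes them, and the values are illustrative assumptions. Released PHP versions may spell the names differently (PHP 5.6 documents them without the php. prefix), so check the php.ini reference for the version in use.

```ini
; Sketch only: directive names follow this RFC's wording.
default_charset = "UTF-8"

; Left empty, these are assumed to fall back to default_charset.
php.input_encoding =
php.internal_encoding =
php.output_encoding =

; Module-specific settings are left commented out so that they follow
; the php.* values above. Per this RFC, setting them explicitly raises
; E_DEPRECATED in 5.6 and up.
;iconv.internal_encoding =
;mbstring.internal_encoding =
```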
  
Line 62: Line 66:
   * mbstring has an API to check whether an encoding is supported.
   * users are responsible for setting a proper encoding name, e.g. mbstring has SJIS-win, but iconv only has SJIS
 +  * if encoding names conflict, users should set the module-specific ini to a valid (non-empty) value
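
The mbstring check mentioned in the list above can be sketched as follows. The helper name is_mb_encoding_supported() is hypothetical and not part of mbstring. It compares only against the canonical names returned by mb_list_encodings(), so aliases that mbstring also accepts are not considered.

```php
<?php
// Hypothetical helper, not part of mbstring: reports whether $name matches
// one of the canonical encoding names that mbstring lists.
// Aliases are deliberately ignored in this sketch.
function is_mb_encoding_supported($name)
{
    // Encoding names are compared case-insensitively.
    $known = array_map('strtolower', mb_list_encodings());
    return in_array(strtolower($name), $known, true);
}

// Example: decide whether a module-specific ini value is needed.
if (!is_mb_encoding_supported('SJIS-win')) {
    echo "SJIS-win is not available in this mbstring build\n";
}
```

A name that passes this check may still be unknown to iconv, which is why the module-specific ini settings remain available.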
 +
 +
 +==== Use cases ====
 +
 +It simplifies i18n applications.
 +
 +It unifies the *.output_encoding/*.internal_encoding/*.input_encoding settings.
 +
 +Users may check default_charset to see whether encoding conversion is needed. For example, pcre/sqlite only support UTF-8, so users may check and convert the encoding as follows.
 +
 +  if (ini_get('default_charset') !== 'UTF-8') {
 +     $str = mb_convert_encoding($str, 'UTF-8');
 +  }
 +  // preg, sqlite function calls here.
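
A slightly fuller version of the check above, as a minimal sketch. The helper name to_utf8() is hypothetical, and the ini_set() call only simulates an application configured for ISO-8859-1. Unlike the snippet above, it passes the source encoding to mb_convert_encoding() explicitly, so it does not depend on mbstring's internal encoding setting. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: converts $str to UTF-8 when default_charset names
// a different encoding.
function to_utf8($str)
{
    $charset = ini_get('default_charset');
    // An empty or missing default_charset gives no source encoding to
    // convert from, so the string is returned unchanged.
    if ($charset === false || $charset === '' || strcasecmp($charset, 'UTF-8') === 0) {
        return $str;
    }
    return mb_convert_encoding($str, 'UTF-8', $charset);
}

// Assumption for illustration: the application runs with ISO-8859-1.
ini_set('default_charset', 'ISO-8859-1');
$utf8 = to_utf8("caf\xE9");               // "café" encoded as ISO-8859-1

// The converted string is now safe to pass to a UTF-8 mode pattern.
var_dump(preg_match('/^\pL+$/u', $utf8));
```

The case-insensitive comparison is a design choice: charset names such as utf-8 and UTF-8 refer to the same encoding, while the strict !== check in the snippet above treats them as different.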
 +
 +
 +==== Other related issues ====
 +
 +escapeshellcmd/escapeshellarg/fgetcsv and similar functions use locale-based MBCS support via php_mblen(). These functions are outside the scope of this RFC.
 +
 +Database character encoding is also outside the scope of this RFC.
 +
 +
 +==== BC issues ====
 +
 +None for users who already use UTF-8 as their encoding.
 +
 +Other users may have to change the "default_charset" php.ini setting (leave it empty or set it to the desired encoding).
  
 ===== Patch =====
Line 67: Line 100:
 PoC Patch
 https://github.com/yohgaki/php-src/compare/master-unified-encoding-setting
 +===== Vote Options =====
 +
 +  * Yes
 +  * No
 +
 +
 ===== Vote =====
 
-Not yet
+Vote start: 2013/12/20 01:00 UTC
 +
 +Vote end: 2014/01/10 01:00 UTC
 +
 +<doodle title="Default Character Encoding" auth="yohgaki" voteType="single" closed="true">
 +   * Yes
 +   * No
 +</doodle>
  
 ===== References =====
 
   * Internals discussion - http://www.serverphorums.com/read.php?7,552099,552110
 +
 +
 +===== Implementation =====
 +
 +  * http://git.php.net/?p=php-src.git;a=commit;h=cbd108abf19d9fb9ae1d4ccd153215f56a2763e8
 +  * http://svn.php.net/viewvc?view=revision&revision=332850
 +  * http://svn.php.net/viewvc?view=revision&revision=332851
 +  * http://svn.php.net/viewvc?view=revision&revision=332852
 +  * http://svn.php.net/viewvc?view=revision&revision=332857
  
 ===== Changelog =====
 
 +  * 2014-01-01 Revised unneeded php.ini removal process.
 +  * 2013-12-17 Added use case and related issues.
 +  * 2013-10-31 Added objectives.
 +  * 2013-10-29 Update target PHP version to 5.6.0.
   * 2013-06-29 Add PoC patch and update RFC, since PHP 5.5 has been released.
   * 2012-08-31 Initial version. (yohgaki)
rfc/default_encoding.txt · Last modified: 2014/02/15 01:50 by yohgaki