rfc:escaper

Differences

This shows you the differences between two versions of the page.

rfc:escaper [2012/09/19 10:28] padraic   rfc:escaper [2018/06/18 10:11] (current) – This RFC appears to be inactive cmb
Line 2: Line 2:
   * Version: 1.0.1   * Version: 1.0.1
   * Date: 2012-09-18   * Date: 2012-09-18
-  * Author: Pádraic Brady <padraic.brady.at.gmail.com> +  * Author: Pádraic Brady <padraic.brady.at.gmail.com>, Yasuo Ohgaki <yohgaki@php.net>
-  * Status: Under Discussion+  * Status: Inactive
   * First Published at: http://wiki.php.net/rfc/escaper   * First Published at: http://wiki.php.net/rfc/escaper
 +
 +===== Change Log =====
 +  * 2012-09-18 Initial version edited from https://gist.github.com/gists/3066656
 +  * 2013-09-27 Added ext/filter implementation as an option (Yasuo)
  
 ===== Introduction ===== ===== Introduction =====
Line 38: Line 42:
 Including more narrowly defined and specifically targeted functions or SPL class methods into PHP will simplify the whole situation for users, offer a cohesive approach to escaping, rectify PHP's situation as only offering a partial XSS defense and, by its presence in Core, displace function misuse and homegrown escaping functions. Including more narrowly defined and specifically targeted functions or SPL class methods into PHP will simplify the whole situation for users, offer a cohesive approach to escaping, rectify PHP's situation as only offering a partial XSS defense and, by its presence in Core, displace function misuse and homegrown escaping functions.
  
-===== SPL Class or Functions? =====+===== Escape filter for ext/filter =====
  
-While it may well be very feasible and probably advisable to do both, I have a strong preference for classes coming from a framework heavily dependent on them and would suggest a class structure that implements the following interface:+One implementation option is to expose the escapers as ext/filter filters: 
 + 
 +^ ID(Constant) ^ Name ^ Options ^ Description ^ 
 +|FILTER_ESCAPE_HTML |"escape_html" | |Escape HTML document | 
 +|FILTER_ESCAPE_HTML_ATTR |"escape_html_attr" | |Escape HTML tag attribute | 
 +|FILTER_ESCAPE_JAVASCRIPT |"escape_javascript" | |Escape JavaScript string | 
 +|FILTER_ESCAPE_CSS |"escape_css" | |Escape CSS attribute | 
 +|FILTER_ESCAPE_URI |"escape_uri" | |Escape URI parameters | 
 +|FILTER_ESCAPE_XML |"escape_xml" | |Escape XML document. Alias of FILTER_ESCAPE_HTML | 
 +|FILTER_ESCAPE_XML_ATTR |"escape_xml_attr" | |Escape XML tag attribute | 
 + 
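The filters in the table above do not exist in current PHP, so they cannot be shown directly. As a rough sketch of how calling code might look, the existing `FILTER_CALLBACK` filter can stand in for the proposed `FILTER_ESCAPE_HTML`. The function name `escape_html_filter()` is an assumption made for illustration and is not part of the proposal.

```php
<?php
// Hypothetical userland stand-in for the proposed FILTER_ESCAPE_HTML filter.
// FILTER_CALLBACK and htmlspecialchars() are existing PHP features; the name
// escape_html_filter() is invented for this sketch only.
function escape_html_filter(string $value): string
{
    // The encoding is stated explicitly instead of relying on a php.ini default.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// filter_var() passes the input to the callback and returns its result.
$escaped = filter_var(
    '<a href="x">Tom & Jerry</a>',
    FILTER_CALLBACK,
    ['options' => 'escape_html_filter']
);
```

A native filter would presumably replace the callback with a constant such as `FILTER_ESCAPE_HTML`, but the option names it would accept are not specified in the table.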
 + 
 +===== SPL Class ===== 
 + 
 +While it may well be advisable to do both, I have a strong preference for classes, coming from a framework heavily dependent on them, and would suggest a class structure that implements the following interface in addition to any standalone functions:
  
 <code php> <code php>
-    interface Escaper+    interface SPL_Escaper
     {     {
         public function __construct($encoding = 'UTF-8');         public function __construct($encoding = 'UTF-8');
Line 71: Line 89:
  
 The benefits of the class are to allow the centralised setting of a character encoding once and then being able to pass around the object across an entire application or library allowing it to be configured from a single location. This could be created in userland PHP around a set of functions but it seems silly to skip an obviously beneficial step to users. The benefits of the class are to allow the centralised setting of a character encoding once and then being able to pass around the object across an entire application or library allowing it to be configured from a single location. This could be created in userland PHP around a set of functions but it seems silly to skip an obviously beneficial step to users.
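To illustrate the "configure once, pass around" benefit, here is a minimal userland sketch. The class name `SketchEscaper` and the method `escapeHtml()` are assumptions for illustration only. They are not the interface proposed by this RFC, and the HTML escaping is delegated to the existing `htmlspecialchars()`.

```php
<?php
// Illustrative only: SketchEscaper and its method names are assumptions,
// not the interface proposed by the RFC.
final class SketchEscaper
{
    private string $encoding;

    public function __construct(string $encoding = 'UTF-8')
    {
        $this->encoding = $encoding;
    }

    public function getEncoding(): string
    {
        return $this->encoding;
    }

    public function escapeHtml(string $value): string
    {
        // The encoding comes from the object, never from php.ini.
        return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, $this->encoding);
    }
}

// Configure the encoding in one place...
$escaper = new SketchEscaper('UTF-8');

// ...then hand the same object to every template helper or library.
function render_greeting(SketchEscaper $escaper, string $name): string
{
    return '<p>Hello, ' . $escaper->escapeHtml($name) . '</p>';
}

$html = render_greeting($escaper, '<script>alert(1)</script>');
```

Changing the encoding for the whole application then means changing the single constructor call rather than every escaping call site.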
 +
 +===== Functions =====
  
 Functions may then be added along the following lines (names up for discussion): Functions may then be added along the following lines (names up for discussion):
  
   * escape_html($value, $encoding);   * escape_html($value, $encoding);
- 
   * escape_html_attribute($value, $encoding);   * escape_html_attribute($value, $encoding);
- 
   * escape_javascript($value, $encoding);   * escape_javascript($value, $encoding);
- 
   * escape_css($value, $encoding);   * escape_css($value, $encoding);
- 
   * escape_url($value, $encoding);   * escape_url($value, $encoding);
- 
   * escape_xml($value, $encoding);   * escape_xml($value, $encoding);
- 
   * escape_xml_attribute($value, $encoding);   * escape_xml_attribute($value, $encoding);
  
-I am strongly opposed to allowing these functions accept unpredictable character encoding directives via php.ini. That would require additional work to validate which is precisely what this RFC should seek to avoid.+===== Implementation Notes =====
  
-As there is no means of centrally configuring a character encoding allowed in this RFC proposal, the second parameter to these functions is explicitly required and has no default value. This works to undo the common practice in PHP where htmlspecialchars() calls omit all or most of its optional parameters. An application containing anything from thousands to tens of thousands of such function calls is extremely difficult to reconfigure at a later date and abusing the notion that all character encodings are equivalent to UTF-8 for special characters is itself definitely subject to infrequent browser bugs (e.g. IE6 is susceptible to character deletion when UTF-8 strings are escaped to a ISO-8859 encoding).+IMPORTANT: Since proper escaping requires proper character encoding handling, multibyte string support in core is mandatory for the implementation. 
 + 
 +I am strongly opposed to allowing these functions to accept unpredictable character encoding directives via php.ini. That would require additional validation work, which is precisely what this RFC should seek to avoid. By validation, I mean having programmers determine how dependencies implement escaping, what encoding they enforce (usually the default), and then determining whether it can be changed by the depending applications or whether the library must be forked, re-edited, etc. Those who are conscious of security will review dependencies for such issues rather than blindly trust them. 
 + 
 +This RFC proposal allows no means of globally configuring a character encoding, since global configuration promotes unconfigurable-default assumptions (already evidenced by existing htmlspecialchars() usage - [[https://github.com/search?q=htmlspecialchars&repo=&langOverride=&start_value=1&type=Code&language=PHP|search GitHub]]). The second parameter to these functions is therefore explicitly required and has no default value. This works to undo the common practice in PHP where htmlspecialchars() calls omit all or most of its optional parameters. An application containing anything from thousands to tens of thousands of such function calls is extremely difficult to reconfigure at a later date, and relying on the notion that all character encodings are equivalent to UTF-8 for special characters is itself subject to infrequent browser bugs (e.g. IE6 is susceptible to character deletion when UTF-8 strings are escaped to an ISO-8859 encoding).
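A required second parameter can be approximated in userland today. The sketch below uses the hypothetical name `sketch_escape_html()` (not one of the proposed functions) and relies only on the existing `htmlspecialchars()`. It assumes PHP 7.1 or later, where passing too few arguments to a user-defined function throws `ArgumentCountError`.

```php
<?php
// Sketch only: sketch_escape_html() is a hypothetical name. The point is the
// signature: $encoding has no default, so every call site must state it.
function sketch_escape_html(string $value, string $encoding): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, $encoding);
}

// Normal use: the encoding is visible at the call site.
$ok = sketch_escape_html('a < b', 'UTF-8');

// Omitting the encoding is an error rather than a silent default.
try {
    sketch_escape_html('a < b');
    $omitted = 'accepted';
} catch (ArgumentCountError $e) {
    $omitted = 'rejected';
}
```

This is the opposite of the common `htmlspecialchars($value)` habit, where the encoding is decided by whatever default happens to apply.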
  
 I have assumed that the character encodings supported are limited to those presently allowed by htmlspecialchars() and that the internals of each method or function validate this fact or throw an Exception (or an error for function calls) to prevent continued insecure execution as is currently allowed by htmlspecialchars(). See links below. I have assumed that the character encodings supported are limited to those presently allowed by htmlspecialchars() and that the internals of each method or function validate this fact or throw an Exception (or an error for function calls) to prevent continued insecure execution as is currently allowed by htmlspecialchars(). See links below.
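The validate-or-throw behaviour described above might look roughly like the following in userland. The function name and the constant are assumptions for illustration, and the encoding list is only a small subset of the charsets documented for `htmlspecialchars()`, not the full set.

```php
<?php
// Sketch only: names are hypothetical, and the list is an illustrative subset
// of the charsets documented for htmlspecialchars().
const SKETCH_SUPPORTED_ENCODINGS = [
    'utf-8', 'iso-8859-1', 'iso-8859-15', 'cp1252', 'shift_jis', 'euc-jp',
];

function sketch_escape_html_checked(string $value, string $encoding): string
{
    if (!in_array(strtolower($encoding), SKETCH_SUPPORTED_ENCODINGS, true)) {
        // Stop here instead of escaping under a guessed encoding.
        throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }

    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, $encoding);
}

try {
    sketch_escape_html_checked('<b>', 'not-a-charset');
    $result = 'escaped';
} catch (InvalidArgumentException $e) {
    $result = $e->getMessage();
}
```

Failing loudly means a misconfigured encoding is caught at the first call rather than producing output escaped under the wrong assumptions.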
Line 99: Line 117:
 Symfony's Twig also recently added similar escaping options: Symfony's Twig also recently added similar escaping options:
 https://github.com/fabpot/Twig/raw/master/lib/Twig/Extension/Core.php https://github.com/fabpot/Twig/raw/master/lib/Twig/Extension/Core.php
- 
 ===== Class Method Dissection ===== ===== Class Method Dissection =====
  
Line 152: Line 169:
 The essence of this RFC is to propose including basic safe escaping functionality within PHP which addresses the need to apply context-specific escaping in web applications. By offering a simple consistent approach, it affords the opportunity to implement these specifically to target XSS and to omit other functionality that some native functions include, and which can be problematic to programmers or doesn't go far enough. Centralising escaping functionality into one consistent package would, I believe, be one more small step to improving the application of escaping in PHP. The essence of this RFC is to propose including basic safe escaping functionality within PHP which addresses the need to apply context-specific escaping in web applications. By offering a simple consistent approach, it affords the opportunity to implement these specifically to target XSS and to omit other functionality that some native functions include, and which can be problematic to programmers or doesn't go far enough. Centralising escaping functionality into one consistent package would, I believe, be one more small step to improving the application of escaping in PHP.
  
-===== Changelog ===== 
  
-  * 2012-09-18 Initial version edited from https://gist.github.com/gists/3066656 
rfc/escaper.1348050487.txt.gz · Last modified: 2017/09/22 13:28 (external edit)