rfc:on_demand_name_mangling

Differences

This shows you the differences between two versions of the page.

rfc:on_demand_name_mangling [2016/01/04 20:08] – Added Q&A from Rouven Wessling bishop
rfc:on_demand_name_mangling [2019/07/16 12:25] (current) – Settled on formal polyfill name, php_mangle_superglobal bishop
Line 1: Line 1:
 ====== PHP RFC: On-demand Name Mangling ======
-  * Version: 1.2a
+  * Version: 1.4
   * Created Date: 2016-01-01
-  * Updated Date: 2016-01-04
+  * Updated Date: 2019-07-16
-  * Author: Bishop Bettini, bishop@php.net
+  * Author: Bishop Bettini <bishop@php.net>
   * Status: Under Discussion
-  * First Published at: http://wiki.php.net/rfc/remove_name_mangling
+  * First Published at: http://wiki.php.net/rfc/on_demand_name_mangling
 
 ===== Introduction =====
Line 50: Line 50:
 </code>
 
-Mangling has the undesirable consequence that //many// external variables may map to //one// PHP variable. For example, three separate HTML form elements named ''a.b'', ''a_b'' and ''a b'' will all resolve to ''a_b'' in the corresponding super-global, with the last seen value winning. This leads to user confusion and userland work arounds, not to mention bug reports: [[https://bugs.php.net/bug.php?id=34882|#34882]] and [[https://bugs.php.net/bug.php?id=42055|#42055]] for example.
+Mangling has the undesirable consequence that //many// external variables may map to //one// PHP variable. For example, three separate HTML form elements named ''a.b'', ''a_b'' and ''a[b'' will all resolve to ''a_b'' in the corresponding super-global, with the value from ''a[b'' winning (because it was last). This leads to user confusion and userland work arounds, not to mention bug reports: [[https://bugs.php.net/bug.php?id=34882|#34882]] and [[https://bugs.php.net/bug.php?id=42055|#42055]] for example.
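
The many-to-one collapse described in this paragraph can be observed with ''parse_str()'', which applies the engine's variable-registration rules (including name mangling) to a query string. The following is a minimal sketch, assuming an engine version that still mangles automatically; the variable names ''$query'' and ''$parsed'' are illustrative only and do not come from the RFC.

```php
<?php
// Three distinct external names that differ only in characters the engine mangles.
$query = 'a.b=dot&a_b=underscore&a%20b=space';

// parse_str() decodes the query string into $parsed using the same
// registration rules the engine uses when it marshals super-globals.
parse_str($query, $parsed);

// On a mangling engine, a single key survives and the last value seen wins.
var_dump($parsed);
```

On such an engine the dump shows a single ''a_b'' entry, which is the loss of information the paragraph above objects to.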
  
 Automatic name mangling supported ''[[http://php.net/manual/en/ini.core.php#ini.register-globals|register_globals]]'' and its kin like ''[[http://php.net/manual/en/function.import-request-variables.php|import_request_variables()]]'', but those features ended in August 2014. Name mangling isn't required for super-global marshaling, because the associative array nature of super-globals can accommodate any string variable name. So do we need automatic name mangling? Consider this hypothetical new test:
Line 56: Line 56:
 <code>
 --TEST--
-Name mangling logic moved to extract()
+Name mangling logic removed from engine, placed in polyfill
 --GET--
-a.b=dot&a$b=dollar&a%20b=space&a[b=bracket
+a.b=dot&a_b=underscore&a$b=dollar&a%20b=space&a[b=bracket
 --FILE--
 <?php
 print_r(get_defined_vars());
-mangle_superglobals();
+php_mangle_superglobals();
 print_r(get_defined_vars());
 ?>
Line 71: Line 71:
         (
             [a.b] => dot
+            [a_b] => underscore
             [a$b] => dollar
             [a b] => space
Line 80: Line 81:
     [_GET] => Array
         (
-            [a.b] => dot
+            [a_b] => bracket
             [a$b] => dollar
-            [a b] => space
-            [a[b] => bracket
         )
-
-    [a_b] => bracket
 )
 </code>
  
-In this new implementation, the engine no longer mangles marshaled superglobals at startup.  Instead, the //ability// to mangle names has moved to an optional, userland-provided polyfill function ''mangle_superglobals()''. The polyfill algorithm is simple:
+In this new implementation, the engine no longer mangles marshaled superglobals at startup.  Instead, the //ability// to mangle names has moved to an optional, userland-provided polyfill function ''php_mangle_superglobals()''.
+
+In the example above, an ''a_b'' key was externally supplied. The call to ''php_mangle_superglobals'' clobbered the original value of ''a_b'' with the value of the //last// seen mangle-equivalent key (''a[b'').
+
+Importantly, the user made this mangling happen: the engine did not do it automatically.
+
+The polyfill algorithm is simple:
  
   * find all superglobal keys that violate the PHP unquoted variable name regex ((Unquoted variable names must match the regex ''[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*''))
-  * for each, create a new mangled key linked to the corresponding value.
+  * for each, create a new mangled key linked to the corresponding value
 
 Applications requiring name mangling may call the polyfill during their bootstrap phase to emulate prior engine behavior.
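
The two steps listed above can be sketched against an ordinary array rather than the real super-globals. This is only an illustration: the function name ''sketch_mangle_keys'' is hypothetical and not part of the RFC or any released package. It copies values instead of linking them by reference, and it keeps the original keys alongside the mangled ones, which is an assumption the list above does not settle.

```php
<?php
// Hypothetical helper (not from the RFC): applies the two-step algorithm to
// any array and returns the result, leaving the input untouched.
function sketch_mangle_keys(array $input): array
{
    // Anchored form of the unquoted variable name regex quoted in the footnote.
    $valid = '/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/';

    $output = $input; // assumption: original keys are preserved
    foreach ($input as $key => $value) {
        $key = (string) $key; // numeric-looking keys arrive as integers

        // Step 1: skip keys that are already valid unquoted names.
        if (preg_match($valid, $key) === 1) {
            continue;
        }

        // Step 2: build the mangled key and point it at the same value.
        $mangled = preg_replace('/[^a-zA-Z0-9_\x7f-\xff]/', '_', $key);
        $mangled = preg_replace('/^[0-9]/', '_', $mangled);
        $output[$mangled] = $value; // a later collision overwrites an earlier one
    }

    return $output;
}

$result = sketch_mangle_keys(['a.b' => 'dot', 'a_b' => 'underscore', 'a b' => 'space']);
print_r($result);
```

Because keys are visited in order, the value of the last mangle-equivalent key (''a b'' here) ends up under ''a_b'', replacing the externally supplied ''underscore'' value.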
  
 ===== Proposal =====
-This RFC proposes to phase out automatic name mangling, replacing it with on-demand mangling in ''extract()''
+This RFC proposes to remove automatic name mangling, with backward compatibility maintained through a userspace polyfill function that mangles super-globals on-demand:
 
-  * Next minor release (currently 7.1)
+  * Upon acceptance
-    * Emit an ''E_DEPRECATED'' warning the first time a variable is mangled. The warning indicates that name mangling on import will be removed in the next major PHP version.
+    * Update documentation that name mangling is deprecated and will be removed in 8.
+    * Release a userland polyfill that implements the historic mangling behavior
+    * Polyfill shall be available via composer (but not PEAR)
   * Next major release (currently 8.0):
-    * Remove all name mangling code in super-global marshalling functions
+    * Remove all name mangling code in super-global marshaling functions
-    * Release a userland polyfill implementing ''mangle_superglobals'' that is available as a composable package
 
 ==== Discussion ====
Line 110: Line 114:
 These questions were raised in the mailing list discussion.
  
-=== Should multiple ''E_DEPRECATED'' be emitted? ===
+=== Should a notice be raised if the engine mangles a superglobal? ===
 
-No, because we do not know how many instances of mangling may be present and we do not want to flood application logs.  The proposed single message intends to provide //some// warning to application developers when there is //known// use of name mangling. This notice is similar to the behavior of the ''datetime.timezone'' INI option: at most once per startup.
+Before version 1.3, this RFC proposed raising an ''E_DEPRECATED'' message (once per startup) when the engine mangled a name, so that developers were made aware of future changes. However, Rouven Weßling asked:
 
-=== How can I disable the ''E_DEPRECATED'' notice? ===
+> If I have a well behaved application that doesn’t rely on name mangling or have included the polyfill, how can I prevent a log message from being emitted when a user appends (unused) parameters to the query string that require mangling?
 
-Rouven Weßling asked:
+and Nikita Popov commented:
 
-> If I have a well behaved application that doesn’t rely on name mangling or have included the polyfill, how can I prevent a log message from being emitted when a user appends (unused) parameters to the query string that require mangling?
+> Even if it's only a single deprecation warning instead of multiple, it's still a deprecation warning that I, as the application author, have absolutely no control over. For me, a deprecation warning indicates that there is some code I must change to make that warning *go away*.
-
+> Sure, it's informative. But it's enough to be informative about this *once*, rather than every time a user makes an odd-ish request.
-As written, one can't: the engine emits the error as soon as it mangles a variable, at most one time per startup. While that's annoying, it's also informative: someone's hitting your application in a way that may not expect.
 
-This behavior is similar to [[http://php.net/manual/en/info.configuration.php#ini.max-input-vars|max_input_vars]], which emits a warning as soon as more variables are sent than PHP is configured to accept. This message happens at most once per startup. There's no way to turn the message off: the best one can do is increase the upper limit in the configuration file.
+Given that (a) an application could get spammed by malicious users((The ''max_input_vars'' configuration option behaves similarly with the once-per-startup deprecation message proposed prior to version 1.3. The difference is the ''max_input_vars'' message could be squelched by increasing the limit, whereas the proposed mangling message could never be squelched by user code)), and (b) that documentation suffices to notify users of this change, then the RFC changed as of 1.3 to only document the removal of name mangling as of the next major version.
  
 === Should an INI configuration control mangling? ===
Line 132: Line 135:
 An INI setting to disable mangling must be engine-wide (e.g., ''PHP_INI_SYSTEM'' or ''PHP_INI_PERDIR'') as its historical effect occurs before userland code runs. Engine-wide settings are tricky because they force conditions across all instances of PHP running in a given SAPI process.  In a hosted environment where many unrelated sites share the same engine configuration, it's possible that one site might require mangling while another site requires no-mangling.  These two sites could not co-exist unless the site operator allows per directory configuration, which they may not. Thus, an INI setting would introduce operational problems for some definable sub-set of users.
 
-It's still possible to provide an "escape hatch" for applications requiring name mangling: the polyfill described eariler. Applications need only include the polyfill code and add it to their bootstrapping. The polyfill would be available via Composer, and the polyfill would populate all the mangled variables as before.
+It's still possible to provide an "escape hatch" for applications requiring name mangling: the polyfill described earlier. Applications need only include the polyfill code and add it to their bootstrapping. The polyfill would be available via Composer, and the polyfill would populate all the mangled variables as before.
 
 The polyfill approach is considered superior to the INI approach for three reasons:
Line 142: Line 145:
 === Should ''extract()'' automatically mangle names? ===
 
-Early versions of this proposal (< v1.2) proposed using extract to mangle names. Rowan Collins and others pointed out this was an unnecessary complication: ''preg_match'' could also accomplish the goal.  Thus, all references to ''extract'' in this RFC have been removed.
+Early versions of this proposal (< v1.2) proposed using ''extract'' to mangle names. Rowan Collins and others pointed out this was an unnecessary complication: ''preg_match'' could also accomplish the goal.  Thus, all references to ''extract'' in this RFC have been removed.
 
 However, ''extract()'' should have the option to emit mangled names with a new constant (''EXTR_MANGLE'').  ''extract()'' should also be fixed to export variables with any variable name, because they are all technically valid with the quoted variable syntax (''${'foo.bar'}'').  These will be handled as function fixes and not with this RFC.
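
The contrast drawn above, between what the quoted variable syntax accepts and what ''extract()'' currently exports, can be seen in a short sketch. The function names ''quoted_name_demo'' and ''extract_demo'' are hypothetical and exist only for this illustration; the ''extract()'' behavior shown is the current one, not the proposed fix.

```php
<?php
// Hypothetical demo: a name containing a dot is usable through the quoted
// variable syntax, even though it is not a valid unquoted name.
function quoted_name_demo(): string
{
    ${'foo.bar'} = 'assigned';
    return ${'foo.bar'};
}

// Hypothetical demo: extract() as it behaves today skips keys that are not
// valid unquoted variable names and reports how many it actually imported.
function extract_demo(): array
{
    $imported = extract(['foo.bar' => 'skipped', 'plain' => 'kept']);
    return [$imported, isset($plain)];
}

var_dump(quoted_name_demo());
var_dump(extract_demo());
```

Only ''plain'' is imported in the second function, which is the gap the suggested ''extract()'' fix would close.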
Line 152: Line 155:
  
 <code php>
-function mangle_name($name) {
+function php_mangle_name($name) {
     $name = preg_replace('/[^a-zA-Z0-9_\x7f-\xff]/', '_', $name);
     return preg_replace('/^[0-9]/', '_', $name);
 }
-function mangle_superglobals() {
+function php_mangle_superglobals() {
     if (version_compare(PHP_VERSION, '8.0.0', '<')) {
         return;
     }
     foreach ($_ENV as $var => &$val) {
-        $mangled = mangle_name($var);
+        $mangled = php_mangle_name($var);
         if ($mangled !== $var) {
             $_ENV[$mangled] =& $val;
Line 175: Line 178:
 <code>
 $ composer require php/mangle-superglobals ^1.0
-$ cat app/boostrap.php
+$ cat app/bootstrap.php
 <?php
 require __DIR__ . '/vendor/autoload.php';
 
-mangle_superglobals();
+php_mangle_superglobals();
 
 // ...
Line 185: Line 188:
  
 ===== Proposed PHP Version(s) =====
-PHP 7.1 (for notice of impending BC break) and PHP 8.0 (for actual implementation and corresponding BC break).
+PHP 8.0.
 
 ===== RFC Impact =====
Line 204: Line 207:
  
 ===== Open Issues =====
-None so far.
+None.
 
 ===== Proposed Voting Choices =====
-A simple yes/no voting option with a 2/3 majority required.
+A simple yes/no voting option with a 2/3 majority required: "Remove name mangling in PHP 8.0?"
 
 ===== Patches and Tests =====
rfc/on_demand_name_mangling.1451938139.txt.gz · Last modified: 2017/09/22 13:28 (external edit)