1) extract() should be able to extract non-conformant keys anyway, because they're accessible with the ${'foo.bar'} syntax. That, however, is out of scope for this RFC.
PHP marshals external key-value pairs into super-globals by mangling some disallowed characters to underscores:
# the shell environment variable "a.b" becomes "a_b" inside $_ENV
$ /usr/bin/env "a.b=foo" php -d variables_order=E -r 'echo $_ENV["a_b"];'
foo

# a "[" also mangles to an underscore
$ /usr/bin/env "a[b=foo" php -d variables_order=E -r 'echo $_ENV["a_b"];'
foo

# same mangling rules for $_REQUEST
# curiously "$" does not mangle, even though it's not a valid PHP variable name
$ cat mangle.phpt
--TEST--
How does $_REQUEST handle HTML form variables with unusual names?
--GET--
a.b=dot&a$b=dollar&a%20b=space&a[b=bracket
--FILE--
<?php
print_r($_GET);
?>
--EXPECTF--
Array
(
    [a_b] => bracket
    [a$b] => dollar
)

$ pear run-tests --cgi=/usr/bin/php-cgi mangle.phpt
Running 1 tests
PASS How does $_REQUEST handle HTML form variables with unusual names?[mangle.phpt]
TOTAL TIME: 00:00
1 PASSED TESTS
0 SKIPPED TESTS
Mangling has the undesirable consequence that many external variables may map to one PHP variable. For example, three separate HTML form elements named a.b, a_b and a b will all resolve to a_b in the corresponding super-global, with the last seen value winning. This leads to user confusion and userland workarounds, not to mention bug reports: #34882 and #42055 for example.
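The collision can be reproduced without a web server. The sketch below relies on parse_str(), which the PHP manual documents as converting dots and spaces in names to underscores in the same way; the query string and the $fields variable are illustrative only.

```php
<?php
// parse_str() applies the same dot/space-to-underscore conversion
// that PHP uses when filling the request super-globals.
parse_str('a.b=dot&a_b=underscore&a%20b=space', $fields);

// All three names collapse into the single key "a_b";
// only the last value seen survives.
print_r($fields);
```

The two earlier values are silently discarded, which is the behavior the bug reports describe.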
Automatic name mangling supported register_globals and import_request_variables(), but those features ended in August 2014. Name mangling isn't required for super-global marshaling, because the associative array nature of super-globals can accommodate any string variable name. So do we need automatic name mangling? Consider this hypothetical new test:
--TEST--
Name mangling logic moved to extract()
--GET--
a.b=dot&a$b=dollar&a%20b=space&a[b=bracket
--FILE--
<?php
extract($_GET, EXTR_MANGLE);
print_r(get_defined_vars());
?>
--EXPECTF--
Array
(
    [_GET] => Array
        (
            [a.b] => dot
            [a$b] => dollar
            [a b] => space
            [a[b] => bracket
        )

    [a_b] => bracket
)
In this new implementation, marshaled superglobals are no longer mangled. Instead, the ability to mangle names has moved to extract(). This has the happy side effect of fixing extract() bug reports like #70344. 1)
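To make the idea concrete, here is a minimal userland sketch of what on-demand mangling might compute. Both mangle_name() and mangled_vars() are hypothetical helpers, not part of PHP, and the replacement rule is an assumption: it treats any byte that does not fit the variable-name pattern from the PHP manual as a candidate for an underscore.

```php
<?php
// Hypothetical helpers illustrating on-demand mangling; neither exists in PHP.

// Replace every byte that is not valid at its position in a variable name.
// Assumption: validity follows the manual's variable-name pattern,
// a letter, underscore or high byte first, then those plus digits.
function mangle_name(string $name): string
{
    return preg_replace(
        '/^[^a-zA-Z_\x7f-\xff]|[^a-zA-Z0-9_\x7f-\xff]/',
        '_',
        $name
    );
}

// Return the variables a mangle-aware extract() might define.
// When several names mangle to the same key, the last value wins.
function mangled_vars(array $source): array
{
    $vars = [];
    foreach ($source as $name => $value) {
        $vars[mangle_name((string) $name)] = $value;
    }
    return $vars;
}

$get = ['a.b' => 'dot', 'a$b' => 'dollar', 'a b' => 'space', 'a[b' => 'bracket'];
print_r(mangled_vars($get));
```

With this input every name mangles to a_b and the final value, bracket, is kept, which matches the expectation in the hypothetical test. The source array itself is left untouched, so the original names stay available.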
This RFC proposes to phase out automatic name mangling, replacing it with on-demand mangling in extract():

- Emit an E_DEPRECATED warning the first time a variable is mangled. The warning indicates that name mangling on import will be removed in the next major PHP version.
- Update extract() to mangle names, subject to the following additional rules:
  - Add EXTR_MANGLE, which converts any character outside the documented variable-name regex 2) to an _
  - When combined with the EXTR_PREFIX_* constants, prepend that prefix to any resulting mangled name
  - Honor EXTR_OVERWRITE and EXTR_SKIP using the mangled name as the check

These questions were raised in the mailing list discussion.
No, because we do not know how many instances of mangling may be present and we do not want to flood application logs.
The message intends to provide some warning to application developers when there is known use of name mangling. As such, a single warning when the mangler runs is sufficient to meet this intent.
Nikita Popov suggested:
I would favor the introduction of a new ini setting. E.g. mangle_names=0 disables name mangling, while mangle_names=1 throws a deprecation warning on startup and enables name mangling. mangle_names=0 should be the default. That is essentially disable name mangling, but leave an escape hatch for those people who rely on it (for whatever reason).
An INI setting to disable mangling must be engine-wide (i.e., PHP_INI_SYSTEM) as its historical effect occurs before userland code runs. Engine-wide settings are tricky because they force conditions across all instances of PHP running in a given SAPI process. In a hosted environment where many unrelated sites share the same engine configuration, it's possible that one site might require mangling while another site requires no-mangling. These two sites could not co-exist. Thus, an INI setting would introduce operational problems for users.
However, there is an “escape hatch”: userland code can emulate engine super-global mangling using the mangle-aware extract(). An implementation is given in the Backward Compatibility section.
No, because this would introduce new, unnecessary BC breakage. Instead, extract() should have the option to emit mangled names.
This proposal introduces backward incompatible changes: any userland code relying on mangled names would have to either (a) change to using original variable names or (b) re-mangle the super-globals with a polyfill. The latter case could be accomplished with code like:
$mangler = function () {
    // mangle names like before
    extract($_ENV, EXTR_MANGLE);

    // push them into env
    foreach (get_defined_vars() as $var => $val) {
        if (! array_key_exists($var, $_ENV)) {
            $_ENV[$var] = $val;
        }
    }
};
$mangler();
Similar algorithms could be applied to the other super-globals.
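As a rough illustration of how that might generalize, the sketch below adds mangled aliases to any array without depending on EXTR_MANGLE, so it runs on current PHP. The remangle() function is hypothetical, and its replacement rule is an assumption that approximates the proposed behavior rather than reproducing the engine's historical rules exactly.

```php
<?php
// Hypothetical helper, not part of PHP or of any published polyfill.
// Returns $source plus a mangled alias for each key whose mangled
// form is not already present.
function remangle(array $source): array
{
    $aliases = [];
    foreach ($source as $name => $value) {
        // Assumption: replace any byte that is not valid in a variable name.
        $key = preg_replace(
            '/^[^a-zA-Z_\x7f-\xff]|[^a-zA-Z0-9_\x7f-\xff]/',
            '_',
            (string) $name
        );
        // Later entries overwrite earlier ones, so the last value wins.
        $aliases[$key] = $value;
    }

    // Array union keeps every key already in $source and only
    // appends aliases that are missing.
    return $source + $aliases;
}

// Possible usage: restore mangled names in several super-globals.
$_GET    = remangle($_GET);
$_POST   = remangle($_POST);
$_COOKIE = remangle($_COOKIE);
```

This only handles top-level keys; nested arrays created by bracket syntax would need separate treatment.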
To reduce the burden on userland, a polyfill library could be made available to simplify this:
$ composer require php/mangle-polyfill ^1.0
$ cat example.php
<?php
mangle_superglobals();
PHP 7.1 (for notice of impending BC break) and PHP 8.0 (for actual implementation and corresponding BC break).
No impact.
No impact.
No impact.
None.
None.
None so far.
A simple yes/no voting option with a 2/3 majority required.
None yet. Implementations will follow vote.
TODO: After the project is implemented, this section should contain
None so far.