PHP RFC: On-demand Name Mangling

Introduction

When PHP marshals external key-value pairs into super-globals, it mangles certain disallowed characters in their names to underscores:

# the shell environment variable "a.b" becomes "a_b" inside $_ENV
$ /usr/bin/env "a.b=foo" php -d variables_order=E -r 'echo $_ENV["a_b"];'
foo

# a "[" also mangles to an underscore
$ /usr/bin/env "a[b=foo" php -d variables_order=E -r 'echo $_ENV["a_b"];'
foo

# same mangling rules for $_REQUEST
# curiously "$" does not mangle, even though it is not valid inside a PHP variable name
$ cat mangle.phpt
--TEST--
How does $_REQUEST handle HTML form variables with unusual names?
--GET--
a.b=dot&a$b=dollar&a%20b=space&a[b=bracket
--FILE--
<?php
print_r($_GET);
?>
--EXPECTF--
Array
(
    [a_b] => bracket
    [a$b] => dollar
)
$ pear run-tests --cgi=/usr/bin/php-cgi mangle.phpt
Running 1 tests
PASS How does $_REQUEST handle HTML form variables with unusual names?[mangle.phpt]
TOTAL TIME: 00:00
1 PASSED TESTS
0 SKIPPED TESTS

Mangling has the undesirable consequence that several external variables may map to one PHP variable. For example, three separate HTML form elements named a.b, a_b and a b all resolve to a_b in the corresponding super-global, with the last value seen winning. This leads to user confusion and userland workarounds, not to mention bug reports: #34882 and #42055, for example.
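
One kind of userland workaround is to parse the raw query string directly, so the original names survive. The sketch below is a minimal illustration, not part of this RFC: parse_query_unmangled() is a hypothetical helper, it does not expand array syntax such as a[b][c], and it returns name/value pairs rather than an associative array so that nothing is overwritten.

```php
<?php
/**
 * Parse a raw query string without PHP's import-time name mangling.
 * Hypothetical helper: array syntax such as "a[b][c]" is NOT expanded.
 * Returns a list of [name, value] pairs so that no entry is overwritten.
 */
function parse_query_unmangled(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $part) {
        if ($part === '') {
            continue; // skip empty segments such as "a=1&&b=2"
        }
        // Split on the first "=" only; a missing "=" means an empty value.
        list($name, $value) = array_pad(explode('=', $part, 2), 2, '');
        $pairs[] = [urldecode($name), urldecode($value)];
    }
    return $pairs;
}

// The three colliding names from the paragraph above stay distinct.
foreach (parse_query_unmangled('a.b=dot&a_b=underscore&a%20b=space') as list($name, $value)) {
    echo $name, ' => ', $value, "\n";
}
```

In a web request the input would typically come from $_SERVER['QUERY_STRING']; a literal string is used here so the sketch runs from the command line.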

Automatic name mangling existed to support register_globals and import_request_variables(), but both were removed in PHP 5.4, and PHP 5.3, the last branch to include them, reached end of life in August 2014. Name mangling isn't required for super-global marshaling, because super-globals are associative arrays and can accommodate any string as a key. So do we need automatic name mangling?
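
As a quick illustration of that point, an ordinary PHP array already holds the unmangled names from the earlier examples as distinct keys. The $input array below is a stand-in built by hand, not a real super-global:

```php
<?php
// Keys taken from the examples above; none is a valid PHP variable name.
$input = [
    'a.b' => 'dot',
    'a b' => 'space',
    'a[b' => 'bracket',
];

// Each original name is its own key, so nothing collides.
echo $input['a[b'], "\n"; // prints "bracket"
echo count($input), "\n"; // prints "3"
```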

Yes, but not on import as before. Mangling can instead move to the point of extraction, which is the only time it is needed. Consider this hypothetical new test:

--TEST--
Name mangling logic moved to extract()
--GET--
a.b=dot&a$b=dollar&a%20b=space&a[b=bracket
--FILE--
<?php
extract($_GET);
print_r(get_defined_vars());
?>
--EXPECTF--
Array
(
    [_GET] => Array
        (
            [a.b] => dot
            [a$b] => dollar
            [a b] => space
            [a[b] => bracket
        )

    [a_b] => bracket
)

In this new paradigm, all name mangling has moved to extract(), while all input variables retain their original form. This has the side effect of fixing extract() bug reports like #70344.

Proposal

This RFC proposes to phase out automatic name mangling, replacing it with on-demand mangling in extract():

  • Next minor release (currently 7.1):
    • Emit an E_DEPRECATED notice for each variable that is mangled. The notice states that name mangling on import will be removed in the next major PHP version.
  • Next major release (currently 8.0):
    • Remove all name mangling code in super-global marshalling functions
    • Update extract() to mangle names, subject to the following additional rules:
      • If a prefix is given by any of the EXTR_PREFIX_* constants, prepend that to any resulting mangled name
      • Honor EXTR_OVERWRITE and EXTR_SKIP using the mangled name as the check
      • Mangle the name so that it matches the documented regex [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*: every character the regex does not allow at its position is converted to an _
      • Emit an E_NOTICE for each mangled name. The notice indicates the original name and the mangled name.
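
The extract() rules above can be approximated in userland. This is a rough sketch under stated assumptions, not the proposed C implementation: mangle_name() and mangled_extract_preview() are hypothetical names, only EXTR_OVERWRITE, EXTR_SKIP and EXTR_PREFIX_ALL are modelled, collisions are checked only against names produced in the same call (the real extract() checks the current symbol table), and the E_NOTICE is omitted.

```php
<?php
/**
 * Mangle a name so it matches [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*.
 * Hypothetical userland helper illustrating the rule in the list above.
 */
function mangle_name(string $name): string
{
    // Replace every byte that is invalid anywhere in a variable name.
    $mangled = preg_replace('/[^a-zA-Z0-9_\x7f-\xff]/', '_', $name);
    // A digit is valid later in a name but not as its first character.
    return preg_replace('/^[0-9]/', '_', $mangled);
}

/**
 * Return the variables extract() might create under the proposal.
 * Assumption: collisions are checked only within this call.
 */
function mangled_extract_preview(array $input, int $flags = EXTR_OVERWRITE, string $prefix = ''): array
{
    $vars = [];
    foreach ($input as $name => $value) {
        $mangled = mangle_name((string) $name);
        if ($flags === EXTR_PREFIX_ALL) {
            // extract() joins the prefix and the name with an underscore.
            $mangled = $prefix . '_' . $mangled;
        }
        if ($flags === EXTR_SKIP && array_key_exists($mangled, $vars)) {
            continue; // EXTR_SKIP: the first value wins
        }
        $vars[$mangled] = $value; // otherwise the last value wins
    }
    return $vars;
}

$get = ['a.b' => 'dot', 'a$b' => 'dollar', 'a b' => 'space', 'a[b' => 'bracket'];
print_r(mangled_extract_preview($get));            // a_b holds "bracket"
print_r(mangled_extract_preview($get, EXTR_SKIP)); // a_b holds "dot"
```

Note that under this rule "$" is mangled as well, which is why the hypothetical test in the Introduction ends up with a single a_b variable.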

Backward Incompatible Changes

This proposal introduces backward incompatible changes: any userland code that relies on mangled names in super-globals must change to use the original variable names.
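
Code that must run both before and after the change could look up the original name first and fall back to the old mangled name. The following is a minimal sketch with hypothetical helpers: legacy_mangle() covers only the three characters demonstrated in the Introduction (".", " " and an unmatched "[") and does not model well-formed array syntax such as a[b].

```php
<?php
/**
 * Mimic the import-time mangling shown in the Introduction.
 * Assumption: only ".", " " and "[" are handled; "a[b]" arrays are not.
 */
function legacy_mangle(string $name): string
{
    return strtr($name, ['.' => '_', ' ' => '_', '[' => '_']);
}

/**
 * Look up an input value by its original name, falling back to the
 * legacy mangled name, so the caller works before and after the change.
 */
function input_value(array $source, string $name, $default = null)
{
    if (array_key_exists($name, $source)) {
        return $source[$name]; // unmangled input (after the change)
    }
    $legacy = legacy_mangle($name);
    return array_key_exists($legacy, $source) ? $source[$legacy] : $default;
}

// Before the change the value arrives as "a_b"; afterwards as "a.b".
echo input_value(['a_b' => 'old style'], 'a.b'), "\n"; // prints "old style"
echo input_value(['a.b' => 'new style'], 'a.b'), "\n"; // prints "new style"
```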

Proposed PHP Version(s)

PHP 7.1 (for notice of impending BC break) and PHP 8.0 (for actual implementation and corresponding BC break).

RFC Impact

To SAPIs

No impact.

To Existing Extensions

No impact.

To Opcache

No impact.

New Constants

None.

php.ini Defaults

None.

Open Issues

None so far.

Proposed Voting Choices

A simple yes/no voting option with a 2/3 majority required.

Patches and Tests

None yet. An implementation will follow the vote.

Implementation

TODO: After the project is implemented, this section should contain

  1. the version(s) it was merged to
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature

Rejected Features

None so far.

rfc/on_demand_name_mangling.1451683805.txt.gz · Last modified: 2017/09/22 13:28 (external edit)