
PHP RFC: Arbitrary static variable initializers

Proposal

PHP allows declaring static variables in all functions. Static variables outlive the function call and are shared across future executions of the function.

function foo() {
    static $i = 1;
    echo $i++, "\n";
}
 
foo();
// 1
foo();
// 2
foo();
// 3

The right-hand side of the assignment static $i = 1; must currently be a constant expression. This means that it can't call functions or use parameters, among many other things. This limitation is hard to understand from a user perspective. This RFC suggests lifting this restriction by allowing the static variable initializer to contain arbitrary expressions.

function bar() {
    echo "bar() called\n";
    return 1;
}
 
function foo() {
    static $i = bar();
    echo $i++, "\n";
}
 
foo();
// bar() called
// 1
foo();
// 2
foo();
// 3
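
Initializers may also read the function's parameters. The following is a minimal sketch, assuming PHP 8.3 or later; the function name greet() and its parameter are hypothetical and only serve as an illustration. The initializer is evaluated on the first call with that call's argument, and later arguments are ignored.

```php
<?php
// Hypothetical example function, not part of the RFC.
function greet(string $defaultName): string {
    // Evaluated only on the first call, using that call's argument.
    static $name = ucfirst($defaultName);
    // A plain constant initializer keeps working as before.
    static $calls = 0;

    $calls++;
    return "Hello {$name} ({$calls})";
}

echo greet('alice'), "\n";
// Hello Alice (1)
echo greet('bob'), "\n";
// Hello Alice (2)
```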

Backwards incompatible changes

Redeclaring static variables

Currently, redeclaring static variables is allowed, although the semantics are very questionable.

function foo() {
    static $x = 1;
    var_dump($x);
    static $x = 2;
    var_dump($x);
}
 
foo();
// int(2)
// int(2)

The static variable is overridden at compile time, resulting in both statements referring to the same underlying static variable initializer. This is neither useful nor intuitive. The new implementation is not compatible with this behavior; it would instead result in the first initializer winning. Instead of switching from one dubious behavior to another, this RFC disallows redeclaring static variables, which now results in a compile-time error.

function foo() {
    static $x = 1;
    static $x = 2;
}
// Fatal error: Duplicate declaration of static variable $x

ReflectionFunction::getStaticVariables()

ReflectionFunction::getStaticVariables() can be used to inspect a function's static variables and their current values. Currently, PHP automatically evaluates the underlying constant expression and initializes the static variable if the function has never been called. With this RFC, this is no longer possible, as static variables may depend on values that are only known at runtime. Instead, the compiler will attempt to resolve the constant expression at compile time. If successful, the value will be embedded in the static variables table. Otherwise, it will be initialized to null. After the function has been executed and the static variable has been assigned, the contents of the variable will be reflectable through ReflectionFunction::getStaticVariables().

function foo($initialValue) {
    static $x = $initialValue;
}
 
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// NULL
foo(1);
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// int(1)
foo(2);
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// int(1)

From the example above, it becomes more obvious why the initializer $initialValue cannot be evaluated before calling the function.
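
The difference between initializers that can be resolved at compile time and those that cannot can be observed through reflection. This is a minimal sketch, assuming PHP 8.3 or later; the function names offset() and demo() are hypothetical.

```php
<?php
// Hypothetical helper; its return value is only known at runtime.
function offset(): int {
    return 10;
}

function demo(): void {
    static $constant = 1 + 2;    // constant expression, resolved at compile time
    static $runtime = offset();  // function call, evaluated on the first call
}

$before = (new ReflectionFunction('demo'))->getStaticVariables();
var_dump($before['constant']);
// int(3)
var_dump($before['runtime']);
// NULL

demo();

$after = (new ReflectionFunction('demo'))->getStaticVariables();
var_dump($after['runtime']);
// int(10)
```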

Other semantics

Exceptions during initialization

An initializer might throw an exception. In that case, the static variable remains uninitialized and the initializer will be evaluated again on the next call.

function bar($throw) {
    echo "bar() called\n";
    if ($throw) throw new Exception();
    return 42;
}
 
function foo($throw) {
    static $x = bar($throw);
}
 
try {
    foo(true);
} catch (Exception) {}
// bar() called
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// NULL
 
foo(false);
// bar() called
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// int(42)
 
foo(true);
// bar is not called anymore

Destructor

When the static variable declaration overwrites an existing local variable that contains an object with a destructor that throws an exception, the assignment of the static variable is guaranteed to occur before the exception is thrown. This is analogous to assignments to regular variables.

class Foo {
    public function __destruct() {
        throw new Exception();
    }
}
 
function foo($y) {
    $x = new Foo();
    static $x = $y;
}
 
try {
    foo(42);
} catch (Exception) {}
 
var_dump((new ReflectionFunction('foo'))->getStaticVariables()['x']);
// int(42)

Recursion

The static variable declaration only runs the initializer if the static variable has not been initialized. When the initializer calls the current function recursively, this check will be reached again before the static variable has been initialized. This means that the initializer will be evaluated multiple times. Note, though, that the assignment to the static variable still only happens once. This is a somewhat technical limitation: reassigning would require the opcode to release two values that could both execute user code and thus throw exceptions. Not reassigning the value avoids this issue. However, I cannot imagine a useful scenario for recursive static variable initializers, so the semantics here are unlikely to matter.

function foo($i) {
    static $x = $i < 3 ? foo($i + 1) : 'Done';
    var_dump($x);
    return $i;
}
 
foo(1);
// string(4) "Done", $i = 3
// string(4) "Done", $i = 2
// string(4) "Done", $i = 1
 
foo(5);
// string(4) "Done", $i = 5, initializer not called

What initializers are known at compile-time?

In the discussion the question arose whether static variables depending on other static variables are known at compile time.

function foo() {
    static $a = 0;
    static $b = $a + 1;
}

The answer is no. In this example, it's clear that $a holds the value 0 until the initialization of $b. However, that's not necessarily the case in general. If $a is modified at any point between the two initializations, the initial value of $b also changes.
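
To illustrate why $b cannot be resolved at compile time, here is a minimal sketch, assuming PHP 8.3 or later, that modifies $a between the two declarations. The function name counter() is hypothetical.

```php
<?php
// Hypothetical example function, not part of the RFC.
function counter(): array {
    static $a = 0;
    $a += 10;            // modifies $a before $b is initialized
    static $b = $a + 1;  // evaluated at runtime, on the first call only
    return [$a, $b];
}

// $b depends on a variable, so no value is embedded at compile time.
var_dump((new ReflectionFunction('counter'))->getStaticVariables()['b']);
// NULL

echo json_encode(counter()), "\n";
// [10,11]
echo json_encode(counter()), "\n";
// [20,11]
```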

Here's a quick explanation of how this is implemented: During compilation of static variables, the initializer AST is passed to the zend_eval_const_expr function. It traverses the AST and tries to evaluate all nodes at compile time, evaluating each node's children first and then the node itself if the children were successfully evaluated. If the evaluation fails, the nodes stay AST nodes and will be evaluated again at runtime, when more information is available (e.g. when class constants are declared). These expressions are currently considered for compile-time constant expression evaluation:

  • Literals (strings, ints, bools, etc)
  • Binary operations
  • Binary comparisons
  • Unary operations
  • Coalesce operator
  • Ternary operator
  • Array access (self::FOO['bar'])
  • Array literals
  • Magic constants (e.g. __FILE__)
  • Global constants (that are known at compile time)
  • Class constants (that are known at compile time)
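
The bottom-up traversal described above can be sketched with a toy model. This is not the engine's implementation: the array-based node shape and the names lit() and foldConstants() are hypothetical, and only literals and three binary operators are handled. It assumes PHP 8.0 or later for match and mixed.

```php
<?php
// Toy model of bottom-up constant folding; not the engine's actual API.

// Build a literal node (hypothetical node shape).
function lit(mixed $value): array {
    return ['kind' => 'literal', 'value' => $value];
}

function foldConstants(array $node): array {
    switch ($node['kind']) {
        case 'literal':
            return $node;
        case 'binary':
            // Evaluate the children first.
            $left = foldConstants($node['left']);
            $right = foldConstants($node['right']);
            // Fold the node itself only if both children became literals.
            if ($left['kind'] === 'literal' && $right['kind'] === 'literal') {
                return lit(match ($node['op']) {
                    '+' => $left['value'] + $right['value'],
                    '*' => $left['value'] * $right['value'],
                    '.' => $left['value'] . $right['value'],
                });
            }
            // Otherwise keep the node, with whatever children could be folded.
            return ['kind' => 'binary', 'op' => $node['op'], 'left' => $left, 'right' => $right];
        default:
            // Variables, calls, etc. are left for runtime evaluation.
            return $node;
    }
}

// 1 + 2 * 3 folds completely.
$constant = ['kind' => 'binary', 'op' => '+', 'left' => lit(1),
    'right' => ['kind' => 'binary', 'op' => '*', 'left' => lit(2), 'right' => lit(3)]];
var_dump(foldConstants($constant)['value']);
// int(7)

// $a + 2 * 3 stays a binary node, but its right child is folded to 6.
$partial = $constant;
$partial['left'] = ['kind' => 'variable', 'name' => 'a'];
$result = foldConstants($partial);
var_dump($result['kind'], $result['right']['value']);
// string(6) "binary"
// int(6)
```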

Vote

Voting starts 2023-03-21 and ends 2023-04-04.

As this is a language change, a 2/3 majority is required.

Allow arbitrary static variable initializers in PHP 8.3?

Real name                        Yes  No
alcaeus (alcaeus)                ✓
colinodell (colinodell)          ✓
cpriest (cpriest)                ✓
crell (crell)                    ✓
cschneid (cschneid)              ✓
galvao (galvao)                  ✓
girgias (girgias)                ✓
ilutov (ilutov)                  ✓
imsop (imsop)                    ✓
kocsismate (kocsismate)          ✓
mcmic (mcmic)                    ✓
nicolasgrekas (nicolasgrekas)    ✓
nielsdos (nielsdos)              ✓
ocramius (ocramius)              ✓
pierrick (pierrick)              ✓
ramsey (ramsey)                  ✓
sebastian (sebastian)            ✓
seld (seld)                      ✓
sergey (sergey)                  ✓
svpernova09 (svpernova09)        ✓
thekid (thekid)                  ✓
theodorejb (theodorejb)          ✓
timwolla (timwolla)              ✓
villfa (villfa)                  ✓
weierophinney (weierophinney)    ✓
Final result:                    25   0
This poll has been closed.
rfc/arbitrary_static_variable_initializers.txt · Last modified: 2023/05/24 18:29 by ilutov