PHP RFC: Serializable closures, by reference to the code that declares them

Introduction

Closures have never been serializable in PHP, for good structural reasons: a closure may capture variables by value or by reference, may be bound to an object, and may be rebound at runtime. A serialization payload for such a closure would have to embed code and captured state, which is both semantically murky and a deserialization attack surface.

PHP 8.5 created a class of closures to which none of those reasons applies. The “Closures in constant expressions” and “First-class callables in constant expressions” RFCs allow closures in attribute arguments and other constant expressions, with hard restrictions: anonymous closures must be static and cannot capture any variable through use(), and first-class callables can only reference named functions and static methods. Such a closure carries no state at all. It is fully described by where it is declared, the same way an enum case is fully described by its name.

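This RFC makes these closures serializable, by storing a reference to their declaration site rather than the code itself. This covers both forms a closure can take in a constant expression: anonymous closures (static function () { ... }) and first-class callable references (strlen(...), self::isStrict(...)).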

unserialize() resolves the reference against the loaded code base and recreates an equivalent closure, as if the declaring constant expression had just been evaluated again. No code ever travels in the payload, and a payload cannot reference anything that the reader's own code base does not already declare.

Closures that carry state (captured variables, bound $this) and closures created at runtime (anonymous functions and first-class callables in function bodies) remain non-serializable, with the exact same error as today.
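For reference, the error that stays in place for these closures can be observed on any current PHP version. A minimal sketch, using a throwaway closure created at runtime:

```php
<?php
// A closure created at runtime: it has no declaration site in a constant
// expression, so it stays non-serializable under this proposal.
$closure = static function (): bool {
    return true;
};

try {
    serialize($closure);
    $message = null; // not reached on current PHP versions
} catch (\Exception $e) {
    $message = $e->getMessage();
}

echo $message, "\n"; // Serialization of 'Closure' is not allowed
```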

Problem Statement

Attributes now produce closures, and closures poison caches

Attributes are PHP's declarative metadata system, and the ecosystem caches derived metadata aggressively: validator metadata, serializer metadata, DI container definitions, routing tables. These caches are built once and stored through serialize() (PSR-6/PSR-16 pools) or through var_export() (opcache-friendly PHP files).

Since PHP 8.5, attributes can carry closures, and this is not a corner case: it was the headline use case of the 8.5 RFCs. The Symfony validator, for instance, accepts exactly this:

class Order
{
    #[Assert\When(static function () { return self::$strictMode; }, constraints: [new Assert\NotBlank()])]
    public string $billingAddress;
}

The metadata derived from this attribute contains the closure. The cache layer calls serialize() on it, which throws; the framework catches the error, and the metadata is silently recomputed on every request. Nothing fails loudly; the application just gets slower. This is symfony#63228, reported shortly after the PHP 8.5 release, and the same trap exists for every metadata cache in the ecosystem.
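The failure mode can be sketched with a toy cache helper. This is an illustration only: loadMetadata() and the in-memory $pool array are hypothetical and do not reproduce any framework's actual code.

```php
<?php
// Hypothetical cache helper: returns the cached metadata when present,
// otherwise computes it and tries to store its serialized form.
function loadMetadata(array &$pool, string $key, callable $compute)
{
    if (isset($pool[$key])) {
        return unserialize($pool[$key]);
    }

    $metadata = $compute();

    try {
        $pool[$key] = serialize($metadata);
    } catch (\Exception $e) {
        // Serialization failed: nothing is stored and nothing is reported.
    }

    return $metadata;
}

$pool = [];
$computations = 0;
$compute = function () use (&$computations) {
    ++$computations;

    // Metadata that embeds a closure, as an attribute argument would.
    return ['constraint' => 'When', 'expression' => static function () {
        return true;
    }];
};

loadMetadata($pool, 'Order', $compute);
loadMetadata($pool, 'Order', $compute);

echo $computations, "\n"; // 2: the second call recomputed instead of hitting the cache
```

The pool stays empty after both calls, so every later request pays the full computation again.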

The language added a feature whose natural habitat is cached metadata, while its values are hostile to caching. This RFC closes that gap.

Userland cannot fix this well

Userland has two known workarounds, both with significant drawbacks:

The engine is the only layer that can give such references first-class, code-free, validated semantics, uniformly for the whole ecosystem.

Named callables have the same problem with serialize()

#[Assert\When(Foo::isStrict(...))] has the same caching behavior as the inline closure: serialize() also refuses closures over named callables (“fake closures” in engine terms). The var_export() side of this is already solved in userland (Symfony's VarExporter exports named closures as \Foo::isStrict(...), symfony#61657), which makes the asymmetry worse: the recommended, “cache-friendly” way of writing callable attributes still breaks every serialize()-based cache.

Proposal

Closures become serializable when, and only when, they are declared in a class's constant expressions, specifically in attribute arguments and parameter default values (class constant values and property defaults are excluded for now: see the Rationale). Two forms of declaration exist.

1. Anonymous closures

A closure declared in a constant expression attached to a class member becomes serializable:

The payload is a reference to the declaration site, made of the declaring class name, a stable id, and the start line of the closure (used as a staleness check):

$closure = (new ReflectionProperty(Order::class, 'billingAddress'))
    ->getAttributes()[0]->getArguments()[0];
 
$payload = serialize($closure);
// O:7:"Closure":3:{s:5:"class";s:5:"Order";s:2:"id";i:0;s:4:"line";i:3;}
 
$again = unserialize($payload);
$again();   // behaves exactly like $closure()

Unserializing autoloads the class if needed and recreates the closure as if its constant expression had just been evaluated: same code, statically scoped to its declaring class (self::, private member access and static:: behave identically), no bound $this, and fresh static variables. It is a new Closure instance, the same way two calls to ReflectionAttribute::getArguments() return two instances today.

The id is deterministic. It is the closure's rank among all closure-declaring constant expressions of the class (anonymous declarations and first-class callable references alike), counted in a fixed declaration-order traversal of the class (class attributes first, then constants, properties and hooks, then methods with their parameter lists). It is derived from the compiled class alone, never from runtime state, evaluation order or caches, which is where its stability comes from: every process running the same source computes the same rank, with or without opcache. The flip side is that editing the class may renumber the ranks. References are therefore validated when they resolve (see below) and must be treated like the cache artifacts they are embedded in: valid for the code revision that produced them, and per PHP version. Userland should obtain ids from the engine (see the Reflection API below) rather than compute them.
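The numbering can be illustrated with a toy model. This is a sketch under stated assumptions, not engine code: the array shape, the labels and rankConstExprClosures() are hypothetical, and each entry stands for one closure-declaring constant expression, listed in source order.

```php
<?php
// Toy model: number closure declarations by walking member kinds in the
// traversal order described above. The array index of the result is the id.
function rankConstExprClosures(array $declarations): array
{
    // Class attributes, then constants, then properties and hooks,
    // then methods with their parameter lists.
    $traversal = ['class', 'constant', 'property', 'method'];

    $ranked = [];
    foreach ($traversal as $kind) {
        foreach ($declarations as $declaration) {
            if ($declaration['kind'] === $kind) {
                $ranked[] = $declaration['label'];
            }
        }
    }

    return $ranked;
}

$revision1 = [
    ['kind' => 'class',    'label' => 'attribute on class Order'],
    ['kind' => 'method',   'label' => 'default value of check($callback)'],
    ['kind' => 'property', 'label' => 'attribute on $billingAddress'],
];

// The same class after an edit that removes the class-level attribute.
$revision2 = [
    ['kind' => 'method',   'label' => 'default value of check($callback)'],
    ['kind' => 'property', 'label' => 'attribute on $billingAddress'],
];

$ids1 = array_flip(rankConstExprClosures($revision1));
$ids2 = array_flip(rankConstExprClosures($revision2));

echo $ids1['attribute on $billingAddress'], "\n"; // 1
echo $ids2['attribute on $billingAddress'], "\n"; // 0: renumbered by the edit
```

The property-level closure is ranked before the method-level one even though the method comes first in the source, and removing one declaration shifts every id after it.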

When the class's source changes, a stored reference either stops resolving or is rejected by the line check; both throw an Exception on unserialize(), which cache layers already treat as a miss.

2. First-class callable references

A first-class callable in a constant expression is a closure declaration site like any other: the class's source declares, at a fixed position, “a closure over this callable”. Closures created by evaluating such a reference serialize with the same payload as anonymous declarations: the declaring class, the id, the start line. The engine tracks this provenance when it evaluates the constant expression; an identical-looking closure created at runtime does not have it and refuses to serialize.

class Order
{
    #[Assert\When(self::isStrict(...), constraints: [new Assert\NotBlank()])]
    public string $billingAddress;
 
    private static function isStrict(): bool { ... }
}
 
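// obtained from ReflectionAttribute::getArguments()
$closure = (new ReflectionProperty(Order::class, 'billingAddress'))
    ->getAttributes()[0]->getArguments()[0];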
serialize($closure);
// O:7:"Closure":3:{s:5:"class";s:5:"Order";s:2:"id";i:0;s:4:"line";i:3;}

Resolution re-evaluates the declared reference in the scope of the declaring class, exactly like attribute evaluation does. Two important properties follow:

This works uniformly for functions (strlen(...)), own methods (self::isStrict(...)), and cross-class references (Validators::check(...)), in attribute arguments and parameter default values alike.

What stays non-serializable

Everything below keeps today's behavior, including the exact error (Exception: Serialization of 'Closure' is not allowed):

Closure kind | Why
Anonymous functions declared in function bodies | No declaration site; may capture variables and $this
Arrow functions | Capture by value implicitly
Closures over named callables created at runtime (strlen(...) / Foo::bar(...) in function bodies, Closure::fromCallable(), ReflectionMethod::getClosure()) | No declaration site; serializing them by name would make unserialize() a closure factory over the whole code base instead of over what classes declare
Closures bound to an object ($obj->method(...), Closure::bind() with $this) | Carry object state
Closures created from __call()/__callStatic() trampolines | No real backing method; the engine already rejects them in constant expressions anyway
Const-expr closures in class constant values and property default values | See Future Scope
Const-expr closures in attributes of free functions, or of anonymous classes | No autoloadable / stable container name
Const-expr closures rebound to a different scope | No longer the declared value

Note that the boundary is declaration, not syntax: the same expression, anonymous closure or first-class callable, is serializable when it appears in a constant expression and not when it appears in a method body, because only the former is a declaration the class makes about itself, with an identity that survives the process.

Security model

The payload contains a class name and two integers, never code. Resolution can only ever produce a closure that the named class's own source declares in one of its constant expressions: there is nothing else a payload can express. unserialize() is not a closure factory over the code base; it is a lookup into the fixed, finite set of closures that classes declare about themselves.

A forged payload can therefore at worst point at a different declared closure, the same class of risk as unserializing an enum case name or a class name today, and far less than what unserialize() already allows through __wakeup() gadgets. Visibility needs no special rule: resolving a first-class callable site re-runs the same accessibility checks, in the same scope, that attribute evaluation runs, so a reference resolves for the reader exactly when the declaration is legal for the declarer. And the existing allowed_classes hardening applies unchanged: with unserialize($data, ['allowed_classes' => [...]]) not listing Closure, these payloads produce __PHP_Incomplete_Class like any other non-listed object, and no resolution happens at all.
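The allowed_classes behavior can already be observed on current PHP versions, where a payload of this shape is just an object payload naming the Closure class. A minimal sketch, reusing the example payload shown earlier:

```php
<?php
// Payload in the shape proposed by this RFC.
$payload = 'O:7:"Closure":3:{s:5:"class";s:5:"Order";s:2:"id";i:0;s:4:"line";i:3;}';

// Closure is not in the allow-list, so no Closure object is created.
$value = unserialize($payload, ['allowed_classes' => [\stdClass::class]]);

echo get_class($value), "\n"; // __PHP_Incomplete_Class
```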

All malformed, unresolvable or stale payloads throw an Exception with a descriptive message, for example:

Invalid serialization data for Closure object (constant-expression closure 3 of class Order not found)
Invalid serialization data for Closure object (cannot load class "Order")

Reflection and exporter support

var_export()-based caches (PHP files compiled by opcache) need to generate code that recreates the closure, rather than a binary payload. Three additions support this:

An exporter then emits self-contained, opcache-friendly code:

// generated cache file
return \Closure::fromConstExpr(\App\Order::class, 0);

For closures over public named callables, exporters can keep emitting plain first-class callable syntax (\Order::isStrict(...)), as Symfony's VarExporter already does: in generated code the expression itself is the reference. fromConstExpr() is what makes the remaining cases exportable, anonymous closures and non-public references, since generated code runs in global scope and could not name a private helper directly. The identity of any named closure (its name, scope, called scope, staticness) remains introspectable through existing reflection regardless of visibility; reflection describes closures already in hand and is deliberately not restricted.
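An exporter's code generation step might look like the sketch below. exportConstExprClosure() is a hypothetical helper, and \Closure::fromConstExpr() exists only as part of this proposal: the function merely builds a string and never calls it, and it assumes the caller already knows the declaring class and the id.

```php
<?php
// Hypothetical exporter helper: emits the expression that recreates a
// constant-expression closure from its declaring class and id.
function exportConstExprClosure(string $class, int $id): string
{
    return sprintf(
        '\\Closure::fromConstExpr(\\%s::class, %d)',
        ltrim($class, '\\'),
        $id
    );
}

echo exportConstExprClosure('App\\Order', 0), "\n";
// \Closure::fromConstExpr(\App\Order::class, 0)
```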

Behavior details

Rationale

Why references instead of embedding source code

The Opis-style alternative (store the closure's source text, compile it on unserialize) was rejected deliberately:

The 8.5 restrictions are precisely what make the reference design possible: since the closure can capture nothing, the reference loses nothing.

Why first-class callables serialize only when declared in constant expressions

An earlier draft of this proposal serialized every closure over a named function or public static method by name, so that serialize(strlen(...)) worked anywhere. That design was dropped for three reasons:

The engine tracks the necessary provenance at no observable cost: closures created while evaluating a constant expression are marked as such, with the declaring class recorded. A value-identical first-class callable created in a function body does not carry the mark and refuses to serialize. This asymmetry is the same one anonymous closures already have (the same body in a method body refuses too), and it is the point: serializability is a property of the declaration, not of the value.

Extending name-based serialization to runtime-created closures over public callables would remain possible later, as a pure widening (see Future Scope); this RFC deliberately does not include it.

Why class constant values and property defaults are excluded for now

const FOO = static function () {...}; and public $cb = Validators::check(...); are constant expressions too, and conceptually they should qualify; both forms of declaration are affected equally. They are excluded because the engine does not reliably retain these initializer expressions once they have been evaluated: depending on configuration, they are evaluated in place and freed. A declaration site that may or may not still exist cannot participate in the id numbering without breaking the contract that references resolve identically in every process and configuration. Making these sites addressable requires the engine to retain them, or to register their closures at compile time (the stored-index alternative discussed under “Why this id, and not another addressing scheme”), an engine refactoring with its own trade-offs that is deliberately left as future scope rather than blocking the attribute use case motivating this proposal. Until then, these closures keep failing to serialize exactly as they do today: nothing regresses and the door stays open.

Why this id, and not another addressing scheme

Two natural alternatives were considered for the id.

A compiler-assigned index stored in the class. Instead of deriving the rank when a reference is created or resolved, the compiler would number each constant-expression closure while compiling the class and store the table in the compiled artifact. The observable contract would be identical: a stored index is renumbered by source edits in exactly the same situations as the derived rank, so it is neither more nor less stable, and the staleness tripwire is needed either way. The difference is economics: a stored table costs memory in every compiled class (including in opcache shared memory) and new persistence plumbing, while deriving the rank costs a class traversal per closure serialized or resolved, negligible next to unserialization itself. The proposal therefore specifies the contract (deterministic per source revision and PHP version) and derives the id; switching to a stored index later would be invisible to userland. A stored index does have one distinctive power, noted under Future Scope: by keeping the compiled closures of class constant values and property defaults reachable after their initializers are evaluated, it could lift the in-place-evaluation exclusion without changing how those initializers are evaluated.

The rank among attribute arguments of any type, not only closures. Numbering every argument slot looks simpler but addresses the wrong unit. One argument may declare several closures (an array of callbacks is a single argument), so a within-argument ordinal is still needed; parameter default values are not attribute arguments, so a second numbering domain appears; and the id becomes less stable, since adding or removing any scalar argument before the closure renumbers it, while the closure-only rank is invariant to every edit that does not add, remove or reorder closures themselves. Counting only closures numbers exactly the things being referenced, with the smallest possible sensitivity to unrelated edits.
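The stability difference between the two numbering schemes can be shown on a toy argument list. This is a sketch only: both functions are hypothetical, and they look at top-level arguments without descending into nested arrays.

```php
<?php
// Scheme A: the id is the position of the argument slot holding the closure.
function argumentSlotIds(array $arguments): array
{
    $ids = [];
    foreach (array_values($arguments) as $slot => $value) {
        if ($value instanceof \Closure) {
            $ids[] = $slot;
        }
    }

    return $ids;
}

// Scheme B: the id is the rank among closures only.
function closureOnlyIds(array $arguments): array
{
    $ids = [];
    $rank = 0;
    foreach ($arguments as $value) {
        if ($value instanceof \Closure) {
            $ids[] = $rank++;
        }
    }

    return $ids;
}

$when = static function (): bool {
    return true;
};

$before = ['groups' => 'checkout', 'when' => $when];
// An unrelated scalar argument is added in front of the existing ones.
$after = ['message' => 'Required', 'groups' => 'checkout', 'when' => $when];

echo argumentSlotIds($before)[0], ' -> ', argumentSlotIds($after)[0], "\n"; // 1 -> 2
echo closureOnlyIds($before)[0], ' -> ', closureOnlyIds($after)[0], "\n";   // 0 -> 0
```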

A fully symbolic variant (“argument callback of the second attribute of property $x”) reads better in payloads but combines the drawbacks: paths must reach into nested arrays and chained closures, attribute and parameter-default sites need different address shapes, an ordinal is still required when one argument declares several closures, and the robustness it buys (references surviving edits to other members of the class) is not actually desirable for cache artifacts, where failing closed on any change to the file is the expected behavior. The flat closure rank plus the line tripwire provides the same safety with a two-integer payload.

Why the line number, and why not a stronger fingerprint

Because the id is positional, an edit to the class can make a stored id designate a different declaration site than the one that was serialized: removing an attribute renumbers every closure after it. That is the one failure that must not be silent. Storing the closure's start line makes essentially every renumbering edit fail loudly: for a stale id to resolve silently, the site that now occupies it would have to sit on the very line recorded in the payload. unserialize() otherwise throws an exception that cache layers experience as a regular miss. The line number has three properties that make it the right tripwire: it is already recorded in the compiled class for every declaration site (functions know their start line, and so do the nodes of constant expressions), resolution never touches the filesystem, and it means the same thing in every PHP build.
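The tripwire can be modeled in a few lines. This is a toy sketch, not engine code: resolveConstExprReference(), the $sites table and the wording of the line-mismatch message are assumptions made for illustration.

```php
<?php
// $sites maps each id to the declaration site that currently holds it.
function resolveConstExprReference(array $sites, array $payload): string
{
    $id = $payload['id'];

    if (!isset($sites[$id])) {
        throw new \Exception(sprintf(
            'constant-expression closure %d of class %s not found',
            $id,
            $payload['class']
        ));
    }

    if ($sites[$id]['line'] !== $payload['line']) {
        // Hypothetical message: the id resolves, but to a site on another line.
        throw new \Exception(sprintf(
            'constant-expression closure %d of class %s starts on another line',
            $id,
            $payload['class']
        ));
    }

    return $sites[$id]['label'];
}

// Payloads produced against the first revision of the class.
$payloadA = ['class' => 'Order', 'id' => 0, 'line' => 3];
$payloadB = ['class' => 'Order', 'id' => 1, 'line' => 7];

$revision1 = [
    0 => ['line' => 3, 'label' => 'closure A'],
    1 => ['line' => 7, 'label' => 'closure B'],
];
// Second revision: the attribute declaring closure A was removed, so
// closure B now holds id 0 and starts one line earlier.
$revision2 = [
    0 => ['line' => 6, 'label' => 'closure B'],
];

echo resolveConstExprReference($revision1, $payloadB), "\n"; // closure B

$misses = 0;
foreach ([$payloadA, $payloadB] as $payload) {
    try {
        resolveConstExprReference($revision2, $payload);
    } catch (\Exception $e) {
        ++$misses; // a cache layer would treat this as a regular miss
    }
}

echo $misses, "\n"; // 2
```

Without the line check, the stale payload for closure A would have silently resolved to closure B in the second revision.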

Stronger checks were considered, and each pins the wrong thing:

In other words: identity of the site is what the id encodes and what the line check defends; identity of the body is intentionally not part of the contract. The residual blind spot is an edit that renumbers sites while keeping the resolved closure's start line unchanged, e.g. reordering two closures declared on the same line. This falls under the discipline serialized payloads already require today (a payload is only valid for the code revision that produced it), and frameworks invalidate their metadata caches on file changes anyway.

Naming

fromConstExpr / getConstExprId / getConstExprClass follow the “constant expression” terminology established by the 8.5 RFCs. fromConstantExpression (spelled out) and fromDeclarationSite are plausible alternatives; the author has no strong attachment and will follow list feedback.

Backward Incompatible Changes

No syntax or runtime behavior changes for code that does not serialize closures. Three observable changes:

Proposed PHP Version(s)

Next minor version (PHP 8.6).

RFC Impact

To SAPIs

None.

To Existing Extensions

To Ecosystem

Future Scope

Proposed Voting Choices

Voting opens YYYY-MM-DD and closes YYYY-MM-DD. A single vote on the whole proposal, requiring a 2/3 majority.

Make closures declared in constant expressions serializable as references to their declaration site, as proposed?

Patches and Tests

Implementation: https://github.com/nicolas-grekas/php-src/pull/4

Tests live under Zend/tests/closures/closure_const_expr/ and cover:

References