PHP RFC: Serializable closures, by reference to the code that declares them

Version: 0.3
Date: 2026-06-10
Author: Nicolas Grekas nicolasgrekas@php.net
Status: Under Discussion
First Published at: https://wiki.php.net/rfc/serializable_closures
Mailing List Thread: https://news-web.php.net/php.internals/131198

Introduction

PHP 8.5 allowed closures inside constant expressions: anonymous static function () {} and first-class callables foo(...), in attribute arguments and parameter default values (Closures in constant expressions, First-class callables in constant expressions). By design they capture nothing, are never bound and are never rebound, so such a closure is fully described by where it is declared.

serialize() nonetheless refuses every closure, and that breaks a common setup. Metadata systems read a class's attributes once, turn them into an optimized representation and cache it through serialize() into a PSR-6 pool: Doctrine for entity mappings, the Symfony Validator for its constraint metadata, the Serializer for its class metadata. An attribute that carries a closure makes that representation unserializable:

class Order
{
    #[Assert\Callback(static function (string $value, ExecutionContextInterface $context) {
        // ...custom validation...
    })]
    public string $billingAddress;
}

serialize() throws on the cached metadata for Order, the cache layer catches it, and the metadata is rebuilt from reflection on every request. Nothing fails loudly; the application only gets slower. This is symfony#63228, and it is not a framework bug: it hits any code that caches metadata derived from closure-carrying attributes, which was a main use case of the 8.5 RFCs.
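The silent failure mode can be reproduced on current PHP in a few lines. This is only an illustrative sketch: ClassMetadata, loadMetadata() and the array standing in for a PSR-6 pool are hypothetical names, and the closure is created in a function body, so it refuses to serialize with or without this RFC.

```php
<?php
// Hypothetical stand-in for the optimized metadata a framework builds from attributes.
final class ClassMetadata
{
    public function __construct(public array $callbacks) {}
}

// Hypothetical cache-aside loader; $pool stands in for a PSR-6 pool.
function loadMetadata(string $class, array &$pool, int &$rebuilds): ClassMetadata
{
    if (isset($pool[$class])) {
        return unserialize($pool[$class]);
    }

    // "Expensive" rebuild; the closure stands in for one read from an attribute.
    ++$rebuilds;
    $metadata = new ClassMetadata([static fn (string $value): bool => $value !== '']);

    try {
        $pool[$class] = serialize($metadata);
    } catch (\Exception $e) {
        // Typical cache behavior: a failed write is swallowed, not reported.
    }

    return $metadata;
}

$pool = [];
$rebuilds = 0;
loadMetadata('Order', $pool, $rebuilds);
loadMetadata('Order', $pool, $rebuilds);

echo $rebuilds, "\n"; // 2: the metadata was rebuilt on both calls
```

Nothing in the calling code can tell the two calls apart from a working cache, which is why the slowdown goes unnoticed.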

The problem is generic, and serialize() is the generic way caches store data. Today a framework can only work around it by wrapping every closure in a small object that remembers “attribute X, argument Y of class Z” and re-runs reflection on first use. That works, but the wrapper is a dependency the application must carry, and every framework builds and maintains its own version of one simple fact that the engine already knows: this closure is the one declared right there. The ext-deepclone extension and a pure-PHP polyfill already provide this machinery in userland, each with its own implementation of that one fact. The engine is the only layer that can teach serialize() about these closures once, for the whole ecosystem, with no added dependency, and let frameworks drop that code.
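For illustration, such a userland wrapper could look roughly like the sketch below. Callback, Order and AttributeArgumentRef are hypothetical names, not the API of any framework or of ext-deepclone, and the attribute argument is a plain callable string rather than a closure so that the sketch also runs on PHP versions before 8.5.

```php
<?php
#[Attribute(Attribute::TARGET_PROPERTY)]
final class Callback
{
    public function __construct(public mixed $callback) {}
}

final class Order
{
    // Stand-in for a closure-carrying attribute argument.
    #[Callback('strtoupper')]
    public string $billingAddress = '';
}

// Hypothetical wrapper: serializes as "argument Y of attribute X on property P of class Z".
final class AttributeArgumentRef
{
    private mixed $resolved = null;

    public function __construct(
        private string $class,
        private string $property,
        private int $attribute,
        private int $argument,
    ) {}

    public function __serialize(): array
    {
        return [$this->class, $this->property, $this->attribute, $this->argument];
    }

    public function __unserialize(array $data): void
    {
        [$this->class, $this->property, $this->attribute, $this->argument] = $data;
    }

    public function __invoke(mixed ...$args): mixed
    {
        // Reflection is re-run lazily, on first use after unserialization.
        $this->resolved ??= (new ReflectionProperty($this->class, $this->property))
            ->getAttributes()[$this->attribute]->getArguments()[$this->argument];

        return ($this->resolved)(...$args);
    }
}

$ref = unserialize(serialize(new AttributeArgumentRef(Order::class, 'billingAddress', 0, 0)));
echo $ref('rue de la paix'), "\n"; // RUE DE LA PAIX
```

The wrapper only records positions, so it has no way to notice that an edit made the same position point at a different argument.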

This RFC makes such closures serializable as a reference to their declaration site. unserialize() re-evaluates that declaration and returns an equivalent closure. No code travels in the payload, and a payload can only ever name a closure that the reader's own classes already declare. Closures with state, and closures created at runtime, stay non-serializable, with the same error as today.

Why in the engine

The extension and polyfill prove the design works and is wanted, but they cannot be where it ends:

serialize() is the contract. Caches store data with serialize(), which still throws on a closure. The userland tools only help code that adopts their deepclone_to_array() API; making serialize($closure) itself work reaches every existing cache with no adoption, and only the engine can do that.
Only the engine holds some of the facts. The code hash that catches an attribute reordering is computed at compile time and is gone once the source is freed, so a userland producer can only resolve by position, with no staleness check. The declaring class of a cross-class or global first-class callable is discarded too: the extension recovers it only by mirroring private engine structs, and the polyfill cannot recover it at all.
They are not free, nor always installable. A C extension is not an option on every deployment; the polyfill pays reflection and reads source files to tell same-line closures apart. The engine computes the reference directly and stores the hash in opcache at no cost.

Doing it once in the engine replaces this duplicated, partly fragile machinery with a single canonical implementation that the tools above can defer to.

Proposal

A closure declared in a class's constant expressions serializes to its declaration site: the declaring class, the element that holds it, and its place in that element.

$closure = (new ReflectionProperty(Order::class, 'billingAddress'))
    ->getAttributes()[0]->getArguments()[0];
 
unserialize(serialize($closure))();   // behaves exactly like $closure()
 
$closure->__serialize();
// [ [], ['const-expr', ['Order', '$billingAddress', 0, 1244831918]] ]

Unserializing autoloads the class and recreates the closure as if its constant expression had just been evaluated: same code, statically scoped to the declaring class (self::, private access and static:: all behave identically), no bound $this, fresh static variables, a new instance.

Two forms are covered:

Anonymous closures (static function () {}): the key is the closure's position within its declaring element, guarded by a hash of its code.
First-class callables (strlen(...), self::isStrict(...), Validators::check(...)): the key is the callable's name, with no hash since a name does not change on its own. Late static binding is kept and is visible in the name (Order::isStrict, or the parent class for a parent:: reference). Any visibility works, because resolving re-runs the declaration's own access checks: a private helper used from an attribute of its own class round-trips.

Serializable positions: attribute arguments (of the class, its constants, properties, property hooks, methods and parameters) and parameter default values, including closures nested in arrays or in further closures declared there. Class constant values and property default values are excluded: the engine frees those initializer expressions after first evaluation, so their sites are not reliably addressable (see Future Scope).

Everything else stays non-serializable and keeps failing with the current error, Serialization of 'Closure' is not allowed: any closure that carries state or is created at runtime.

Closure kind	Why
Anonymous and arrow functions declared in function bodies	Capture variables and `$this`
Named callables created at runtime (`foo(...)` in a body, `Closure::fromCallable()`, `ReflectionMethod::getClosure()`)	No declaration site; by name they would make `unserialize()` a factory over the whole code base
Bound or rebound closures, closures made from `__call()`/`__callStatic()`	Carry object state, or have no real declaration behind them
Const-expr closures in class constant values, property defaults, or attributes of free functions / anonymous classes	Site not addressable (see Future Scope)

The boundary is declaration, not syntax: self::check(...) serializes in an attribute and refuses in a method body, because only the former is something the class declares about itself.
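The runtime-created kinds in the table can be checked against current PHP, where they already refuse, and this RFC leaves that unchanged. A small sketch, with Validators as a hypothetical class:

```php
<?php
final class Validators
{
    public static function check(string $value): bool
    {
        return $value !== '';
    }
}

// One runtime-created closure per refusing kind.
$candidates = [
    'anonymous function in a body' => function () {},
    'arrow function in a body' => fn () => null,
    'first-class callable in a body' => strlen(...),
    'Closure::fromCallable()' => Closure::fromCallable('strlen'),
    'ReflectionMethod::getClosure()' => (new ReflectionMethod(Validators::class, 'check'))->getClosure(),
    'bound closure' => Closure::bind(function () {}, new stdClass()),
];

$refused = [];
foreach ($candidates as $kind => $closure) {
    try {
        serialize($closure);
    } catch (\Exception $e) {
        $refused[$kind] = $e->getMessage();
    }
}

foreach ($refused as $kind => $message) {
    echo $kind, ': ', $message, "\n";
}
```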

The reference

The payload is a tagged union [ <object properties>, [ <tag>, <reference> ] ], matching ext/uri and ext/random: a Closure has no properties, the “const-expr” tag names the reference kind, and an older runtime rejects an unknown tag cleanly. The reference itself is four fields, [class, site, key, hash], derived from the compiled class alone, so every process running the same source computes the same reference, with or without opcache.

site names the declaring element: "" for the class, a constant or enum-case name, $prop for a property, $prop::get() / $prop::set() for a hook, name() for a method. key is the closure's position in that element (an integer) for an anonymous closure, or the callable's name for a first-class callable. Because the reference belongs to one element, editing one element never changes the numbering of another, and resolution walks only the named element.
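As a rough illustration of the site naming, a userland producer could derive the string from a reflection element along these lines. siteFor() and the Order class are hypothetical, and property hooks are left out so the sketch runs before PHP 8.4.

```php
<?php
final class Order
{
    public const STATUS_NEW = 'new';
    public string $billingAddress = '';
    public function validate(): void {}
}

// Hypothetical helper: maps a reflection element to the site string described above.
function siteFor(Reflector $element): string
{
    return match (true) {
        $element instanceof ReflectionClass => '',
        $element instanceof ReflectionClassConstant => $element->getName(),
        $element instanceof ReflectionProperty => '$' . $element->getName(),
        $element instanceof ReflectionMethod => $element->getName() . '()',
        default => throw new InvalidArgumentException('No site naming is described for ' . $element::class),
    };
}

var_dump(siteFor(new ReflectionClass(Order::class)));
var_dump(siteFor(new ReflectionClassConstant(Order::class, 'STATUS_NEW')));
var_dump(siteFor(new ReflectionProperty(Order::class, 'billingAddress')));
var_dump(siteFor(new ReflectionMethod(Order::class, 'validate')));
```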

hash guards an anonymous position: a hash of the closure's code computed at compile time (it ignores whitespace and comments, and opcache stores it at no extra cost). A position alone can come to point at a different closure after an ordinary edit, for example reordering two attributes:

#[A(static fn () => 1)]   // ['$p', 0]
#[B(static fn () => 2)]   // ['$p', 1]   -- swap the two, and ['$p', 0] now names B's closure
public string $p;

On resolution the hash is checked; if it does not match, unserialize() throws instead of returning the wrong closure. A first-class callable needs no hash (its name does not change on its own) and carries 0; so does a producer that cannot compute one (a PHP 8.5 polyfill has no engine access), which then resolves by position.
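The position-plus-hash check can be modeled in userland. The sketch below is a simplified model, not the engine's algorithm: closures are represented by their source text, codeHash() and resolveByPosition() are hypothetical names, and CRC32 over the non-ignorable tokens is an assumed stand-in for the compile-time hash, whose actual function is not specified here. It relies on the tokenizer extension.

```php
<?php
// Assumed hash: CRC32 over the closure's tokens, skipping whitespace and comments.
function codeHash(string $code): int
{
    $normalized = '';
    foreach (PhpToken::tokenize('<?php ' . $code) as $token) {
        if (!$token->isIgnorable()) {
            $normalized .= $token->text;
        }
    }

    return crc32($normalized);
}

// Hypothetical resolver: $declared lists one element's closures (as source) by position.
function resolveByPosition(array $declared, int $position, int $hash): string
{
    if (!isset($declared[$position])) {
        throw new Exception('Unknown position');
    }
    // A hash of 0 means the producer could not compute one: resolve by position only.
    if ($hash !== 0 && codeHash($declared[$position]) !== $hash) {
        throw new Exception('Stale reference');
    }

    return $declared[$position];
}

$before = ['static fn () => 1', 'static fn () => 2'];
$reference = [0, codeHash($before[0])];

// A whitespace or comment edit keeps the reference valid.
$reformatted = ['static fn ()  =>  1 /* note */', 'static fn () => 2'];
echo resolveByPosition($reformatted, ...$reference), "\n";

// Swapping the two attributes makes position 0 name the other closure.
$swapped = array_reverse($before);
try {
    resolveByPosition($swapped, ...$reference);
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}

// Without a hash, the same swap goes undetected.
echo resolveByPosition($swapped, 0, 0), "\n";
```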

A stale, unknown or malformed reference throws an Exception on unserialize(), which cache layers already treat as a miss. References are cache data: they are valid for the code version and PHP version that produced them, like the metadata they are stored with. For a reference that must survive across deploys, name the closure: a first-class callable is keyed by its method name and follows that method across edits.

Security

unserialize() here is a lookup into the fixed set of closures that a class declares about itself, not a way to build any closure from the code base. A forged payload can at most swap one declared closure for another: the same kind of risk as an unserialized enum-case name, and much less than what unserialize() already allows when it rebuilds objects and runs their __wakeup(). allowed_classes gates it at both ends: without Closure in the list the payload becomes __PHP_Incomplete_Class, and resolving a reference takes a closure from its declaring class, which must be listed too.
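The allowed_classes behavior referred to here is the one unserialize() already applies to any class. A minimal sketch with a hypothetical CachedConstraint class standing in for Closure, runnable on current PHP:

```php
<?php
final class CachedConstraint
{
    public string $name = 'callback';
}

$payload = serialize(new CachedConstraint());

// Class not listed: the object comes back as __PHP_Incomplete_Class.
$gated = unserialize($payload, ['allowed_classes' => []]);
var_dump($gated instanceof __PHP_Incomplete_Class); // bool(true)

// Class listed: the object is rebuilt normally.
$allowed = unserialize($payload, ['allowed_classes' => [CachedConstraint::class]]);
var_dump($allowed instanceof CachedConstraint); // bool(true)
```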

This limit is the reason first-class callables serialize only when declared in a constant expression, never from a bare name. If serialize(strlen(...)) worked anywhere, a payload could name system and become a ready-made way to call any function, against any application that unserializes untrusted data and then calls the result.
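The difference between a lookup and a factory can be pictured with a small model. resolveDeclaredCallable() is a hypothetical function and the list of names stands in for what one declaring element actually contains; the engine re-evaluates the declaration rather than consulting a list like this.

```php
<?php
// Hypothetical model: $declaredAtSite lists the callable names one element declares.
function resolveDeclaredCallable(array $declaredAtSite, string $key): Closure
{
    if (!in_array($key, $declaredAtSite, true)) {
        throw new Exception("No callable named '$key' is declared at this site");
    }

    return Closure::fromCallable($key);
}

$declaredAtSite = ['strlen'];

// A key the site declares resolves to a closure.
echo resolveDeclaredCallable($declaredAtSite, 'strlen')('abc'), "\n"; // 3

// A forged key naming an arbitrary function is rejected before anything is built.
try {
    resolveDeclaredCallable($declaredAtSite, 'system');
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}
```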

Reflection and var_export()

No new reflection API is needed. __serialize() working is the “declared in a constant expression?” test, and its payload already gives the declaring class, which for a cross-class first-class callable is neither the scope nor the called class that any current accessor returns. var_export()-based caches (opcache-compiled PHP files) recreate a closure by embedding the serialize() payload:

return \unserialize('O:7:"Closure":2:{i:0;a:0:{}i:1;a:2:{i:0;s:10:"const-expr";i:1;a:4:{i:0;s:9:"App\\Order";i:1;s:15:"$billingAddress";i:2;i:0;i:3;i:1244831918;}}}');

for anonymous and non-public references, and plain \Foo::bar(...) for public callables, as Symfony's VarExporter already does (symfony#61657). Closure gains __serialize()/__unserialize() as regular methods; var_export() and json_encode() are unchanged.
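To show how the embedded line relates to the reference, the sketch below rebuilds it by hand from the four fields, assuming the payload layout given in this RFC. It deliberately does not call unserialize() on the result, since that only succeeds once the proposal is implemented; the variable names are illustrative.

```php
<?php
// The two-element payload: [ <object properties>, [ <tag>, <reference> ] ].
$data = [[], ['const-expr', ['App\Order', '$billingAddress', 0, 1244831918]]];

// Objects using __serialize() are encoded as O:<len>:"<class>":<count>:{<array body>},
// so the object payload is the array payload with its "a:" prefix swapped out.
$payload = 'O:7:"Closure":' . substr(serialize($data), 2);

// var_export() produces the quoted, escaped string literal to embed in the cache file.
$line = 'return \unserialize(' . var_export($payload, true) . ');';

echo $line, "\n";
```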

Design notes

Why this reference. Scoping the reference to one element keeps resolution fast (it walks only that element) and keeps unrelated edits from changing it. It is derived from the compiled class, so userland can reproduce it without the engine, and a PHP 8.5 polyfill creates and resolves the same references, byte for byte, for the closures it can address (tested against the engine, the ext-deepclone extension and a pure-PHP polyfill). The engine computes it when needed instead of storing a table in every compiled class, which would cost memory in opcache; the result is the same either way.
The hash is a staleness check, not a security control. What a payload can name is limited by resolution alone and does not depend on the hash; a producer that cannot compute it simply resolves by position.
One RFC for both forms. Anonymous closures and first-class callables share the payload, the resolution and the security model, and the callable half alone would not fix the anonymous case that motivates the RFC.

Backward Incompatible Changes

No syntax or runtime behavior changes for code that does not serialize closures. Three observable changes:

serialize() now works where it threw before, for closures declared in a class's constant expressions. Code that used the exception to detect closures changes behavior for those; every closure with state or created at runtime still refuses.
unserialize() now accepts O:7:"Closure":... payloads (previously “Unserialization of 'Closure' is not allowed”). Consumers passing allowed_classes are unaffected unless they list Closure.
method_exists($closure, '__serialize') now returns true.

Proposed PHP Version(s)

Next minor version (PHP 8.6).

RFC Impact

opcache / reflection / standard / session: no changes required. serialize(), unserialize() and allowed_classes pick the feature up through the regular __serialize()/__unserialize() protocol; references resolve identically with and without opcache and its file cache.
Ecosystem: metadata caches (Symfony, Doctrine, API Platform, PSR-6 marshallers) start working with closure-carrying attributes with no code changes on the serialize() path. External serializers that support __serialize() (igbinary, msgpack) pick it up on their own. This RFC does not cover closures with state (captured variables, bound $this).

Future Scope

Class constant values and property defaults: serializing const V = static function () {...}; requires the site to stay addressable after its initializer is evaluated in place, either by retaining the expression or by having the compiler register the closure in the class. An engine change deserving its own RFC.
Global constants and free-function attributes: addressable in principle by name, but functions and constants are not autoloadable, which weakens resolution.
Name-based serialization of runtime-created named closures: making serialize(strlen(...)) work anywhere. This would be a pure widening: the two payload kinds can coexist, possibly behind an opt-in flag. It is deliberately left out here.
Native var_export() support: emitting the unserialize(...) / \Foo::bar(...) recreation from var_export() itself. The callable half needs no RFC and can land independently.

Proposed Voting Choices

Voting opens YYYY-MM-DD and closes YYYY-MM-DD. A single vote, 2/3 majority.

Make closures declared in constant expressions serializable as references to their declaration site, as proposed?

Patches and Tests

Implementation: https://github.com/php/php-src/pull/22716

Tests under Zend/tests/closures/closure_const_expr/ cover: round-trips from every attribute target and parameter default, nested and runtime-nested declarations; first-class callable sites (functions, own private/protected methods, inherited self::/parent:: with distinct bindings, cross-class references, duplicate references); refusals for every row of the table above; forged and stale payloads (unknown class or site, hash mismatch, resolution without a hash, a key naming a callable the site never declares such as “system”); allowed_classes gating; shared instances in an object graph and __wakeup() ordering; the same behavior with opcache (memory and file cache) and under JIT.

References

Changelog

0.3: Reduced the public surface to serialize()/unserialize() through the regular __serialize()/__unserialize() protocol, dropping the earlier reflection/reconstruction API (Closure::fromConstExpr(), ReflectionFunction::getConstExprId()/getConstExprClass()): the reference and the declaring class already travel in the payload, so the reference becomes a reproducible detail of the payload rather than API. With the id no longer exposed, laid the reference out as four fields [class, site, key, hash] rather than packing the site, position and hash into one “<site>@<key>#<hash>” string. Shortened the document and led with the generic-caching motivation.
0.2: Gave the id to the reflection element (“<site>@<key>”); first-class callable references keyed by the callable's name (no position, no hash) and normalized to their first declaring element; anonymous references verified by a code hash inside the id (“<site>@<rank>#<hash>”, optional for producers that cannot compute it); tagged-union payload.
0.1: Initial draft.
