PHP RFC: Deprecate json_encode() on classes marked as non-serializable
- Version: 0.11
- Date: 2024-09-05
- Author: Philip Hofstetter, phofstetter@sensational.ch
- Status: Under Discussion
- First Published at: http://wiki.php.net/rfc/deprecate-json_encode-nonserializable
Introduction
PHP internally marks some classes as not fit for serialization using the flag ZEND_ACC_NOT_SERIALIZABLE, which prevents instances of such classes from being serialized using serialize().
However, this flag is currently not respected by json_encode(), which is another serialization method built into PHP and which has special language support for userland through JsonSerializable. json_encode() will encode instances of these internal classes like any other object, which for most of them yields {}, regardless of serializability.
This is especially harmful for Generator, because converting a pre-computed array into a lazy iterator using Generator is a useful and relatively common refactoring which can otherwise be done transparently to a code-base. At that point it's inconvenient that json_encode() silently encodes Generator instances as {}, changing the shape and contents of the output.
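The refactoring described above can be sketched as follows. This is a minimal illustration, not code from the RFC; the function names loadIdsEagerly() and loadIdsLazily() are hypothetical. It shows that foreach treats both variants identically, while json_encode() on a PHP version without this RFC silently produces a different shape.

```php
<?php
// Hypothetical helper names, used for illustration only.

// Before the refactoring: the result is built eagerly.
function loadIdsEagerly(): array
{
    return [1, 2, 3];
}

// After the refactoring: the same values are produced lazily.
function loadIdsLazily(): Generator
{
    yield 1;
    yield 2;
    yield 3;
}

// foreach works identically with both variants.
$sumEager = 0;
foreach (loadIdsEagerly() as $id) {
    $sumEager += $id;
}

$sumLazy = 0;
foreach (loadIdsLazily() as $id) {
    $sumLazy += $id;
}

// json_encode() does not: the generator is never iterated.
$eagerJson = json_encode(loadIdsEagerly()); // [1,2,3]
$lazyJson  = json_encode(loadIdsLazily());  // {}

echo $eagerJson, PHP_EOL, $lazyJson, PHP_EOL;
```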
Proposal
This RFC proposes to deprecate calling json_encode() on most instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE, with the longer-term option of throwing an error in the major version of PHP following the one in which this RFC is implemented.
The flag ZEND_ACC_NOT_SERIALIZABLE was intended to mark classes as non-serializable because they either represent a temporary local resource (like a file or database handle) which could not possibly be unserialized later on, or because serializing them could have large side-effects (in the case of Generator and Iterator).
The same reasoning applies to json_encode(), which right now doesn't invoke any of the side-effects (good) but also silently encodes any such object instance as {}.
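The difference between the two mechanisms can be observed directly. The following is a small sketch, assuming a PHP version without this RFC applied; the function name numbers() is hypothetical. serialize() refuses the generator by throwing an exception, while json_encode() accepts it without complaint.

```php
<?php
// Hypothetical generator function, used for illustration only.
function numbers(): Generator
{
    yield 1;
    yield 2;
}

// serialize() respects the non-serializable flag and throws.
$serializeRejected = false;
try {
    serialize(numbers());
} catch (Exception $e) {
    $serializeRejected = true;
    echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}

// json_encode() ignores the flag and encodes an empty object.
$json = json_encode(numbers());
echo $json, PHP_EOL;
```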
For temporary resources (file handles, etc.) this is potentially acceptable behavior, albeit somewhat inconsistent with how serialization is handled. For Generator, however, doing this silently is very inconvenient for a developer in the process of converting a code-base from pre-built arrays to generators for either performance or memory consumption reasons.
This conversion can be done mostly transparently to the rest of the code-base, but it requires special handling for any json_encode() call, which currently silently does the wrong thing: it not only skips iterating the generator but also changes the shape of the output, potentially breaking API contracts without any notification to the user.
One exception to the rule is anonymous classes, which are all marked as ZEND_ACC_NOT_SERIALIZABLE because their definition will not be present when unserializing, so they cannot possibly be unserialized again.
In the case of json_encode(), though, where no generic unserialize operation is defined anyway, there's no reason to deprecate or forbid calling json_encode() on anonymous classes, unless their parent class is marked as ZEND_ACC_NOT_SERIALIZABLE, in which case the above reasoning applies again.
Thus, this RFC proposes to continue to permit json_encode() on anonymous classes unless they extend a class marked as ZEND_ACC_NOT_SERIALIZABLE.
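The anonymous class exception can be illustrated with a short sketch. The property names are made up for the example. serialize() rejects the instance because the class definition could not be restored later, whereas json_encode() produces meaningful output, which this RFC proposes to keep permitting without a deprecation.

```php
<?php
// An anonymous class with two public properties (illustrative names).
$point = new class {
    public int $x = 1;
    public int $y = 2;
};

// Anonymous classes cannot be serialized: serialize() throws.
$serializeRejected = false;
try {
    serialize($point);
} catch (Exception $e) {
    $serializeRejected = true;
}

// JSON encoding only needs the public properties, so it works fine.
$json = json_encode($point); // {"x":1,"y":2}
echo $json, PHP_EOL;
```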
Other options considered
This RFC proposes a solution that handles all classes block-listed for serialization, creating consistent behavior between the two serialization mechanisms built into PHP.
Other options considered concern themselves with handling just the Generator case:
- Add a special case to deprecate/disallow JSON encoding of Generator, but otherwise not look at ZEND_ACC_NOT_SERIALIZABLE. This would be a proposed fallback option if the backwards compatibility concerns are too large to consider. It would complicate the implementation considerably.
- Have json_encode() consume the generator and recurse as if it was encoding an array. While this would probably be the most ergonomic solution for the refactoring case outlined above, given the unforeseeable side-effects generator consumption can have, including endless loops, this is a dangerous operation and was thus discarded as an option.
- Have json_encode() encode generators as []: This would help the refactoring case by upholding possible API contracts and would more cleanly match the shape of a generator (which is a list after all), but it would also be lying to the calling code because the generator likely won't be empty.
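Since none of these options has json_encode() consume the generator implicitly, code that does want the generated values in its JSON output can opt in explicitly. The following is a minimal sketch, assuming the generator is finite and that running it has no unwanted side-effects; the function name rows() is hypothetical.

```php
<?php
// Hypothetical finite generator, used for illustration only.
function rows(): Generator
{
    yield 1;
    yield 2;
    yield 3;
}

// Explicitly consume the generator before encoding. Passing false for
// $preserve_keys re-indexes the values, so the result encodes as a JSON list.
$json = json_encode(iterator_to_array(rows(), false)); // [1,2,3]
echo $json, PHP_EOL;
```

Making the consumption explicit keeps the decision about side-effects and termination with the calling code rather than with json_encode().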
Impacted internal classes
At the time of writing this RFC, the following classes (and their subclasses) are affected by this RFC, and calling json_encode() on them will raise a deprecation warning in the future.
Some of those are containers of a sort, where this encoding is especially misleading (aside from Generator, which was the motivator for this RFC, WeakMap stands out specifically).
Most of the non-serializable classes have no public properties and thus encode as {}.
Those which currently do have public properties are still mostly meant for internal usage and are thus not ideal candidates for json_encode(). However, if code wants to explicitly turn such instances into JSON in light of the proposed deprecation, casting them to array before JSON encoding is a valid workaround.
$a = new SimpleXMLElement('<a><b>3</b><c>foo</c></a>');
echo json_encode($a); // {"b":"3","c":"foo"}, with deprecation warning
echo json_encode((array) $a); // {"b":"3","c":"foo"}, no deprecation warning
Classes with public fields appearing in json_encode() output
- CURLFile (has three public properties)
- PDORow (has one public property, queryString)
- PDOStatement (has one public property, queryString)
- SimpleXMLElement (might be useful)
- ReflectionAttribute
- ReflectionClass
- ReflectionClassConstant
- ReflectionConstant
- ReflectionExtension
- ReflectionFiber
- ReflectionFunctionAbstract
- ReflectionGenerator
- ReflectionParameter
- ReflectionProperty
- ReflectionReference
- ReflectionType
- ReflectionZendExtension
Backed by temporary resources
- AddressInfo
- Collator
- CurlHandle
- CurlMultiHandle
- CurlShareHandle
- DOMXPath
- Dba\Connection
- DeflateContext
- Dom\Implementation
- Dom\NamespaceInfo
- Dom\TokenList
- Dom\XPath
- Dom\XMLDocument
- EnchantBroker
- EnchantDictionary
- FFI
- FFI\CData
- FFI\CType
- FTP\Connection
- GdFont
- GdImage
- InflateContext
- IntlBreakIterator
- IntlCalendar
- IntlCodePointBreakIterator
- IntlDateFormatter
- IntlDatePatternGenerator
- IntlIterator
- IntlPartsIterator
- IntlRuleBasedBreakIterator
- IntlTimeZone
- LDAP\Connection
- LDAP\Result
- LDAP\ResultEntry
- MessageFormatter
- NumberFormatter
- Odbc\Connection
- Odbc\Result
- OpenSSLAsymmetricKey
- OpenSSLCertificate
- OpenSSLCertificateSigningRequest
- PDO
- Pdo\Dblib
- Pdo\Firebird
- Pdo\Mysql
- Pdo\Odbc
- Pdo\Pgsql
- Pdo\Sqlite
- PgSql\Connection
- PgSql\Lob
- PgSql\Result
- Random\Engine\Secure
- ResourceBundle
- SQLite3
- SQLite3Result
- SQLite3Stmt
- Shmop
- Soap\Sdl
- Soap\Url
- Socket
- SplFileInfo
- Spoofchecker
- SysvMessageQueue
- SysvSemaphore
- SysvSharedMemory
- Transliterator
- UConverter
- XMLParser
- finfo
- variant
Other
- Closure
- Fiber
- Generator
- InternalIterator
- SensitiveParameterValue
- WeakMap
- WeakReference
Backward Incompatible Changes
Code which accidentally runs json_encode() over instances of classes marked as non-serializable, or over larger structures which contain such instances, will now raise a deprecation warning where previously there was none.
Given that the encoded output was mostly useless for any consumer of such JSON and given that producing the previous output manually is not hard, it's the belief of this RFC that the deprecation warning provides more value than the current behavior, because current invocations of json_encode() over instances of non-serializable classes are likely unintentional (given the current output of json_encode()).
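As an illustration of the "larger structures" case and of reproducing the previous output by hand, consider the sketch below. It assumes a PHP version without this RFC for the first call; the function name items() and the payload keys are hypothetical. Substituting an empty stdClass is one possible way to keep emitting {} deliberately, without encoding a non-serializable instance.

```php
<?php
// Hypothetical generator nested inside a larger payload.
function items(): Generator
{
    yield 'a';
    yield 'b';
}

// The generator is encoded as an empty object inside the structure.
$accidental = json_encode(['count' => 2, 'items' => items()]);

// The same output, produced intentionally with an empty stdClass.
$manual = json_encode(['count' => 2, 'items' => new stdClass()]);

echo $accidental, PHP_EOL, $manual, PHP_EOL;
```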
Proposed PHP Version(s)
PHP 8.5
RFC Impact
To SAPIs
The deprecation warning will be raised in all SAPIs.
To Existing Extensions
None
To Opcache
None
The test case included in the PR was run with Opcache enabled and produced the expected result.
New Constants
None
Open Issues
None
Unaffected PHP Functionality
Any other argument to json_encode() is unaffected.
Future Scope
In the next major version after this RFC passes, the deprecation warning can be changed to an Error, though this will be part of a separate RFC.
Proposed Voting Choices
Should calling json_encode() on instances of classes marked with ZEND_ACC_NOT_SERIALIZABLE be marked as deprecated? Yes, No?