
PHP RFC: Deprecations for PHP 8.3

Introduction

The RFC proposes to deprecate the listed functionality in PHP 8.3 and remove it in PHP 9.

The following list gives a short overview of the functionality targeted for deprecation; a more detailed explanation is provided in the Proposal section:

  • Passing negative $widths to mb_strimwidth()
  • The NumberFormatter::TYPE_CURRENCY constant
  • MT_RAND_PHP
  • Global Mersenne Twister

Proposal

Each feature proposed for deprecation is voted on separately and requires a 2/3 majority. All votes refer to deprecation in PHP 8.3 and removal in PHP 9.0.

Passing negative $widths to mb_strimwidth()

Regarding the $width argument of mb_strimwidth(), the PHP manual states: “Negative widths count from the end of the string.” In other words, if $width is -2, then either two halfwidth characters or one fullwidth character should be trimmed from the end of the string.

This feature was introduced in git revision 70187267b4 (January 2016), and was contributed by Francois Laupretre (francois@php.net).

Although I have not seen anything written by F. Laupretre to explain what the anticipated usage of the feature was, it seems to have very limited utility. mb_strimwidth() is typically used to trim strings down to a length which can be printed in a terminal without wrapping. It seems very unusual that anyone would want to set the terminal width of a string to “its current value less N”. (One possibility is that the feature was introduced for consistency with other standard library functions which accept a negative argument to indicate “count back from the end of a string”.)

From the time the feature was merged until now, it has always had a bug when combined with a non-zero $from argument. The implementation does arithmetic which combines codepoint counts with terminal width counts, with erroneous results. (This is just like the proverbial “apples and oranges”.) If there are any fullwidth characters in the prefix which is skipped because of non-zero $from, then mb_strimwidth() will not trim the requested width from the end of the string.

It is notable that in the roughly 7 years since the feature was merged (January 2016), no user has noticed and reported this bug. My guess is that almost no one uses the negative width feature, which would explain why the bug went unnoticed.

To implement the feature correctly, without the bug mentioned above, an extra pass over the prefix of the input string identified by $from would be needed to determine its terminal width. This operation requires O(n) time.
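As a rough illustration of that extra pass, the sketch below emulates "trim N columns from the end" in userland, measuring the skipped prefix with mb_strwidth() instead of passing a negative $width. The helper name trim_width_from_end() is hypothetical and not part of mbstring. The sketch assumes UTF-8 input and a non-negative $from given as a character offset.

```php
<?php
// Hypothetical helper (not part of mbstring): emulate a negative $width
// without passing one, by measuring terminal widths explicitly.
function trim_width_from_end(string $str, int $from, int $columns): string
{
    $encoding = 'UTF-8'; // assumption: the input is UTF-8

    // Extra O(n) pass: terminal width of the prefix skipped by $from.
    $prefixWidth = mb_strwidth(mb_substr($str, 0, $from, $encoding), $encoding);

    // Width of the remaining text, less the columns to drop from the end.
    $target = mb_strwidth($str, $encoding) - $prefixWidth - $columns;

    // Clamp at zero so an over-large $columns yields an empty string.
    return mb_strimwidth($str, $from, max(0, $target), '', $encoding);
}
```

With these assumptions, trim_width_from_end('Hello World', 0, 2) yields 'Hello Wor'. Because the prefix is measured in terminal columns rather than codepoints, fullwidth characters before $from no longer distort the result.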

When preparing this RFC, I searched the Internet for existing open-source software which uses mb_strimwidth() and reviewed over 100 such projects. None of them used negative $width arguments. However, if readers are aware of existing projects which rely on the negative $width feature, please add that information here.

The NumberFormatter::TYPE_CURRENCY constant

This constant has been unused since it was added to PHP (in PHP 5.3 or before). It was likely meant to call NumberFormatter::formatCurrency() and NumberFormatter::parseCurrency() from NumberFormatter::format() and NumberFormatter::parse() respectively. However, this was never implemented, likely because format() and parse() offer no way to pass the necessary currency argument.

MT_RAND_PHP

Authors: Tim Düsterhus (timwolla@php.net), Go Kudo (zeriyoshi@php.net)

The implementation of Mt19937 (“Mersenne Twister”) in PHP versions before 7.1 contains two bugs:

  • mt_rand() without parameters returns different numbers than the reference implementation, due to a typo in a variable name.
  • mt_rand($min, $max) with a restricted range uses a broken scaling algorithm based on floating point arithmetic. As doubles have only 53 bits of precision, this will introduce a bias if a range larger than 53 bits is requested.
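The 53-bit limit mentioned in the second item can be seen in a few lines. This is only a sketch of the precision loss itself, not of the legacy scaler, and it assumes a 64-bit PHP build where integers are 64 bits wide.

```php
<?php
// Assumes a 64-bit PHP build, where 2 ** 53 still fits in an int.
$exact = 2 ** 53;      // 9007199254740992
$next  = $exact + 1;   // a distinct integer

// Converting to float (an IEEE 754 double) drops the lowest bit,
// so both integers map to the same floating point value.
$collides = ((float) $exact === (float) $next);

var_dump($collides);
```

Any scaler that routes large integers through a double therefore cannot reach every value of a sufficiently wide range.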

Both of these issues were fixed in PHP 7.1 by the RNG fixes and changes RFC, which also aliased rand() to mt_rand(). The MT_RAND_PHP constant was added so that developers who rely on a specific sequence for a given seed can opt into the old implementation, with its non-standard Mt19937 algorithm and biased scaler.

The object-oriented Mt19937 engine that was introduced in the Random Extension 5.x RFC in PHP 8.2 also supports the MT_RAND_PHP option to allow developers a smooth migration from the global instance of Mt19937 as used with mt_rand() to the object-oriented Random\Randomizer.

Supporting the biased scaler with the object-oriented API requires a special case for Mt19937 in the internal implementation. It furthermore required a fix (PR 9197), because the biased scaler in PHP 8.1 and earlier exhibits undefined behavior that was only caught by the newly added tests in PHP 8.2. Thus the behavior for certain inputs changed between PHP 8.1 and PHP 8.2 and could not be relied on even in older versions, as the results depend on the compiler used.

The only purpose of the constant and its broken mode is backwards compatibility, but this cannot be achieved, because the bad scaler is broken by design and relies on undefined behavior. As such it fails at its sole purpose. The bad scaling is also opaque to the developer, as it silently returns biased results for certain inputs.

To clean up the special cases in the internal implementation and to clean up the API for the developer, MT_RAND_PHP should be deprecated.

Deprecate the broken pre-PHP 7.1 Mt19937 implementation (MT_RAND_PHP)?
Real name Yes No
Final result: 0 0
This poll has been closed.

Global Mersenne Twister

Authors: Tim Düsterhus (timwolla@php.net), Go Kudo (zeriyoshi@php.net)

Before PHP 8.2, PHP provided two kinds of random number generators: a seedable random number generator using the Mt19937 (“Mersenne Twister”) algorithm and a cryptographically secure random number generator (CSPRNG). The former stores its state in an implicit global variable and the latter is not seedable. This made it hard to achieve reproducible results for testing, as the global variable makes it hard to determine which operations modify the Mt19937 state and thereby change future values.

To fix this, the Random Extension 5.x RFC for PHP 8.2 introduced object-based random number algorithms (“Engines”) that store their entire state within an object. Generating numbers with one object won't affect the sequence of another object.
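A minimal sketch of that isolation, assuming PHP 8.2 or later; the seeds and the 1 to 100 range are arbitrary choices for illustration:

```php
<?php
// Reference run: the first two values produced for seed 1234.
$reference = new \Random\Randomizer(new \Random\Engine\Mt19937(1234));
$expected  = [$reference->getInt(1, 100), $reference->getInt(1, 100)];

// Two independent generators, each holding its own state.
$a = new \Random\Randomizer(new \Random\Engine\Mt19937(1234));
$b = new \Random\Randomizer(new \Random\Engine\Mt19937(5678));

$actual   = [$a->getInt(1, 100)];
$b->getInt(1, 100);               // drawing from $b leaves $a untouched
$actual[] = $a->getInt(1, 100);

var_dump($actual === $expected);
```

With the global Mt19937, the interleaved draw would have advanced the shared state and shifted every later value.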

The API of the random extension that relates to the generation of random integers currently looks like the following:

  • Global Mt19937
  • CSPRNG
    • random_int()
  • \Random\Randomizer
    • ->getInt()

Thus there are three functions returning a random integer (mt_rand(), rand(), random_int()) and the object-based and pluggable Randomizer::getInt().

The functions using the global Mt19937 instance are the worst choice: they are neither cryptographically secure nor properly reproducible, yet they are also the easiest to use, as their names are the shortest.

To clean up the API and avoid the “global state” gotcha, the global Mt19937 should be removed. The function-based API will then provide just the secure random_int() function, which is the safe default choice. If a reproducible sequence is desired, the object-oriented API provides a drop-in replacement for the global Mt19937 using the \Random\Engine\Mt19937 class. Thus the following functions shall be deprecated:

A userland replacement can be written by leveraging $GLOBALS to store an \Random\Randomizer object with a \Random\Engine\Mt19937 engine.
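One possible shape of such a replacement is sketched below, assuming PHP 8.2 or later. The function names userland_mt_srand() and userland_mt_rand() and the $GLOBALS key are hypothetical, and only the ranged form of mt_rand() is covered.

```php
<?php
// Hypothetical shims; the names and the $GLOBALS key are illustrative only.
function userland_mt_srand(?int $seed = null): void
{
    // A null seed lets the engine seed itself randomly.
    $GLOBALS['userland_randomizer'] = new \Random\Randomizer(
        new \Random\Engine\Mt19937($seed)
    );
}

function userland_mt_rand(int $min, int $max): int
{
    // Lazily create a randomly seeded instance on first use.
    if (!isset($GLOBALS['userland_randomizer'])) {
        userland_mt_srand();
    }

    return $GLOBALS['userland_randomizer']->getInt($min, $max);
}
```

The sketch deliberately keeps the global state that the RFC criticizes, so it suits legacy code paths better than new code.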

Deprecate the global Mt19937?
Real name Yes No
Final result: 0 0
This poll has been closed.

In addition to the 6 functions related to integer generation, the global Mt19937 is also used for the following functions:

The proposed deprecation and removal of mt_srand() will affect these functions, as they will no longer be seedable. A decision therefore needs to be taken on how these functions should behave if the global Mt19937 is removed. There are two options: deprecate and remove them together with mt_srand(), or convert them to use the CSPRNG.

What to do with the non-integer functions using the global Mt19937 if the previous vote passes?
Real name Deprecate and remove together with mt_srand() Convert to CSPRNG
Final result: 0 0
This poll has been closed.

Backward Incompatible Changes

In PHP 8.3, additional deprecation notices will appear. In PHP 9.0, the previously deprecated functionality will no longer be available.

rfc/deprecations_php_8_3.txt · Last modified: 2022/09/26 17:22 by timwolla