PHP RFC: Add Random Extension

Introduction

PHP currently has problems with RNG reproducibility.

PHP's RNG has been unified into an implementation based on the Mersenne Twister: in PHP 7.1, the rand() and srand() functions became aliases for mt_rand() and mt_srand() respectively.

However, these functions still keep their state in PHP's global state, so their results are not easily reproducible. Consider the following example.

echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290
 
function foo(int $seed, callable $bar): int {
    mt_srand($seed);
    $result = mt_rand();
    $bar();
    $result += mt_rand();
    return $result;
}

As shown above, the reproducibility of random numbers is easily lost when additional processing is added later.

In addition, Fibers were introduced in PHP 8.1, which makes it even more difficult to keep track of the execution order. However, this problem has existed ever since Generators were introduced.
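
The effect of interleaved execution can be sketched with a Generator and the existing global functions. This is a minimal illustration rather than part of the proposal; the function names noisyNumbers() and drawTwo() are hypothetical.

```php
<?php
// Sketch: a generator that draws from the global Mersenne Twister state.
function noisyNumbers(): Generator {
    while (true) {
        yield mt_rand(); // advances the shared global state
    }
}

// Draws two values for the caller; optionally resumes the generator in between.
function drawTwo(int $seed, bool $resumeGenerator): array {
    mt_srand($seed);
    $gen = noisyNumbers();   // creating the generator runs no code yet
    $first = mt_rand();
    if ($resumeGenerator) {
        $gen->current();     // runs to the first yield, consuming one value
    }
    $second = mt_rand();
    return [$first, $second];
}

$plain = drawTwo(1234, false);
$interleaved = drawTwo(1234, true);

var_dump($plain[0] === $interleaved[0]); // bool(true): same seed, same first value
var_dump($plain[1] === $interleaved[1]); // bool(false): the generator consumed a value in between
```

The caller of drawTwo() never touches the generator's values, yet its own second draw changes depending on whether the generator was resumed.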

There is also the problem of functions that implicitly use the state stored in PHP's global state. The shuffle(), str_shuffle(), and array_rand() functions implicitly advance the state of the random number generator. This means that the following code is not reproducible, yet it is difficult for the user to notice.

mt_srand(1234);
echo mt_rand() . PHP_EOL; // Result: 411284887
 
mt_srand(1234);
str_shuffle('foobar');
echo mt_rand() . PHP_EOL; // Result: 1314500282

Proposal

Implement a Random extension and bundle it with PHP.

The PHP stub for the whole extension is as follows:

<?php
 
/** @generate-class-entries */
/** @generate-function-entries */
 
namespace Random
{
    interface NumberGenerator
    {
        public function generate(): int;
    }
}
 
namespace Random\NumberGenerator
{
    class XorShift128Plus implements \Random\NumberGenerator
    {
        public function __construct(?int $seed = null) {}
 
        public function generate(): int {}
 
        public function __serialize(): array {}
 
        public function __unserialize(array $data): void {}
    }
 
    class MT19937 implements \Random\NumberGenerator
    {
        /** @implementation-alias Random\NumberGenerator\XorShift128Plus::__construct */
        public function __construct(?int $seed = null) {}
 
        /** @implementation-alias Random\NumberGenerator\XorShift128Plus::generate */
        public function generate(): int {}
 
        /** @implementation-alias Random\NumberGenerator\XorShift128Plus::__serialize */
        public function __serialize(): array {}
 
        /** @implementation-alias Random\NumberGenerator\XorShift128Plus::__unserialize */
        public function __unserialize(array $data): void {}
    }
 
    class Secure implements \Random\NumberGenerator
    {
        public function __construct() {}
 
        /** @implementation-alias Random\NumberGenerator\XorShift128Plus::generate */
        public function generate(): int {}
    }
}
 
namespace
{
    final class Random
    {
        // FIXME: the stub generator (gen_stub.php) does not support this typed property.
        // private Random\NumberGenerator $rng;
        private mixed $rng;
 
        public function __construct(?Random\NumberGenerator $rng = null) {}
        public function getNumberGenerator(): Random\NumberGenerator {}
        public function nextInt(): int {}
        public function getInt(int $min, int $max): int {}
        public function getBytes(int $length): string {}
        public function shuffleArray(array $array): array {}
        public function shuffleString(string $string): string {}
        public function __serialize(): array {}
        public function __unserialize(array $data): void {}
    }
}

Each RNG is implemented as a class in the Random\NumberGenerator namespace. They all implement the Random\NumberGenerator interface.

The bundled RNGs are as follows:

  • Random\NumberGenerator\XorShift128Plus: 64-bit, reproducible, PRNG.
  • Random\NumberGenerator\MT19937: 32-bit, reproducible, PRNG, compatible with mt_srand() / mt_rand().
  • Random\NumberGenerator\Secure: 64-bit, non-reproducible, CSPRNG, uses php_random_bytes() internally.

The Random class uses XorShift128+ by default. It generates 64-bit values, is used by major browsers, and is fast and reliable. However, when XorShift128+ is used in a 32-bit environment, the upper 32 bits are always truncated. This means that results are not compatible across platforms, but this is not a problem in practice, since most platforms running PHP today are 64-bit and MT19937 can be used explicitly when compatibility is required.
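
As a rough illustration of the truncation described above, the following sketch masks off the upper 32 bits of a 64-bit value. The input value is hypothetical, a 64-bit PHP build is assumed, and how the extension represents the truncated value on a real 32-bit build (where integers are signed 32-bit) is not specified here.

```php
<?php
// Assumes a 64-bit PHP build (PHP_INT_SIZE === 8).
$value64 = 0x123456789ABCDEF0; // hypothetical 64-bit generator output

// Keep only the lower 32 bits, discarding the upper half.
$value32 = $value64 & 0xFFFFFFFF;

printf("%016x\n", $value64); // 123456789abcdef0
printf("%016x\n", $value32); // 000000009abcdef0
```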

Note that (new Random(new Random\NumberGenerator\MT19937($seed)))->nextInt() requires an additional one-bit right shift (>> 1) to produce a result equivalent to mt_rand(). mt_rand() performs this shift internally, although there is no obvious reason for it.
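
The shift can be observed from userland with existing functions: mt_getrandmax() is 2^31 - 1, so mt_rand() exposes only 31 bits. The sketch below applies the same shift to a hypothetical raw 32-bit value chosen purely for illustration; it assumes a 64-bit PHP build.

```php
<?php
// mt_rand() never exceeds 2^31 - 1, since only 31 bits are exposed.
var_dump(mt_getrandmax() === 0x7FFFFFFF); // bool(true)

// Hypothetical raw 32-bit output, for illustration only.
$raw = 0xDEADBEEF;
$shifted = $raw >> 1; // the shift a caller would apply to match mt_rand()'s range

var_dump($shifted <= mt_getrandmax()); // bool(true)
printf("%08x\n", $shifted);            // 6f56df77
```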

Secure is practically equivalent to random_int() and random_bytes(). It is useful when secure array or string shuffling is required.

The Random class also supports RNGs defined in userland. Pass an instance of a class that implements the Random\NumberGenerator interface, which is provided alongside it, as the first constructor argument. This is useful for unit testing or when you want to use fixed numbers.

class UserDefinedRNG implements Random\NumberGenerator
{
    protected int $current = 0;
 
    public function generate(): int
    {
        return ++$this->current;
    }
}
 
function foobar(Random $random): void {
    for ($i = 0; $i < 9; $i++) {
        echo $random->nextInt();
    }
}
 
foobar(new Random(new UserDefinedRNG())); // Results: 123456789
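
For the unit-testing case, a generator that replays a fixed list of values may be more practical than a counter. The sketch below is self-contained: NumberGeneratorSketch is a hypothetical stand-in for the proposed interface, and FixedSequence and rollDie() are illustrative names, not part of this RFC.

```php
<?php
// Stand-in for the proposed Random\NumberGenerator interface (hypothetical name).
interface NumberGeneratorSketch
{
    public function generate(): int;
}

// Replays a fixed list of values, wrapping around at the end.
final class FixedSequence implements NumberGeneratorSketch
{
    private int $position = 0;

    /** @param int[] $values non-empty, zero-indexed list */
    public function __construct(private array $values)
    {
        if ($values === []) {
            throw new InvalidArgumentException('At least one value is required.');
        }
    }

    public function generate(): int
    {
        $value = $this->values[$this->position];
        $this->position = ($this->position + 1) % count($this->values);
        return $value;
    }
}

// Code under test: maps a non-negative generated number onto a six-sided die.
function rollDie(NumberGeneratorSketch $rng): int
{
    return ($rng->generate() % 6) + 1;
}

$rng = new FixedSequence([0, 7, 11]);
echo rollDie($rng), rollDie($rng), rollDie($rng), PHP_EOL; // 126
```

Because the sequence is known in advance, a test can assert exact results without seeding anything.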

Also, as with MT, various alternative C APIs that use the Random class will be provided.

/* similar to php_mt_rand() */
uint64_t php_random_next(php_random *php_random, bool shift);
 
/* similar to php_mt_rand_range() */
zend_long php_random_range(php_random *php_random, zend_long min, zend_long max);
 
/* similar to php_array_data_shuffle() */
void php_random_array_data_shuffle(php_random *php_random, zval *array);
 
/* similar to php_string_shuffle() */
void php_random_string_shuffle(php_random *php_random, char *str, zend_long len);

The Random class can be serialized using the standard PHP serialization mechanism. However, if the $rng member is not serializable, an Exception is thrown.

// can be serialized
$foo = new Random(new Random\NumberGenerator\XorShift128Plus());
for ($i = 0; $i < 10; $i++) { $foo->nextInt(); }
var_dump(unserialize(serialize($foo))->nextInt() === $foo->nextInt()); // true
 
// cannot be serialized
$foo = new Random(new Random\NumberGenerator\Secure());
for ($i = 0; $i < 10; $i++) { $foo->nextInt(); }
var_dump(unserialize(serialize($foo))->nextInt() === $foo->nextInt()); // throws Exception: Serialization of CLASS is not allowed.
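
A userland generator would need to save and restore its own state to take part in this. The following is a minimal sketch using __serialize() and __unserialize() on a small linear congruential generator; the class LcgSketch is hypothetical, is not one of the bundled RNGs, and assumes a 64-bit PHP build so that the multiplication stays within integer range.

```php
<?php
// Hypothetical userland generator: a small linear congruential generator (LCG).
final class LcgSketch
{
    public function __construct(private int $state = 1) {}

    public function generate(): int
    {
        // Advance the state and keep it within 31 bits.
        $this->state = ($this->state * 1103515245 + 12345) & 0x7FFFFFFF;
        return $this->state;
    }

    // Export the full internal state.
    public function __serialize(): array
    {
        return ['state' => $this->state];
    }

    // Restore the state exactly as it was exported.
    public function __unserialize(array $data): void
    {
        $this->state = $data['state'];
    }
}

$rng = new LcgSketch(1234);
for ($i = 0; $i < 10; $i++) { $rng->generate(); }

$restored = unserialize(serialize($rng));
var_dump($restored->generate() === $rng->generate()); // bool(true)
```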

The Random class cannot be cloned; attempting to do so always throws an Error (Error: Trying to clone an uncloneable object of class Random). This is because PHP's standard clone copies object members by reference, so the clone would share its internal RNG with the original, which is unintended behavior for most users. Instead, use the getNumberGenerator() method to retrieve the internal RNG instance, which can be cloned.

$foo = new Random();
 
// cannot be cloned directly
// $bar = clone $foo;
 
// safe
$bar = new Random(clone $foo->getNumberGenerator());
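
The shared-state behavior that motivates this restriction can be reproduced with ordinary classes. In the sketch below, CounterSketch and HolderSketch are hypothetical stand-ins for an RNG and for a wrapper that, unlike the proposed Random class, allows cloning.

```php
<?php
// Stand-in for an RNG with internal state.
final class CounterSketch
{
    public int $current = 0;
    public function generate(): int { return ++$this->current; }
}

// Stand-in for a cloneable wrapper that holds an RNG object.
final class HolderSketch
{
    public function __construct(public CounterSketch $rng) {}
}

$original = new HolderSketch(new CounterSketch());
$copy = clone $original; // shallow: both holders point at the same CounterSketch

var_dump($copy->rng === $original->rng); // bool(true)

$original->rng->generate();
var_dump($copy->rng->current); // int(1): the "copy" advanced too

// Cloning the generator itself gives an independent state.
$independent = new HolderSketch(clone $original->rng);
$original->rng->generate();
var_dump($independent->rng->current); // int(1)
var_dump($original->rng->current);    // int(2)
```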

Using this feature, the first example can be rewritten as follows:

echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1480009472
 
function foo(int $seed, callable $bar): int {
    $random = new Random(new Random\NumberGenerator\MT19937($seed));
    $result = ($random->nextInt() >> 1); // requires bit-shift for compatibility.
    $bar();
    $result += ($random->nextInt() >> 1); // requires bit-shift for compatibility.
    return $result;
}

Future Scope

This RFC will be the basis for making PHP's RNGs safe in the future.

Accepting this RFC first gives PHP random number generation whose state lives in local scope.

The Random class can also be used when new features that need random numbers are implemented. This discourages further implementations that depend on random number state in the global scope.

Further in the future, we can consider removing functions such as mt_srand(). These functions are simple and convenient, but they can lead to implementations that unintentionally depend on global scope.

Backward Incompatible Changes

The following class names will no longer be available:

  • “Random”
  • “Random\NumberGenerator”
  • “Random\NumberGenerator\XorShift128Plus”
  • “Random\NumberGenerator\MT19937”
  • “Random\NumberGenerator\Secure”

Proposed PHP Version(s)

8.1

FAQ

RFC Impact

To SAPIs

none

To Existing Extensions

none

To Opcache

none

New Constants

none

php.ini Defaults

none

Open Issues

none

Vote

Voting opens 2021-MM-DD and closes 2021-MM-DD at 00:00:00 EDT. A 2/3 majority is required to accept.

Add Random class
Real name Yes No
Final result: 0 0
This poll has been closed.

Patches and Tests

rfc/rng_extension.1625581108.txt.gz · Last modified: 2021/07/06 14:18 by zeriyoshi