rfc:object_scope_prng

This is an old revision of the document!


PHP RFC: Object scoped RNG Implementations.

Introduction

PHP currently implements an RNG based on the Mersenne Twister, which can be used with mt_srand() and mt_rand().

\mt_srand(1234); // Seeding internal MT state.
\mt_rand(); // Generate random number.

The shuffle(), str_shuffle(), and array_rand() functions are also available.

$array = [0 => 'foo', 1 => 'bar', 2 => 'baz'];
\shuffle($array); // Shuffling array using internal MT state.
\str_shuffle('foobar'); // Shuffling string using internal MT state.
\array_rand($array, 1); // Randomize array using internal MT state.

However, the current implementation of the Mersenne Twister keeps the state in the global scope, which can easily be unrepeatable with additional code.

\mt_srand(1234);
\str_shuffle('foobar'); // Result "afroob", always consistent.
 
\mt_srand(1234);
example_of_additional_code();
\str_shuffle('foobar'); // Result "rfoaob"
 
function example_of_additional_code() {
    \mt_rand(); // Unintended changes in internal MT state!
}

This is fine if you want to get a completely random result, but it is inconsistent with the existence of the mt_srand() function, which accepts arbitrary seed values. Reproducible results are necessary for implementations that need to be reproduced, such as game logic or test code.

On the other hand, if you want to get a true “completely” random result, the results of the shuffle(), str_shuffle(), and array_rand() functions are not cryptographically secure. This is due to the use of Mersenne Twister internally.

The first way to work around this problem is to implement RNG in PHP. In fact, such a library exists and is available.

However, the PHP implementation of RNG has execution speed issues. This is also the case in PHP 8.x when JIT is enabled.

PHP 8.1-dev

$ git clone "https://github.com/savvot/random"
$ time ./php -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }'
 
real    0m0.745s
user    0m0.744s
sys     0m0.000s

PHP 8.1-dev + OPcache JIT

$ git clone "https://github.com/savvot/random"
$ time ./php -dzend_extension="$(pwd)/../../modules/opcache.so" -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }'
 
real    0m0.083s
user    0m0.081s
sys     0m0.002s

RFC Implementation (based PHP 8.1-dev)

$ time ./php -r '$r = new \RNG\XorShift128Plus(1234); for ($i = 0; $i < 1000000; $i++) { rng_next($r); }'
 
real    0m0.021s
user    0m0.010s
sys     0m0.011s

RFC has been passed and Fiber will be introduced in PHP 8.1. Implementations with unpredictable execution order, such as Fiber, make this problem worse. For example (with amphp), the following results are difficult for the user to be aware of and are not guaranteed to be consistent in the future.

use function Amp\defer;
use function Amp\delay;
 
function task(int $i): void {
    global $running;
 
    while ($running) {
        delay(250);
        echo "${i} : " . mt_rand() . "\n";
    }
}
 
mt_srand(1234);
 
$running = true;
 
defer(fn() => task(1));
defer(fn() => task(2));
defer(fn() => task(3));
defer(fn() => task(4));
 
delay(1000);
$running = false;
 
/*
Result:
1 : 411284887
4 : 1068724585
2 : 1335968403
3 : 1756294682
1 : 940013158
3 : 1314500282
4 : 1686544716
2 : 1656482812
1 : 1674985287
2 : 1848274264
3 : 585388171
4 : 323490420
4 : 593702477
3 : 426315791
2 : 1722007381
1 : 1750549071
*/

In addition, problems with having state in the global scope have been reported in some extensions. For example, the Swoole extension notes that the use of mt_rand() requires reinitialization in the process after forking.

https://www.easyswoole.com/En/Other/random.html

Proposal

Implements class-based object scoped PRNG into PHP.

First, implement the following interface. This is a hypothetical PHP code.'

Basically, this interface are supposed to be used as arguments to one of the functions. In other words, the next() and next64() methods are not intended to be called directly. next64() returns an invalid value in a 32-bit environment.

namespace RNG;
 
interface RNGInterface
{
    /** Generates 32bit pseudo random number. */
    public function next(): int;
 
    /** Genrates 64bit pseudo random nunber */
    public function next64(): int;
}

The next step is to implement the RNG classes that implement this interface. This RFC proposes implementations of XorShift128+, MT19937 (similar to mt_srand()), and OS provided (similar to random_bytes()).

These implementations are done in C and are fast. However, by implementing RNGInterface, we can do the same thing in PHP code.

namespace RNG;
 
class XorShift128Plus implements RNGInterface // Fast modern PRNG.
{
    public function __construct(int $seed) {}
    public function next(): int {}
    public function next64(): int {}
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
}
 
class MT19937 implements RNGInterface // Completely consistent \mt_srand() and \mt_rand() implementation.
{
    public function __construct(int $seed) {}
    public function next(): int {}
    public function next64(): int {}
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
}
 
class OS implements RNGInterface // // Cryptographically Secure PRNG.
{
    public function next(): int {}
    public function next64(): int {}
}

Some RNG implementations can serialize state using the standard PHP serialization methods (serialize() and unserialize() function). This is useful for the purpose of storing state.

$rng = new RNG\XorShift128Plus(1234);
 
\rng_next($rng);
 
$serialized_string = \serialize($rng);
 
$rng2 = \unserialize($serialized_string);
 
\rng_next($rng) === rng_next($rng2); // true

For existing functions that use RNGs, make changes to accept these classes as arguments. If an argument is specified, the function will use that RNG instead of the internal MT.

function shuffle(array &$array, ?RNG\RNGInterface $rng = null): bool {}
 
function str_shuffle(string $string, ?RNG\RNGInterface $rng = null): string {}
 
function array_rand(array $array, int $num = 1, ?RNG\RNGInterface $rng = null): int|string|array {}

Finally, implement a function that can generate values from RNGs, such as the mt_rand(), random_bytes() function. If the $unsigned argument is false, it will return the value generated by the RNG as is.

function rng_bytes(RNG\RNGInterface $rng, int $length): string {} // simlar random_bytes()
 
function rng_int(RNG\RNGInterface $rng, int $min, int $max): int {} // simlar mt_rand() with optional arguments
 
function rng_next(RNG\RNGInterface $rng, bool $unsigned = true): int {} // simlar mt_rand() without optional arguments
 
/** @throws ValueError */
function rng_next64(RNG\RNGInterface $rng, bool $unsigned = true): int {} // Generates 64bit random number

The rng_next64() function will throw a ValueError exception if 64-bit integer is not available on the platform.

This is for proper handling of next64() in a 32-bit environment and to improve interoperability with the mt_rand() function. The mt_rand() function performs a bit shift on the result and always returns an unsigned integer.

// If the second argument is true (default), rng_next() performs bit shifting like mt_rand() and always returns an unsigned integer.
\mt_srand(1234);
$mt_state = new \RNG\MT19937(1234);
\mt_rand() === \rng_next($mt_state); // true
 
// If false, it will return the value as is. This is exactly the same result as $rng->next().
\mt_rand() === \rng_next($mt_state, false); // false
 
// This is useful if you want to use the numbers generated by the RNG directly.
\rng_next(new \RNG\MT19937(1234), false) === 822569775; // true

Backward Incompatible Changes

A new optional argument will be added to the existing arguments below. If you are using reflection for these functions, this may cause problems.

  • shuffle()
  • str_shuffle()
  • array_rand()

Proposed PHP Version(s)

8.1

RFC Impact

To SAPIs

none

To Existing Extensions

none

To Opcache

none

New Constants

none

php.ini Defaults

none

Open Issues

In other languages, what methods do you use to get around this problem?

Unaffected PHP Functionality

It does not affect any related existing functions.

  • mt_srand()
  • mt_rand()

Vote

Voting opens 2021-04-01 and 2021-04-15 at 00:00:00 EDT. 2/3 required to accept.

Add object-scoped RNG
Real name Yes No
alec (alec)  
brzuchal (brzuchal)  
crell (crell)  
dharman (dharman)  
galvao (galvao)  
jasny (jasny)  
kalle (kalle)  
kelunik (kelunik)  
kguest (kguest)  
krakjoe (krakjoe)  
levim (levim)  
marandall (marandall)  
mariano (mariano)  
ocramius (ocramius)  
patrickallaert (patrickallaert)  
ramsey (ramsey)  
reywob (reywob)  
santiagolizardo (santiagolizardo)  
seld (seld)  
sergey (sergey)  
svpernova09 (svpernova09)  
trowski (trowski)  
Count: 3 19

Patches and Tests

rfc/object_scope_prng.1617287385.txt.gz · Last modified: 2021/04/01 14:29 by zeriyoshi