PHP RFC: Object scoped RNG Implementations.
- Version: 2.1 (Implementation: 1.3)
- Date: 2020-12-20
- Author: Go Kudo zeriyoshi@gmail.com
- Status: Declined
- Implementation: https://github.com/php/php-src/pull/6568
- First Published at: https://wiki.php.net/rfc/object_scope_prng
Introduction
PHP currently implements an RNG based on the Mersenne Twister, which can be used with mt_srand() and mt_rand().
\mt_srand(1234); // Seeding internal MT state. \mt_rand(); // Generate random number.
The shuffle(), str_shuffle(), and array_rand() functions are also available.
$array = [0 => 'foo', 1 => 'bar', 2 => 'baz']; \shuffle($array); // Shuffling array using internal MT state. \str_shuffle('foobar'); // Shuffling string using internal MT state. \array_rand($array, 1); // Randomize array using internal MT state.
However, the current implementation of the Mersenne Twister keeps the state in the global scope, which can easily be unrepeatable with additional code.
\mt_srand(1234); \str_shuffle('foobar'); // Result "afroob", always consistent. \mt_srand(1234); example_of_additional_code(); \str_shuffle('foobar'); // Result "rfoaob" function example_of_additional_code() { \mt_rand(); // Unintended changes in internal MT state! }
This is fine if you want to get a completely random result, but it is inconsistent with the existence of the mt_srand() function, which accepts arbitrary seed values. Reproducible results are necessary for implementations that need to be reproduced, such as game logic or test code.
On the other hand, if you want to get a true “completely” random result, the results of the shuffle(), str_shuffle(), and array_rand() functions are not cryptographically secure. This is due to the use of Mersenne Twister internally.
The first way to work around this problem is to implement RNG in PHP. In fact, such a library exists and is available.
However, the PHP implementation of RNG has execution speed issues. This is also the case in PHP 8.x when JIT is enabled.
PHP 8.1-dev
$ git clone "https://github.com/savvot/random" $ time ./php -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }' real 0m0.745s user 0m0.744s sys 0m0.000s
PHP 8.1-dev + OPcache JIT
$ git clone "https://github.com/savvot/random" $ time ./php -dzend_extension="$(pwd)/../../modules/opcache.so" -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }' real 0m0.083s user 0m0.081s sys 0m0.002s
RFC Implementation (based PHP 8.1-dev)
$ time ./php -r '$r = new \RNG\XorShift128Plus(1234); for ($i = 0; $i < 1000000; $i++) { rng_next($r); }' real 0m0.021s user 0m0.010s sys 0m0.011s
RFC has been passed and Fiber will be introduced in PHP 8.1. Implementations with unpredictable execution order, such as Fiber, make this problem worse. For example (with amphp), the following results are difficult for the user to be aware of and are not guaranteed to be consistent in the future.
use function Amp\defer; use function Amp\delay; function task(int $i): void { global $running; while ($running) { delay(250); echo "${i} : " . mt_rand() . "\n"; } } mt_srand(1234); $running = true; defer(fn() => task(1)); defer(fn() => task(2)); defer(fn() => task(3)); defer(fn() => task(4)); delay(1000); $running = false; /* Result: 1 : 411284887 4 : 1068724585 2 : 1335968403 3 : 1756294682 1 : 940013158 3 : 1314500282 4 : 1686544716 2 : 1656482812 1 : 1674985287 2 : 1848274264 3 : 585388171 4 : 323490420 4 : 593702477 3 : 426315791 2 : 1722007381 1 : 1750549071 */
In addition, problems with having state in the global scope have been reported in some extensions. For example, the Swoole extension notes that the use of mt_rand() requires reinitialization in the process after forking.
Proposal
Implements class-based object scoped PRNG into PHP.
First, implement the following interface. This is a hypothetical PHP code.'
Basically, this interface are supposed to be used as arguments to one of the functions. In other words, the next() and next64() methods are not intended to be called directly. next64() returns an invalid value in a 32-bit environment.
namespace RNG; interface RNGInterface { /** Generates 32bit pseudo random number. */ public function next(): int; /** Genrates 64bit pseudo random nunber */ public function next64(): int; }
The next step is to implement the RNG classes that implement this interface. This RFC proposes implementations of XorShift128+, MT19937 (similar to mt_srand()), and OS provided (similar to random_bytes()).
These implementations are done in C and are fast. However, by implementing RNGInterface, we can do the same thing in PHP code.
namespace RNG; class XorShift128Plus implements RNGInterface // Fast modern PRNG. { public function __construct(int $seed) {} public function next(): int {} public function next64(): int {} public function __serialize(): array {} public function __unserialize(array $data): void {} } class MT19937 implements RNGInterface // Completely consistent \mt_srand() and \mt_rand() implementation. { public function __construct(int $seed) {} public function next(): int {} public function next64(): int {} public function __serialize(): array {} public function __unserialize(array $data): void {} } class OS implements RNGInterface // // Cryptographically Secure PRNG. { public function next(): int {} public function next64(): int {} }
Some RNG implementations can serialize state using the standard PHP serialization methods (serialize() and unserialize() function). This is useful for the purpose of storing state.
$rng = new RNG\XorShift128Plus(1234); \rng_next($rng); $serialized_string = \serialize($rng); $rng2 = \unserialize($serialized_string); \rng_next($rng) === rng_next($rng2); // true
For existing functions that use RNGs, make changes to accept these classes as arguments. If an argument is specified, the function will use that RNG instead of the internal MT.
function shuffle(array &$array, ?RNG\RNGInterface $rng = null): bool {} function str_shuffle(string $string, ?RNG\RNGInterface $rng = null): string {} function array_rand(array $array, int $num = 1, ?RNG\RNGInterface $rng = null): int|string|array {}
Finally, implement a function that can generate values from RNGs, such as the mt_rand(), random_bytes() function. If the $unsigned argument is false, it will return the value generated by the RNG as is.
function rng_bytes(RNG\RNGInterface $rng, int $length): string {} // simlar random_bytes() function rng_int(RNG\RNGInterface $rng, int $min, int $max): int {} // simlar mt_rand() with optional arguments function rng_next(RNG\RNGInterface $rng, bool $unsigned = true): int {} // simlar mt_rand() without optional arguments /** @throws ValueError */ function rng_next64(RNG\RNGInterface $rng, bool $unsigned = true): int {} // Generates 64bit random number
The rng_next64() function will throw a ValueError exception if 64-bit integer is not available on the platform.
This is for proper handling of next64() in a 32-bit environment and to improve interoperability with the mt_rand() function. The mt_rand() function performs a bit shift on the result and always returns an unsigned integer.
// If the second argument is true (default), rng_next() performs bit shifting like mt_rand() and always returns an unsigned integer. \mt_srand(1234); $mt_state = new \RNG\MT19937(1234); \mt_rand() === \rng_next($mt_state); // true // If false, it will return the value as is. This is exactly the same result as $rng->next(). \mt_rand() === \rng_next($mt_state, false); // false // This is useful if you want to use the numbers generated by the RNG directly. \rng_next(new \RNG\MT19937(1234), false) === 822569775; // true
Backward Incompatible Changes
A new optional argument will be added to the existing arguments below. If you are using reflection for these functions, this may cause problems.
- shuffle()
- str_shuffle()
- array_rand()
Proposed PHP Version(s)
8.1
RFC Impact
To SAPIs
none
To Existing Extensions
none
To Opcache
none
New Constants
none
php.ini Defaults
none
Open Issues
In other languages, what methods do you use to get around this problem?
Similar to this proposal, a class-based implementation is in place.
Unaffected PHP Functionality
It does not affect any related existing functions.
- mt_srand()
- mt_rand()
Vote
Voting opens 2021-04-01 and 2021-04-15 at 00:00:00 EDT. 2/3 required to accept.