rfc:object_scope_prng
Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision Next revision Both sides next revision | ||
rfc:object_scope_prng [2021/01/17 09:07] zeriyoshi remove unusable open issues |
rfc:object_scope_prng [2021/04/01 06:27] zeriyoshi fix vote |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Object scoped RNG Implementations. ====== | ====== PHP RFC: Object scoped RNG Implementations. ====== | ||
- | * Version: | + | * Version: |
- | * Date: 2021-01-17 | + | * Date: 2020-12-20 |
* Author: Go Kudo < | * Author: Go Kudo < | ||
- | * Status: | + | * Status: |
* Implementation: | * Implementation: | ||
* First Published at: https:// | * First Published at: https:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
+ | PHP currently implements an RNG based on the Mersenne Twister, which can be used with mt_srand() and mt_rand(). | ||
- | PHP currently provides the mt_srand() | + | <code php> |
- | However, since these functions keep their state in global space, unintended function calls may cause inconsistency even for the same seed value. | + | \mt_srand(1234); // Seeding internal MT state. |
+ | \mt_rand(); // Generate random number. | ||
+ | </ | ||
+ | |||
+ | The shuffle(), str_shuffle(), and array_rand() functions are also available. | ||
<code php> | <code php> | ||
- | mt_srand(1234); | + | $array = [0 => ' |
- | foo(); | + | \shuffle($array); |
- | mt_rand() === 411284887; // false | + | \str_shuffle(' |
+ | \array_rand($array, | ||
+ | </ | ||
+ | |||
+ | However, the current implementation of the Mersenne Twister keeps the state in the global scope, which can easily be unrepeatable with additional code. | ||
+ | |||
+ | <code php> | ||
+ | \mt_srand(1234); | ||
+ | \str_shuffle(' | ||
+ | |||
+ | \mt_srand(1234); | ||
+ | example_of_additional_code(); | ||
+ | \str_shuffle(' | ||
- | function | + | function |
- | mt_rand(); // code added | + | |
} | } | ||
</ | </ | ||
- | This is inappropriate | + | This is fine if you want to get a completely random result, but it is inconsistent with the existence of the mt_srand() function, which accepts arbitrary seed values. Reproducible results are necessary |
- | These global states also affect the forked child processes. In fact, the EasySwoole Swoole extension based framework has been alerted to this behavior. [5] | + | |
- | In other languages, RNGs are implemented as objects, so this problem doesn' | + | On the other hand, if you want to get a true " |
- | The global state of MT is also used by other functions that use random numbers, and this problem | + | The first way to work around this problem |
+ | |||
+ | However, | ||
+ | |||
+ | **PHP 8.1-dev** | ||
+ | <code shell> | ||
+ | $ git clone " | ||
+ | $ time ./php -r ' | ||
+ | |||
+ | real 0m0.745s | ||
+ | user 0m0.744s | ||
+ | sys | ||
+ | </ | ||
+ | |||
+ | **PHP 8.1-dev + OPcache JIT** | ||
+ | <code shell> | ||
+ | $ git clone " | ||
+ | $ time ./php -dzend_extension=" | ||
+ | |||
+ | real 0m0.083s | ||
+ | user 0m0.081s | ||
+ | sys | ||
+ | </ | ||
+ | |||
+ | **RFC Implementation (based PHP 8.1-dev)** | ||
+ | <code shell> | ||
+ | $ time ./php -r '$r = new \RNG\XorShift128Plus(1234); | ||
+ | |||
+ | real 0m0.021s | ||
+ | user 0m0.010s | ||
+ | sys | ||
+ | </ | ||
+ | |||
+ | RFC has been passed and Fiber will be introduced in PHP 8.1. Implementations with unpredictable execution order, such as Fiber, make this problem worse. | ||
+ | For example (with amphp), the following results are difficult for the user to be aware of and are not guaranteed to be consistent in the future. | ||
<code php> | <code php> | ||
+ | use function Amp\defer; | ||
+ | use function Amp\delay; | ||
+ | |||
+ | function task(int $i): void { | ||
+ | global $running; | ||
+ | |||
+ | while ($running) { | ||
+ | delay(250); | ||
+ | echo "${i} : " . mt_rand() . " | ||
+ | } | ||
+ | } | ||
+ | |||
mt_srand(1234); | mt_srand(1234); | ||
- | $arr = [1, 2, 3, 4, 5]; | + | |
- | shuffle($arr); | + | $running |
- | echo print_r($arr, true); // This result is always consistent. | + | |
+ | defer(fn() => task(1)); | ||
+ | defer(fn() => task(2)); | ||
+ | defer(fn() => task(3)); | ||
+ | defer(fn() => task(4)); | ||
+ | |||
+ | delay(1000); | ||
+ | $running = false; | ||
/* | /* | ||
- | Array | + | Result: |
- | ( | + | 1 : 411284887 |
- | [0] => 3 | + | 4 : 1068724585 |
- | [1] => 2 | + | 2 : 1335968403 |
- | [2] => 5 | + | 3 : 1756294682 |
- | [3] => 4 | + | 1 : 940013158 |
- | [4] => 1 | + | 3 : 1314500282 |
- | ) | + | 4 : 1686544716 |
+ | 2 : 1656482812 | ||
+ | 1 : 1674985287 | ||
+ | 2 : 1848274264 | ||
+ | 3 : 585388171 | ||
+ | 4 : 323490420 | ||
+ | 4 : 593702477 | ||
+ | 3 : 426315791 | ||
+ | 2 : 1722007381 | ||
+ | 1 : 1750549071 | ||
*/ | */ | ||
</ | </ | ||
- | Therefore, it may be a good idea to consider deprecating global | + | In addition, problems with having |
- | One currently possible userland solution is to implement the PRNG in pure PHP. There is actually a userland library [1], but it is not fast enough for PHP at the moment, including with JIT (Benchmark results are available at Open Issues.) | + | https://www.easyswoole.com/ |
- | + | ||
- | This implementation will also allow us to support the PHP paradigm, which will become even more complex in the future. | + | |
- | + | ||
- | I have created an extension for PHP to improve these [2]. This could be used to consider what specific goals this RFC is trying to achieve. | + | |
===== Proposal ===== | ===== Proposal ===== | ||
+ | Implements class-based object scoped PRNG into PHP. | ||
- | Implements an object-scoped PRNG in PHP, providing methods equivalent to functions that use RNG results. | + | First, |
- | + | ||
- | First, | + | |
- | + | ||
- | RNG64Interface has a next64() method that returns a 64-bit wide result, but this is not necessary in a 32-bit environment. | + | |
- | In a 64-bit environment, | + | |
- | The next64() | + | Basically, this interface are supposed to be used as arguments to one of the functions. In other words, the next() and next64() |
<code php> | <code php> | ||
Line 71: | Line 141: | ||
interface RNGInterface | interface RNGInterface | ||
{ | { | ||
+ | /** Generates 32bit pseudo random number. */ | ||
public function next(): int; | public function next(): int; | ||
- | } | ||
- | interface RNG64Interface extends RNGInterface | + | |
- | { | + | |
- | | + | |
public function next64(): int; | public function next64(): int; | ||
} | } | ||
</ | </ | ||
- | Next, make some changes | + | The next step is to implement |
+ | |||
+ | These implementations are done in C and are fast. However, by implementing RNGInterface, | ||
<code php> | <code php> | ||
- | function shuffle(array & | + | namespace RNG; |
- | function str_shuffle(string $string, ?RNGInterface | + | |
- | function array_rand(array $array, int $num = 1, ?RNGInterface | + | class XorShift128Plus implements |
- | /** Generates a random number for the specified range using RNG. */ | + | class MT19937 implements |
- | function rng_range(RNGInterface $rng, int $min, int $max): int [} | + | class OS implements |
- | /** Generates a sequence of bytes for a specified range using RNG. */ | + | |
- | function rng_bytes(RNGInterface | + | |
</ | </ | ||
- | Existing | + | Some RNG implementations can serialize state using the standard PHP serialization methods (serialize() and unserialize() function). This is useful for the purpose of storing state. |
- | Finally, we provide an RNG class that implements | + | <code php> |
- | At this time, considering support for XorShift128+ , MT19937 and OSRNG. | + | $rng = new RNG\XorShift128Plus(1234); |
+ | |||
+ | \rng_next($rng); | ||
+ | |||
+ | $serialized_string = \serialize($rng); | ||
+ | |||
+ | $rng2 = \unserialize($serialized_string); | ||
+ | |||
+ | \rng_next($rng) === rng_next($rng2); | ||
+ | </ | ||
+ | |||
+ | For existing functions | ||
<code php> | <code php> | ||
- | namespace | + | function shuffle(array & |
- | final class XorShift128Plus implements | + | function str_shuffle(string $string, ?RNG\RNGInterface |
- | final class MT19937 implements RNGInterface {} // use MT19937 algorithm | + | |
- | final class OSRNG implements | + | function array_rand(array $array, int $num = 1, ?RNG\RNGInterface |
</ | </ | ||
- | ===== Backward Incompatible Changes ===== | + | Finally, implement a function that can generate values from RNGs, such as the mt_rand(), random_bytes() function. If the $unsigned argument is false, it will return the value generated by the RNG as is. |
- | With the provides of new classes, some class names (or namespaces) will no longer be available in userland. | + | <code php> |
+ | function rng_bytes(RNG\RNGInterface $rng, int $length): string {} // simlar random_bytes() | ||
- | Some of the functions that use RNGs will have additional optional | + | function rng_int(RNG\RNGInterface $rng, int $min, int $max): int {} // simlar mt_rand() with optional arguments |
+ | |||
+ | function rng_next(RNG\RNGInterface $rng, bool $unsigned = true): int {} // simlar mt_rand() without optional arguments | ||
+ | |||
+ | /** @throws ValueError */ | ||
+ | function rng_next64(RNG\RNGInterface $rng, bool $unsigned = true): int {} // Generates 64bit random number | ||
+ | </ | ||
+ | |||
+ | The rng_next64() function will throw a ValueError exception if 64-bit integer is not available on the platform. | ||
+ | |||
+ | This is for proper handling | ||
+ | |||
+ | <code php> | ||
+ | // If the second argument is true (default), rng_next() performs bit shifting like mt_rand() and always returns an unsigned integer. | ||
+ | \mt_srand(1234); | ||
+ | $mt_state = new \RNG\MT19937(1234); | ||
+ | \mt_rand() === \rng_next($mt_state); | ||
+ | |||
+ | // If false, it will return the value as is. This is exactly the same result as $rng-> | ||
+ | \mt_rand() === \rng_next($mt_state, | ||
+ | |||
+ | // This is useful if you want to use the numbers generated by the RNG directly. | ||
+ | \rng_next(new \RNG\MT19937(1234), | ||
+ | </ | ||
+ | |||
+ | ===== Backward Incompatible Changes ===== | ||
+ | A new optional argument will be added to the existing | ||
* shuffle() | * shuffle() | ||
Line 125: | Line 231: | ||
==== To Existing Extensions ==== | ==== To Existing Extensions ==== | ||
- | orng [[https:// | + | none |
==== To Opcache ==== | ==== To Opcache ==== | ||
Line 137: | Line 243: | ||
===== Open Issues ===== | ===== Open Issues ===== | ||
+ | === In other languages, what methods do you use to get around this problem? === | ||
+ | Similar to this proposal, a class-based implementation is in place. | ||
- | === With JIT, won't the userland implementation reach a useful speed? === | + | * C# https:// |
- | Comparing the speed of the userland implementation of XorShift128+ and the orng extension. | + | * Java https:// |
- | **PHP 8.0** | + | ===== Unaffected |
- | <code shell> | + | It does not affect any related existing functions. |
- | $ time php -r ' | + | |
- | + | ||
- | real 0m0.441s | + | |
- | user 0m0.429s | + | |
- | sys 0m0.010s | + | |
- | </ | + | |
- | **PHP 8.0 + OPcache JIT** | ||
- | <code shell> | ||
- | $ time php -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r ' | ||
- | |||
- | real 0m0.155s | ||
- | user 0m0.139s | ||
- | sys 0m0.015s | ||
- | </ | ||
- | |||
- | **PHP 8.0 + orng** | ||
- | <code shell> | ||
- | $ time php -r '$r = new \ORNG\XorShift128Plus(1234); | ||
- | |||
- | real 0m0.056s | ||
- | user 0m0.048s | ||
- | sys 0m0.008s | ||
- | </ | ||
- | |||
- | This provides a significant improvement, | ||
- | |||
- | === Why do we need this feature in the core and not in the extension? === | ||
- | In order to use the features related to pseudo-random numbers that PHP currently provides, an understanding of the core is required. If this proposal is implemented, | ||
- | |||
- | ===== Unaffected PHP Functionality ===== | ||
- | It does not affect any related existing functions. (However, in the case of Type II, non-destructive arguments will be added.) | ||
* mt_srand() | * mt_srand() | ||
* mt_rand() | * mt_rand() | ||
- | * shuffle() | ||
- | * str_shuffle() | ||
- | * array_rand() | ||
- | ===== Proposed Voting Choices | + | ===== Vote ===== |
- | Yes/No, requiring | + | Voting opens 2021-04-01 and 2021-04-15 at 00:00:00 EDT. 2/3 required to accept. |
- | There are a few additional options for implementation. | + | <doodle title=" |
- | + | | |
- | <doodle title=" | + | |
- | | + | |
- | | + | |
- | </ | + | |
- | + | ||
- | <doodle title=" | + | |
- | * Top Level (\RNGInterface) | + | |
- | * " | + | |
- | * " | + | |
</ | </ | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
- | * https:// | + | * https:// |
- | * https:// | + | |
- | + | ||
- | ===== References ===== | + | |
- | * [1] https:// | + | |
- | * [2] https:// | + | |
- | * [3] https:// | + | |
- | * [4] https:// | + | |
- | * [5] https:// | + |
rfc/object_scope_prng.txt · Last modified: 2021/04/14 14:38 by zeriyoshi