rfc:object_scope_prng

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
rfc:object_scope_prng [2021/01/10 20:36]
zeriyoshi fix rng_between -> rng_range
rfc:object_scope_prng [2021/04/14 14:38]
zeriyoshi status update
Line 1: Line 1:
 ====== PHP RFC: Object scoped RNG Implementations. ====== ====== PHP RFC: Object scoped RNG Implementations. ======
-  * Version: 1.04 +  * Version: 2.1 (Implementation: 1.3) 
-  * Date: 2021-01-11+  * Date: 2020-12-20
   * Author: Go Kudo <zeriyoshi@gmail.com>   * Author: Go Kudo <zeriyoshi@gmail.com>
-  * Status: Under Discussion +  * Status: Declined 
-  * Implementation +  * Implementation: https://github.com/php/php-src/pull/6568
-      * Type I: https://github.com/php/php-src/pull/6568 (outdated) +
-      * Type II: https://github.com/zeriyoshi/php-src/tree/object_scope_rng_type2+
   * First Published at: https://wiki.php.net/rfc/object_scope_prng   * First Published at: https://wiki.php.net/rfc/object_scope_prng
  
 ===== Introduction ===== ===== Introduction =====
 +PHP currently implements an RNG based on the Mersenne Twister, which can be used with mt_srand() and mt_rand().
  
-PHP currently provides the mt_srand() and mt_rand() functions based on the Meresenne Twister as PRNGs+<code php> 
-Howeversince these functions keep their state in global spaceunintended function calls may cause inconsistency even for the same seed value.+\mt_srand(1234); // Seeding internal MT state. 
 +\mt_rand(); // Generate random number
 +</code> 
 + 
 +The shuffle()str_shuffle()and array_rand() functions are also available.
  
 <code php> <code php>
-mt_srand(1234); +$array = [0 => 'foo', 1 => 'bar', 2 => 'baz']; 
-foo(); +\shuffle($array); // Shuffling array using internal MT state. 
-mt_rand() === 411284887; // false+\str_shuffle('foobar'); // Shuffling string using internal MT state. 
 +\array_rand($array, 1); // Randomize array using internal MT state. 
 +</code>
  
-function foo() { +However, the current implementation of the Mersenne Twister keeps the state in the global scope, which can easily be unrepeatable with additional code. 
-    mt_rand(); // code added+ 
 +<code php> 
 +\mt_srand(1234); 
 +\str_shuffle('foobar'); // Result "afroob", always consistent. 
 + 
 +\mt_srand(1234); 
 +example_of_additional_code(); 
 +\str_shuffle('foobar'); // Result "rfoaob" 
 + 
 +function example_of_additional_code() { 
 +    \mt_rand(); // Unintended changes in internal MT state!
 } }
 </code> </code>
  
-This is inappropriate for applications that require consistency in the generated values (game logictest code, etc.)+This is fine if you want to get a completely random result, but it is inconsistent with the existence of the mt_srand() function, which accepts arbitrary seed values. Reproducible results are necessary for implementations that need to be reproduced, such as game logic or test code. 
-These global states also affect the forked child processes. In fact, the EasySwoole Swoole extension based framework has been alerted to this behavior[5]+ 
 +On the other hand, if you want to get a true "completely" random result, the results of the shuffle(), str_shuffle(), and array_rand() functions are not cryptographically secure. This is due to the use of Mersenne Twister internally.
  
-In other languages, RNGs are implemented as objects, so this problem doesn'exists. [3] [4]+The first way to work around this problem is to implement RNG in PHP. In fact, such a library exists and is available.
  
-The global state of MT is also used by other functions that use random numbers, and this problem is further exacerbated when the consistency of the results is required by seeding with specific values. (Of courseyou are right that such usage is a bad example.)+However, the PHP implementation of RNG has execution speed issues. This is also the case in PHP 8.x when JIT is enabled. 
 + 
 +**PHP 8.1-dev** 
 +<code shell> 
 +$ git clone "https://github.com/savvot/random
 +$ time ./php -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }' 
 + 
 +real    0m0.745s 
 +user    0m0.744s 
 +sys     0m0.000s 
 +</code> 
 + 
 +**PHP 8.1-dev + OPcache JIT** 
 +<code shell> 
 +$ git clone "https://github.com/savvot/random" 
 +$ time ./php -dzend_extension="$(pwd)/../../modules/opcache.so" -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r 'require __DIR__ . "/random/src/AbstractRand.php"; require __DIR__ . "/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }' 
 + 
 +real    0m0.083s 
 +user    0m0.081s 
 +sys     0m0.002s 
 +</code> 
 + 
 +**RFC Implementation (based PHP 8.1-dev)** 
 +<code shell> 
 +$ time ./php -r '$r = new \RNG\XorShift128Plus(1234); for ($i = 0; $i < 1000000; $i++) { rng_next($r); }' 
 + 
 +real    0m0.021s 
 +user    0m0.010s 
 +sys     0m0.011s 
 +</code> 
 + 
 +RFC has been passed and Fiber will be introduced in PHP 8.1. Implementations with unpredictable execution order, such as Fiber, make this problem worse. 
 +For example (with amphp)the following results are difficult for the user to be aware of and are not guaranteed to be consistent in the future.
  
 <code php> <code php>
 +use function Amp\defer;
 +use function Amp\delay;
 +
 +function task(int $i): void {
 +    global $running;
 +
 +    while ($running) {
 +        delay(250);
 +        echo "${i} : " . mt_rand() . "\n";
 +    }
 +}
 +
 mt_srand(1234); mt_srand(1234);
-$arr [123, 4, 5]+ 
-shuffle($arr); +$running true; 
-echo print_r($arr, true); // This result is always consistent.+ 
 +defer(fn() => task(1)); 
 +defer(fn() => task(2)); 
 +defer(fn() => task(3))
 +defer(fn() => task(4)); 
 + 
 +delay(1000); 
 +$running = false;
  
 /* /*
-Array +Result: 
-( +1 : 411284887 
-    [0] => +4 : 1068724585 
-    [1] => +2 : 1335968403 
-    [2] => 5 +: 1756294682 
-    [3] => +: 940013158 
-    [4] => 1 +3 : 1314500282 
-)+4 : 1686544716 
 +: 1656482812 
 +1 : 1674985287 
 +: 1848274264 
 +: 585388171 
 +: 323490420 
 +: 593702477 
 +3 : 426315791 
 +2 : 1722007381 
 +1 : 1750549071
 */ */
 </code> </code>
  
-Thereforeit may be a good idea to consider deprecating global state-dependent RNG functions in the future.+In additionproblems with having state in the global scope have been reported in some extensions. For example, the Swoole extension notes that the use of mt_rand() requires reinitialization in the process after forking.
  
-One currently possible userland solution is to implement the PRNG in pure PHPThere is actually a userland library [1], but it is not fast enough for PHP at the moment, including with JIT (Benchmark results are available at Open Issues.+https://www.easyswoole.com/En/Other/random.html
- +
-This implementation will also allow us to support the PHP paradigm, which will become even more complex in the future. +
- +
-I have created an extension for PHP to improve these [2]. This could be used to consider what specific goals this RFC is trying to achieve.+
  
 ===== Proposal ===== ===== Proposal =====
 +Implements class-based object scoped PRNG into PHP.
  
-Implements an object-scoped PRNG in PHPproviding methods equivalent to functions that use RNG results. +Firstimplement the following interface. This is a hypothetical PHP code.'
- +
-Two methods have been proposed in the discussion. +
- +
-=== Type I ===+
  
-Firstit provides the following interface.+Basicallythis interface are supposed to be used as arguments to one of the functions. In other words, the next() and next64() methods are not intended to be called directly. next64() returns an invalid value in a 32-bit environment.
  
 <code php> <code php>
Line 72: Line 141:
 interface RNGInterface interface RNGInterface
 { {
 +    /** Generates 32bit pseudo random number. */
     public function next(): int;     public function next(): int;
-    /** @throws ValueError */+ 
 +    /** Genrates 64bit pseudo random nunber */
     public function next64(): int;     public function next64(): int;
 } }
 </code> </code>
  
-The next64() method throws a ValueError exception when running on a non-64bit architecture or not supported RNG class.+The next step is to implement the RNG classes that implement this interface. This RFC proposes implementations of XorShift128+, MT19937 (similar to mt_srand()), and OS provided (similar to random_bytes()).
  
-Nextdefine an interface that provides methods equivalent to PHP functions that use the output of the RNG.+These implementations are done in C and are fast. Howeverby implementing RNGInterface, we can do the same thing in PHP code.
  
 <code php> <code php>
 namespace RNG; namespace RNG;
  
-interface RandomInterface extends RNGInterface+class XorShift128Plus implements RNGInterface // Fast modern PRNG.
 { {
-    public function arrayShuffle(array &$array): bool; // substitute array_shuffle() function. +    public function __construct(int $seed{} 
-    public function stringShuffle(string $string): string; // substitute str_shuffle() function. +    public function next(): int {} 
-    public function arrayRandom(array $array, int $num = 1): int|string|array; // substitute array_rand() function.+    public function next64(): int {} 
 +    public function __serialize(): array {} 
 +    public function __unserialize(array $data): void {}
 } }
-</code> 
  
-These methods are safe for the internal state of the RNG class.+class MT19937 implements RNGInterface // Completely consistent \mt_srand() and \mt_rand() implementation. 
 +
 +    public function __construct(int $seed) {} 
 +    public function next(): int {} 
 +    public function next64(): int {} 
 +    public function __serialize(): array {} 
 +    public function __unserialize(array $data): void {} 
 +}
  
-Finally, we provide an RNG class that implements these interfaces+class OS implements RNGInterface // // Cryptographically Secure PRNG
-At this time, considering support for XorShift128+ , MT19937 and OSRandom.+
 +    public function next(): int {} 
 +    public function next64(): int {} 
 +
 +</code>
  
-<code php> +Some RNG implementations can serialize state using the standard PHP serialization methods (serialize() and unserialize() function). This is useful for the purpose of storing state.
-namespace RNG;+
  
-class XorShift128Plus implements RandomInterface {} // use XorShift128+ algorithm +<code php> 
-class MT19937 implements RandomInterface {} // use MT19937 algorithm +$rng = new RNG\XorShift128Plus(1234); 
-class OSRNG implements RandomInterface {} // use php_random_bytes() internal API+  
 +\rng_next($rng)
 +  
 +$serialized_string = \serialize($rng); 
 +  
 +$rng2 = \unserialize($serialized_string); 
 +  
 +\rng_next($rng) === rng_next($rng2); // true
 </code> </code>
  
-This implementation is superior to the Type II implementation in the following ways+For existing functions that use RNGs, make changes to accept these classes as arguments. If an argument is specified, the function will use that RNG instead of the internal MT.
  
-  * Userland can provide an arbitrary RNGInterface / RandomInterface implementation. +<code php> 
-  * Providing methods based on inheritance is user-friendly. +function shuffle(array &$array, ?RNG\RNGInterface $rng = null): bool {}
-  * The implementation will be simple.+
  
-Howeverit is inferior in the following areas+function str_shuffle(string $string?RNG\RNGInterface $rng = null): string {}
  
-  * Implementation will be redundant. +function array_rand(array $array, int $num = 1, ?RNG\RNGInterface $rng = null): int|string|array {} 
-  * Inheritance patterns are not modern. +</code>
-  * Unable to recognize incorrect usage of existing functions.+
  
-=== Type II === +Finally, implement a function that can generate values from RNGs, such as the mt_rand(), random_bytes() function. If the $unsigned argument is false, it will return the value generated by the RNG as is.
- +
-First, it provides the following interface.+
  
 <code php> <code php>
-namespace RNG;+function rng_bytes(RNG\RNGInterface $rng, int $length): string {} // simlar random_bytes()
  
-interface RNGInterface +function rng_int(RNG\RNGInterface $rng, int $min, int $max): int {} // simlar mt_rand() with optional arguments
-+
-    public function next(): int+
-    /** @throws ValueError */ +
-    public function next64(): int; +
-+
-</code>+
  
-The next64() method throws a ValueError exception when running on a non-64bit architecture or not supported RNG class.+function rng_next(RNG\RNGInterface $rng, bool $unsigned = true): int {} // simlar mt_rand() without optional arguments
  
-Next, make some changes to the existing functions and add some new ones. +/** @throws ValueError */ 
- +function rng_next64(RNG\RNGInterface $rng, bool $unsigned = true): int {} /Generates 64bit random number
-<code php> +
-function shuffle(array &$array, ?RNGInterface $rng = null): bool {} +
-function str_shuffle(string $string, ?RNGInterface $rng = null): string {} +
-function array_rand(array $array, int $num = 1, ?RNGInterface $rng = null): int|string|array {} +
-/** Generates a random number for the specified range using RNG. */ +
-function rn_range(RNGInterface $rng, int $min, int $max): int [} +
-/** Generates a sequence of bytes for a specified range using RNG. */ +
-function rng_bytes(RNGInterface $rng, int $length): string {}+
 </code> </code>
  
-Existing RNG-specifying functions will now be able to explicitly specify the source of the RNG. This makes it clear that these functions are using the RNG internally.+The rng_next64() function will throw a ValueError exception if 64-bit integer is not available on the platform.
  
-Finally, we provide an RNG class that implements these interfaces. +This is for proper handling of next64() in a 32-bit environment and to improve interoperability with the mt_rand() function. The mt_rand() function performs a bit shift on the result and always returns an unsigned integer.
-At this time, considering support for XorShift128+ , MT19937 and OSRandom.+
  
 <code php> <code php>
-namespace RNG; +// If the second argument is true (default), rng_next() performs bit shifting like mt_rand() and always returns an unsigned integer. 
- +\mt_srand(1234); 
-final class XorShift128Plus implements RNGInterface {} // use XorShift128+ algorithm +$mt_state = new \RNG\MT19937(1234); 
-final class MT19937 implements RNGInterface {} // use MT19937 algorithm +\mt_rand() === \rng_next($mt_state); // true 
-final class OSRNG implements RNGInterface {} // use php_random_bytes() internal API+  
 +// If false, it will return the value as is. This is exactly the same result as $rng->next(). 
 +\mt_rand() === \rng_next($mt_state, false); // false 
 +  
 +// This is useful if you want to use the numbers generated by the RNG directly. 
 +\rng_next(new \RNG\MT19937(1234), false) === 822569775; // true
 </code> </code>
- 
-This implementation is superior to the Type II implementation in the following ways 
- 
-  * User will be able to notice in advance that the function uses RNG. 
-  * RandomInterface is not required. 
-  * Simplifies the internal implementation of the class. 
- 
-However, it is inferior in the following areas 
- 
-  * RNG instances as an argument confuses the user. 
-  * Implementations that handle userland RNGInterface implementations can be complex. 
-    * May not need to support this. 
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
 +A new optional argument will be added to the existing arguments below. If you are using reflection for these functions, this may cause problems.
  
-With the provides of new classes, some class names (or namespaceswill no longer be available in userland.+  * shuffle() 
 +  * str_shuffle() 
 +  * array_rand()
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 185: Line 251:
  
 ==== To Existing Extensions ==== ==== To Existing Extensions ====
-orng [[https://pecl.php.net/package/orng]] : it is a PECL extension that provides almost the same functionality. If the interface is provided by the core in the future, it will need to be supported. And that's me.+none
  
 ==== To Opcache ==== ==== To Opcache ====
Line 197: Line 263:
  
 ===== Open Issues ===== ===== Open Issues =====
-=== Why implement a method that is almost identical to a traditional function? === +=== In other languages, what methods do you use to get around this problem? === 
-It is intended to improve interoperability with conventional code. Users can modify the implementation to be safe against state by simply replacing the existing code.+Similar to this proposal, a class-based implementation is in place.
  
-=== With JIT, won't the userland implementation reach a useful speed? === +  * C# https://docs.microsoft.com/en-us/dotnet/api/system.random 
-Comparing the speed of the userland implementation of XorShift128+ and the orng extension.+  * Java https://docs.oracle.com/javase/8/docs/api/java/util/Random.html
  
-**PHP 8.0** +===== Unaffected PHP Functionality ===== 
-<code shell> +It does not affect any related existing functions.
-$ time php -r 'require __DIR__ . "/vendor/savvot/random/src/AbstractRand.php"; require __DIR__ . "/vendor/savvot/random/src/XorShiftRand.php"; $r new Savvot\Random\XorShiftRand(1234); for ($i 0; $i < 1000000; $i++) { $r->random(); }'+
  
-real 0m0.441s 
-user 0m0.429s 
-sys 0m0.010s 
-</code> 
- 
-**PHP 8.0 + OPcache JIT** 
-<code shell> 
-$ time php -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r 'require __DIR__ . "/vendor/savvot/random/src/AbstractRand.php"; require __DIR__ . "/vendor/savvot/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }' 
- 
-real 0m0.155s 
-user 0m0.139s 
-sys 0m0.015s 
-</code> 
- 
-**PHP 8.0 + orng** 
-<code shell> 
-$ time php -r '$r = new \ORNG\XorShift128Plus(1234); for ($i = 0; $i < 1000000; $i++) { $r->next(); }' 
- 
-real 0m0.056s 
-user 0m0.048s 
-sys 0m0.008s 
-</code> 
- 
-This provides a significant improvement, but still slow from the C implementation. 
- 
-=== Why do we need this feature in the core and not in the extension? === 
-In order to use the features related to pseudo-random numbers that PHP currently provides, an understanding of the core is required. If this proposal is implemented, users will be able to use pseudo-random numbers under the easy to understand concept of objects. This is a useful improvement to the overall functionality of the language. 
- 
-===== Unaffected PHP Functionality ===== 
-It does not affect any related existing functions. (However, in the case of Type II, non-destructive arguments will be added.) 
   * mt_srand()   * mt_srand()
   * mt_rand()   * mt_rand()
-  * shuffle() 
-  * str_shuffle() 
-  * array_rand() 
  
-===== Proposed Voting Choices ===== +===== Vote ===== 
-Yes/No, requiring 2/3 majority+Voting opens 2021-04-01 and 2021-04-15 at 00:00:00 EDT. 2/3 required to accept.
  
-There are a few additional options for implementation. +<doodle title="Add object-scoped RNG" auth="zeriyoshi" voteType="single" closed="true"> 
- +   Yes 
-<doodle title="Which implementation looks good?" auth="user" voteType="single" closed="true"> +   No
-   Type I +
-   Type II +
-</doodle> +
- +
-<doodle title="To which namespace should these classes and interfaces belong?" auth="user" voteType="single" closed="true"> +
-   * Top Level (\RNGInterface) +
-   * "RNG" namespace (\RNG\RNGInterface) +
-   * "PHP\RNG" namespace (\PHP\RNG\RNGInterface)+
 </doodle> </doodle>
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
-  * https://github.com/php/php-src/pull/6568 (Type I [outdated]) +  * https://github.com/php/php-src/pull/6568
-  * https://github.com/zeriyoshi/php-src/tree/object_scope_rng_type2 (Type II) +
- +
-===== References ===== +
-  * [1] https://github.com/savvot/random +
-  * [2] https://github.com/zeriyoshi/php-ext-orng +
-  * [3] https://docs.microsoft.com/en-us/dotnet/api/system.random +
-  * [4] https://docs.oracle.com/javase/8/docs/api/java/util/Random.html +
-  * [5] https://www.easyswoole.com/En/Other/random.html+
rfc/object_scope_prng.txt · Last modified: 2021/04/14 14:38 by zeriyoshi