rfc:rng_extension

Differences

This shows you the differences between two versions of the page.

rfc:rng_extension [2021/09/03 14:02]
zeriyoshi add more example
rfc:rng_extension [2022/08/01 16:52] (current)
timwolla Errata
Line 1: Line 1:
-====== PHP RFC: Random Extension 3.0 ====== +====== PHP RFC: Random Extension 5.x ======
-  * Version: 3.0 +  * Version: 5.x 
-  * Date: 2021-09-02 +  * Date: 2022-02-24 
-  * Author: Go Kudo <zeriyoshi@gmail.com> +  * Author: Go Kudo <zeriyoshi@gmail.com> <g-kudo@colopl.co.jp>
-  * Status: Under Discussion +  * Status: Implemented
-  * Implementation: https://github.com/php/php-src/pull/7453 +  * Implementation: https://github.com/php/php-src/pull/8094
   * First Published at: http://wiki.php.net/rfc/object_scope_prng   * First Published at: http://wiki.php.net/rfc/object_scope_prng
  
 ===== Introduction ===== ===== Introduction =====
  
-Currently, PHP's random number implementation suffers from several problems. +There are several problems with the current implementation of PHP's random functionality, so this RFC proposes some improvements.
  
-The first is that there are many different implementations. Historically, the random number implementations have been separated into lcg.c, rand.c, mt_rand.c, and random.c, and the header file dependencies are complex. +==== Problems ====
  
-Second, the pseudo-random number generator makes use of global state. If a random number is consumed at an unexpected time, the reproducibility of the result may be lost. Look at the following example. +There are four main problems:
 + 
 +  * Global state 
 +  * Mersenne Twister 
 +  * Randomness 
 +  * Internals 
 + 
 +=== Global state === 
 + 
 +Mersenne Twister state is implicitly stored in a global area of PHP, and there is no way for the user to access it, so adding any randomization functions between the seeding and the intended usage would break the code.
 + 
 +Let's say you have the following code.
  
 <code php> <code php>
-echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472 +<?php
-echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290+
  
-function foo(int $seed, callable $bar): int { +function foo(): void {
-    mt_srand($seed); +    // do nothing;
-    $result = mt_rand(); +
-    $bar(); +
-    $result += mt_rand(); +
-    return $result;+
 } }
 +
 +mt_srand(1234);
 +foo();
 +mt_rand(1, 100); // result: 76
 </code> </code>
  
-Reproducibility of random numbers can easily be lost if additional code is added later.+Then at some point in time the function was edited like below.
  
-In addition, the fiber extension was introduced in PHP 8.1. This makes it more difficult to keep track of the execution order. However, this problem has existed since the introduction of Generator. +<code php>
 +<?php
  
-There is also the problem of functions that implicitly use the state stored in PHP's global state. shuffle(), str_shuffle(), and array_rand() functions implicitly advance the state of a random number. This means that the following code is not reproducible, but it is difficult for the user to notice this. +function foo(): void {
 +    str_shuffle('abc'); // added randomization 
 +}
  
-<code php> 
 mt_srand(1234); mt_srand(1234);
-echo mt_rand() . PHP_EOL; // Result: 411284887 +foo(); 
- +mt_rand(1, 100); // result: 65
-mt_srand(1234); +
-str_shuffle('foobar'); +
-echo mt_rand() . PHP_EOL; // Result: 1314500282 +
 </code> </code>
  
-===== Proposal =====+As you can see, the result of mt_rand has changed from 76 to 65 because str_shuffle() changed the state of Mersenne Twister internally.
  
-Clean up the implementation, separate out the random number related functions as Random extension, and add an object scoped API. +Maintaining such code can be difficult when your code utilizes external packages.
 +Also, when using Generator, or Fiber introduced in PHP 8.1, the current state can easily be lost.
  
-All of the following functions will be moved to the newly created Random extension. +Given the above, mt_srand() and srand() cannot provide reproducible values in a consistent manner.
  
-  * lcg_value() +Another problem can occur with extensions like Swoole, which copy the global random state to child processes because of their structure, making random number-related operations unsafe unless the generator is reseeded.
-  * srand() +
-  * rand() +
-  * mt_srand() +
-  * mt_rand() +
-  * random_int() +
-  * random_bytes()+
  
-At the same time, the following internal APIs will also be relocated. If you want to use them, you can simply include ext/random/random.h.+https://wiki.swoole.com/#/getting_started/notice?id=mt_rand%e9%9a%8f%e6%9c%ba%e6%95%b0
  
-  * php_random_int_throw() +=== Mersenne Twister ===
-  * php_random_int_silent() +
-  * php_combined_lcg() +
-  * php_mt_srand() +
-  * php_mt_rand() +
-  * php_mt_rand_range() +
-  * php_mt_rand_common() +
-  * php_srand() +
-  * php_rand() +
-  * php_random_bytes() +
-  * php_random_int()+
  
-The following PHP constants will now be provided by the Random extension +Mersenne Twister is an excellent pseudo random number generator. However, it is old and no longer suitable for current needs.
  
-  * MT_RAND_MT19937 +It has a very long period of 2^19937 - 1. In general, a long period is a good thing, but nevertheless it fails several statistical tests (BigCrush and Crush).
-  * MT_RAND_PHP+
  
-To solve the scope problem, the following classes will be added +Also, the values that Mersenne Twister generates are limited to 32 bits. This does not fit the current situation, where many execution environments are 64-bit and zend_long is 64 bits wide.
  
-  * Random class +=== Randomness ===
-  * Random\NumberGenerator abstract class +
-  * Random\NumberGenerator\XorShift128Plus class +
-  * Random\NumberGenerator\MT19937 class +
-  * Random\NumberGenerator\Secure class+
  
-The Random class is a utility class that provides functionality using random numbers. It provides the following methods, but does not provide an alternative to array_rand because it is too complex. +PHP's built-in functions (<php>shuffle()</php>, <php>str_shuffle()</php>, <php>array_rand()</php>) use Mersenne Twister as the default random number source. This is inappropriate if you need cryptographically secure random numbers. If a similar function that meets that requirement is needed, the user will need to implement a new function using <php>random_int()</php> or similar functions.
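To make the userland workaround mentioned above concrete, here is a minimal sketch of a Fisher-Yates shuffle driven by random_int(). The function name secure_shuffle() is hypothetical and does not appear in the RFC; it only illustrates the kind of code users currently have to write themselves.

```php
<?php

/**
 * Fisher-Yates shuffle that draws indices from random_int(),
 * i.e. from the CSPRNG instead of the global Mersenne Twister.
 * Hypothetical helper for illustration only.
 */
function secure_shuffle(array $items): array
{
    $items = array_values($items); // reindex so positions are 0..n-1
    for ($i = count($items) - 1; $i > 0; $i--) {
        $j = random_int(0, $i); // uniform index in [0, $i]
        [$items[$i], $items[$j]] = [$items[$j], $items[$i]];
    }
    return $items;
}

$shuffled = secure_shuffle(range(1, 10));
```

The result contains the same elements as the input in an unpredictable order; keys are discarded, as with shuffle().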
  
-  * getInt() +=== Internals ===
-  * getBytes() +
-  * shuffleArray() +
-  * shuffleString()+
  
-This class can be used in the following way.+The implementation of random numbers in PHP is scattered within the standard module for historical reasons.
  
-<code php> +The following are different header files, but some are interdependent, which can be very confusing to extension developers.
-// functions +
-mt_srand(1234); +
-mt_rand(); // generate random number +
-mt_rand(1, 10); // generate random number in range +
-str_shuffle("foobar"); // shuffle string +
-$arr = range(1, 10); +
-shuffle($arr); // shuffle array items (pass by reference)+
  
-// object +|                            ^ extension ^ header              ^ source      ^ 
-$mt = new Random\NumberGenerator\MT19937(1234); +^ Combined LCG  | standard    | php_lcg.h           | lcg.c           | 
-$mt->generate(); // generate random number  +^ libc rand*           | standard    | php_rand.h        | rand.c        | 
-$random = new Random($mt); +^ MT19937          | standard    | php_mt_rand.h | mt_rand.c | 
-$random->getInt(1, 10); // generate random number in range +^ CSPRNG             | standard    | php_random.h   | random.c   |
-$random->shuffleString("foobar"); // shuffle string +
-$random->shuffleArray(range(1, 10)); // shuffle array items (pass by value) +
-</code>+
  
-The Random class accepts an instance that inherits from Random\NumberGenerator as a constructor argument. 
  
-This class is final and cannot be cloned, but it can be serialized. +==== Userland approach ====
-This is to prevent $rng from being copied by reference to a property and causing unintended behavior.+
  
-The serializability depends on the serializability of the contained $rng.+Think about how the above problems could be solved in userland. 
 + 
 +Implement a random number generator in PHP. Here I will consider an already existing implementation (https://github.com/savvot/random) and our implementation of XorShift128+.
  
 <code php> <code php>
-final class Random+class XorShift128Plus
 { {
-    private Random\NumberGenerator $randomNumberGenerator; +    /* constants */
- +    protected const MASK_S5 = 0x07ffffffffffffff; 
-    public function __construct(?Random\NumberGenerator $randomNumberGenerator = null) {} +    protected const MASK_S18 = 0x00003fffffffffff; 
-    public function getNumberGenerator(): Random\NumberGenerator {} +    protected const MASK_S27 = 0x0000001fffffffff; 
-    public function getInt(int $min, int $max): int {} +    protected const MASK_S30 = 0x00000003ffffffff;
-    public function getBytes(int $length): string {} +    protected const MASK_S31 = 0x00000001ffffffff; 
-    public function shuffleArray(array $array): array {} +    protected const MASK_LO = 0x00000000ffffffff; 
-    public function shuffleString(string $string): string {} +  
- +    protected const ADD_HI = 0x9e3779b9; 
-    public function __serialize(): array {} +    protected const ADD_LO = 0x7f4a7c15; 
-    public function __unserialize(array $data): void {}+    protected const MUL1_HILO = 0x476d; 
 +    protected const MUL1_HIHI = 0xbf58; 
 +    protected const MUL1_LO = 0x1ce4e5b9; 
 +    protected const MUL2_HIHI = 0x94d0; 
 +    protected const MUL2_HILO = 0x49bb; 
 +    protected const MUL2_LO = 0x133111eb; 
 +  
 +    /* states */ 
 +    protected int $s0; 
 +    protected int $s1;
 +  
 +    public function __construct(int $seed) 
 +    { 
 +        $s = $seed; 
 +        $this->s0 = $this->splitmix64($s); 
 +        $this->s1 = $this->splitmix64($s); 
 +    } 
 +  
 +    public function generate(): int 
 +    {
 +        $s1 = $this->s0; 
 +        $s0 = $this->s1; 
 +  
 +        $s0h = ($s0 >> 32) & self::MASK_LO; 
 +        $s0l = $s0 & self::MASK_LO; 
 +        $s1h = ($s1 >> 32) & self::MASK_LO;
 +        $s1l = $s1 & self::MASK_LO; 
 +        $zl = $s0l + $s1l; 
 +        $zh = $s0h + $s1h + ($zl >> 32); 
 +        $z = ($zh << 32) | ($zl & self::MASK_LO); 
 +  
 +        $this->s0 = $s0; 
 +        $s1 ^= $s1 << 23; 
 +        $this->s1 = $s1 ^ $s0 ^ (($s1 >> 18) & self::MASK_S18) ^ (($s0 >> 5) & self::MASK_S5); 
 +  
 +        return $z; 
 +    } 
 +  
 +    protected function splitmix64(int &$s): int 
 +    {
 +        $zl = $s & self::MASK_LO;
 +        $zh = ($s >> 32) & self::MASK_LO;
 +        $carry = $zl + self::ADD_LO;
 +        $z = $s = (($zh + self::ADD_HI + ($carry >> 32)) << 32) | ($carry & self::MASK_LO);
 +  
 +        $z ^= ($z >> 30) & self::MASK_S30;
 +        $zl = $z & self::MASK_LO;
 +        $zh = ($z >> 32) & self::MASK_LO;
 +        $lo = self::MUL1_LO * $zl; 
 +        $zll = $zl & 0xffff; 
 +        $zlh = $zl >> 16; 
 +        $mul1l = $zll * self::MUL1_HILO; 
 +        $mul1h = $zll * self::MUL1_HIHI + $zlh * self::MUL1_HILO + (($mul1l >> 16) & 0xffff); 
 +        $mul1 = (($mul1h & 0xffff) << 16) | ($mul1l & 0xffff); 
 +        $mul2 = ((self::MUL1_LO * $zh) & self::MASK_LO); 
 +        $carry = (($lo >> 32) & self::MASK_LO); 
 +        $hi = $mul1 + $mul2 + $carry; 
 +        $z = ($hi << 32) | ($lo & self::MASK_LO); 
 +  
 +        $z ^= ($z >> 27) & self::MASK_S27; 
 +        $zl = $z & self::MASK_LO; 
 +        $zh = ($z >> 32) & self::MASK_LO; 
 +        $lo = self::MUL2_LO * $zl; 
 +  
 +        $zll = $zl & 0xffff; 
 +        $zlh = $zl >> 16; 
 +        $mul1l = $zll * self::MUL2_HILO; 
 +        $mul1h = $zll * self::MUL2_HIHI + $zlh * self::MUL2_HILO + (($mul1l >> 16) & 0xffff); 
 +        $mul1 = (($mul1h & 0xffff) << 16) | ($mul1l & 0xffff); 
 +  
 +        $mul2 = (self::MUL2_LO * $zh) & self::MASK_LO; 
 +        $carry = ($lo >> 32) & self::MASK_LO; 
 +        $hi = $mul1 + $mul2 + $carry; 
 +        $z = ($hi << 32) | ($lo & self::MASK_LO); 
 +  
 +        return $z ^ (($z >> 31) & self::MASK_S31); 
 +    } 
 +}
 +  
 +$xs128pp = new \XorShift128Plus(1234); 
 +  
 +// Benchmarking 
 +for ($i = 0; $i < 1000000000; $i++) { 
 +    $xs128pp->generate();
 } }
 </code> </code>
  
-The Random\NumberGenerator abstract class has a single abstract method called generate(). +Compare the speed of these implementations with PHP's mt_rand().
 + 
 +|                                ^ PHP - XorShift128+ (iter:1000000000) ^ PHP - MtRand (savvot/random) (iter: 10000000) ^ Native - MT (iter: 10000000) ^ 
 +^ PHP 8.1                | 0m3.218s                                            | 0m4.161s                                                           | 0m0.160s                                 | 
 +^ PHP 8.1 with JIT  | 0m1.836s (64M buffer)                    | 0m2.184s (64M buffer)                                    | 0m0.184s (64M buffer)         | 
 + 
 +The native implementation is much faster than the userland ones, even with JIT enabled.
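For readers who want to repeat such a comparison, the following is a rough timing sketch. It is not the harness used to produce the table above; the helper name time_calls() and the iteration count are assumptions chosen for illustration, and calling through a callable adds overhead that a hand-written loop would not have.

```php
<?php

/**
 * Run $fn $iterations times and return the elapsed time in seconds.
 * Hypothetical helper; uses the monotonic hrtime() clock (PHP 7.3+).
 */
function time_calls(callable $fn, int $iterations): float
{
    $start = hrtime(true); // nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

mt_srand(1234);
$elapsed = time_calls('mt_rand', 100000);
printf("mt_rand x 100000: %.4fs\n", $elapsed);
```

A userland generator can be timed the same way by passing a closure such as `fn () => $xs128pp->generate()`, so both measurements carry the same call overhead.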
 + 
 +More about this can be read here: https://externals.io/message/115918#115959 
 + 
 +===== Proposal ===== 
 + 
 +Create a single Randomizer class which provides various randomization methods (like get int/bytes, shuffle string/arrays). This class will take an Engine interface in the constructor which can be swapped based on the user's needs. Some essential RNG engines will be prepackaged for convenience, but an interface will also be provided so that algorithms can be easily added.
 + 
 +I believe this proposal has the following benefits. 
 + 
 +=== Swapping RNG Based on Environment === 
 + 
 +The appropriate RNG can be selected depending on the environment. 
 + 
 +For example, say you want to use a seeded PRNG in development, but would like to use a CSPRNG in production. This would be easily achievable with the following code.
  
 <code php> <code php>
-namespace Random;+$rng = $is_production 
 +    ? new Random\Engine\Secure() 
 +    : new Random\Engine\PCG64(1234); 
 +  
 +$randomizer = new Random\Randomizer($rng); 
 +$randomizer->shuffleString('foobar'); 
 +</code>
  
-abstract class NumberGenerator +=== Fixed Random Number Sequence === 
-{ + 
-        abstract public function generate(): int {}+Processes that continue to generate random numbers until certain requirements are met may make it difficult to measure the processing load. 
 + 
 +<code php> 
 +$required_result = mt_rand(1, 100); 
 +while (($generated = mt_rand(1, 100)) !== $required_result) { 
 +    echo "retry\n";
 } }
 +
 +echo "done\n";
 </code> </code>
  
-By defining a class that extends Random\NumberGenerator, the user can use their own random number generator. With the introduction of JIT in PHP 8.0, this can generate random numbers at a realistic speed. +The Engine interface and dynamic injection allow fixed sequences to be used at test time.
  
 <code php> <code php>
-class UserDefinedRNG extends Random\NumberGenerator +$engine = new class () implements Random\Engine {
-{ +    public function generate(): string
-    protected int $current = 0; +
- +
-    public function generate(): int+
     {     {
-        return ++$this->current;+        // Result must be a string. 
 +        return pack('V', 1);
     }     }
-} +};
 +$randomizer = new Random\Randomizer($engine);
  
-function foobar(Random\NumberGenerator $numberGenerator): void { +$required_result = $randomizer->getInt(1, 100); 
-    for ($i = 0; $i < 9; $i++) { +while (($generated = $randomizer->getInt(1, 100)) !== $required_result) {
-        echo $numberGenerator->generate(); +    echo "retry\n";
-    }+
 } }
  
-foobar(new UserDefinedRNG()); // Results: 123456789 +echo "done\n";
 </code> </code>
  
-It is also useful when you want to use a random number sequence with a fixed result, such as in testing.+=== Cryptographically Secure Random Operations ===
  
-The Random class creates and uses an instance of the default random number generator, Random\NumberGenerator\XorShift128Plus, if the constructor argument is omitted.+Shuffling strings and arrays using CSPRNG (or any other RNG besides Mersenne Twister) was only achievable by implementing it in userland. This can now be done without writing userland code.
  
-XorShift128Plus is an efficient, high-quality algorithm used in modern browsers and other applications. This algorithm is capable of generating a wider range of random numbers in a 64-bit environment. In a 32-bit environment, the range beyond zend_long will simply be truncated. This indicates incompatibility between environments, but is acceptable for real-world use.+<code php> 
 +$engine = new Random\Engine\Secure(); 
 +$randomizer = new Random\Randomizer($engine);
  
-The Random\NumberGenerator\MT19937 class, which implements the MT19937 Mersenne twister, is also provided for backward compatibility or when higher period is required. However, a 1-bit right shift is required to obtain exactly the same result as mt_rand(), as shown below. This is due to historical reasons. +$items = range(1, 10);
 +$items = $randomizer->shuffleArray($items); 
 +</code> 
 + 
 +=== State safe === 
 + 
 +Since the scope is limited to the engine instance, unintentional state changes caused by things such as external packages and Fiber are completely prevented.  
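The effect of instance-scoped state can be sketched without the new extension. The ToyEngine class below is a hypothetical stand-in, not one of the engines proposed in this RFC: it is a small Lehmer (MINSTD) generator that keeps its state in a property, and it assumes a 64-bit PHP build so that the multiplication stays within the integer range.

```php
<?php

/**
 * Toy Lehmer (MINSTD) generator with per-instance state.
 * Illustrative only; assumes a 64-bit PHP build.
 */
final class ToyEngine
{
    private int $state;

    public function __construct(int $seed)
    {
        // Map any seed into the valid state range 1..2147483646.
        $this->state = ($seed % 2147483646 + 2147483646) % 2147483646 + 1;
    }

    public function generate(): string
    {
        $this->state = ($this->state * 48271) % 2147483647;
        return pack('V', $this->state); // 4 bytes, little-endian
    }
}

$a = new ToyEngine(1234);
$b = new ToyEngine(1234);
$other = new ToyEngine(42);

$first = bin2hex($a->generate());
$other->generate();  // unrelated consumer; does not touch $a or $b
str_shuffle('abc');  // advances only the global Mersenne Twister state
$same = bin2hex($b->generate());

var_dump($first === $same); // bool(true)
```

Because each instance owns its state, neither the other instance nor the call to str_shuffle() changes what $b produces, which is the property the Engine classes are meant to provide.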
 + 
 +==== Approach ==== 
 + 
 +Implement the following new interfaces and classes. 
 + 
 +=== interface Random\Engine === 
 + 
 +Interface to provide random number generator engine. 
 + 
 +It has a single <php>generate(): string</php> method that generates random numbers as a binary string. This string must be non-empty, and attempting to return an empty string will result in a RuntimeException.
 + 
 +If you implement a random number generator in PHP, the generated numbers must be converted to binary using the <php>pack()</php> function, and the values must be little-endian.
 + 
 +Engine::generate() always returns the result as a string, so it is independent of the bit width and endianness of the execution environment and always returns the same result for the same seed and sequence.
 + 
 +However, if a string of 64 bits or more is returned, it may be truncated for internal processing reasons. This currently applies only to user-defined classes.
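The little-endian requirement can be checked with pack() directly. The sketch below uses only the documented format codes 'V' (unsigned 32-bit, little-endian) and 'P' (unsigned 64-bit, little-endian); the sample values are arbitrary, and the 64-bit part assumes a 64-bit PHP build.

```php
<?php

// pack('V', ...) encodes an unsigned 32-bit integer as 4 little-endian bytes.
$bytes = pack('V', 0x12345678);
var_dump(bin2hex($bytes)); // string(8) "78563412"

// unpack() with the same format code reverses the encoding.
$value = unpack('V', $bytes)[1];
var_dump($value === 0x12345678); // bool(true)

// Two 32-bit halves, low word first, form a 64-bit little-endian value.
$lo = 0x89abcdef;
$hi = 0x01234567;
$bytes64 = pack('V2', $lo, $hi);
var_dump(bin2hex($bytes64)); // string(16) "efcdab8967452301"

// On a 64-bit build, pack('P', ...) produces the same 8 bytes in one step.
var_dump($bytes64 === pack('P', 0x0123456789abcdef)); // bool(true)
```

A userland engine's generate() method would return a string built this way, so that the same bytes come out regardless of the host's native byte order.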
 + 
 +=== interface Random\CryptoSafeEngine === 
 + 
 +A marker interface to indicate that the implemented random number generator is cryptographically secure. 
 + 
 +=== interface Random\SerializableEngine === 
 + 
 +An interface indicating that the implemented random number generator is serializable. 
 + 
 +The following methods must be implemented: 
 + 
 +  * <php>__serialize(): array</php> 
 +  * <php>__unserialize(array $data): void</php> 
 + 
 +=== class Random\Engine\CombinedLCG === 
 + 
 +Generate random numbers using the CombinedLCG algorithm. 
 + 
 +By passing a value to the constructor, it can be seeded with any value. If omitted or null, the seed value is generated by CSPRNG.
 + 
 +The following interfaces are implemented: 
 + 
 +  * <php>Random\Engine</php> 
 +  * <php>Random\SerializableEngine</php> 
 + 
 +The following methods are implemented: 
 + 
 +  * <php>__construct(int|null $seed = null)</php> 
 +  * <php>generate(): string</php> 
 +  * <php>__serialize(): array</php> 
 +  * <php>__unserialize(array $data): void</php> 
 + 
 +The values generated by calling CombinedLCG::generate() with the same seed will always return the same sequence of results, e.g.:
  
 <code php> <code php>
 $seed = 1234; $seed = 1234;
  
-$mt = new Random\NumberGenerator\MT19937($seed); +$engine = new \Random\Engine\CombinedLCG($seed); 
-mt_srand($seed); +var_dump(bin2hex($engine->generate())); // "fc6ff102" 
-var_dump(mt_rand() === ($mt->generate() >> 1)); // true+var_dump(bin2hex($engine->generate())); // "40e0ce05" 
 + 
 +// same seed results in same sequence of results. 
 +$engine = new \Random\Engine\CombinedLCG($seed);
 +var_dump(bin2hex($engine->generate())); // "fc6ff102" 
 +var_dump(bin2hex($engine->generate())); // "40e0ce05"
 </code> </code>
  
-The following NumberGenerator classes support serialization. Secure is not serializable because it uses random_bytes internally and has no state. +=== class Random\Engine\MersenneTwister ===
  
-  * Random\NumberGenerator\XorShift128Plus +Generate random numbers using the MT19937 (a.k.a Mersenne Twister) algorithm.
-  * Random\NumberGenerator\MT19937 +
-  * Random\NumberGenerator extends user-defined classes.+
  
-Also, new internal API will be implemented. +By passing a value to the constructor, it can be seeded with any value. If omitted or null, the seed value is generated by CSPRNG.
 +Passing MT_RAND_PHP as the second argument allows the use of PHP's broken Mersenne Twister.
  
-  * php_random_ng_next() +The following interfaces are implemented:
-  * php_random_ng_range() +
-  * php_random_ng_array_data_shuffle() +
-  * php_random_ng_string_shuffle()+
  
-A Stub showing these implementations can be found on the Pull-Request. It's probably easier to understand if you look at it.+  * <php>Random\Engine</php> 
 +  * <php>Random\SerializableEngine</php>
  
-  * [[https://github.com/php/php-src/blob/7a4ef6ccfbf4a2cd48a4f261f2911ebb7b057d46/ext/random/random.stub.php|random.stub.php]]+The following methods are implemented:
  
-===== Future Scope ===== +  * <php>__construct(int|null $seed = null, int $mode = MT_RAND_MT19937)</php>
 +  * <php>generate(): string</php> 
 +  * <php>__serialize(): array</php> 
 +  * <php>__unserialize(array $data): void</php>
  
-This proposal is just a first step to improve the situation of PHP's random number implementation.+The values generated by calling MersenneTwister::generate() with the same seed will always return the same sequence of results, e.g.:
  
-If this proposal is approved, I will then propose the following changes+<code php> 
 +$seed = 1234;
  
-  * Replace the state of the existing implementation with php_random_ng. +$engine = new \Random\Engine\MersenneTwister($seed);
-  * Replace random_bytes() with random_bytes() for random numbers used in shuffle(), str_shuffle(), and array_rand(). +var_dump(bin2hex($engine->generate())); // "2f6b0731" 
-  * Deprecate srand() and mt_srand() (step by step)+var_dump(bin2hex($engine->generate())); // "d3e2667f"
  
-===== Backward Incompatible Changes =====+// same seed results in same sequence of results. 
 +$engine = new \Random\Engine\MersenneTwister($seed);
 +var_dump(bin2hex($engine->generate())); // "2f6b0731" 
 +var_dump(bin2hex($engine->generate())); // "d3e2667f" 
 +</code>
  
-The code that includes the following header file needs to be changed to ext/random/random.h+=== class Random\Engine\PCG64 ===
  
-  * ext/standard/lcg.h +Generate random numbers using the PCG64 (Permuted Congruential Generator, pcg_oneseq_128) algorithm.
-  * ext/standard/rand.h +
-  * ext/standard/mt_rand.h +
-  * ext/standard/random.h+
  
-The following class names have been reserved and will no longer be available+By passing a value to the constructor, it can be seeded with any value. If omitted or null, the seed value is generated by CSPRNG. 
 +A string can also be passed as the seed value, and a string is required to seed with 64-bit or higher values. The string must be 128-bit.
  
-  * "Random" +The following interfaces are implemented:
-  * "Random\NumberGenerator" +
-  * "Random\NumberGenerator\XorShift128Plus" +  * <php>Random\Engine</php>
-  * "Random\NumberGenerator\MT19937" +  * <php>Random\SerializableEngine</php> 
-  * "Random\NumberGenerator\Secure"+ 
 +The following methods are implemented: 
 + 
 +  * <php>__construct(string|int|null $seed = null)</php> 
 +  * <php>generate(): string</php> 
 +  * <php>jump(int $advance): void</php> 
 +  * <php>__serialize(): array</php> 
 +  * <php>__unserialize(array $data): void</php> 
 + 
 +PCG64::jump() can be used to advance the state an arbitrary number of times. 
 + 
 +The values generated by calling PCG64::generate() with the same seed will always return the same sequence of results, e.g.: 
 + 
 +<code php> 
 +$seed = 1234; 
 + 
 +$engine = new \Random\Engine\PCG64($seed); 
 +var_dump(bin2hex($engine->generate())); // "ecfbe5990a319380" 
 +var_dump(bin2hex($engine->generate())); // "4f6b4a5b53b10e3f" 
 + 
 +// same seed results in same sequence of results. 
 +$engine = new \Random\Engine\PCG64($seed); 
 +var_dump(bin2hex($engine->generate())); // "ecfbe5990a319380" 
 +var_dump(bin2hex($engine->generate())); // "4f6b4a5b53b10e3f" 
 +</code> 
 + 
 +=== class Random\Engine\Secure === 
 + 
 +Secure::generate() cannot be seeded, and its output is non-reproducible because it is based on lower-layer true random numbers.
 + 
 +Random numbers generated by this class are guaranteed to be cryptographically secure.
 + 
 +The following interfaces are implemented: 
 + 
 +  * <php>Random\Engine</php> 
 +  * <php>Random\CryptoSafeEngine</php> 
 + 
 +The following methods are implemented: 
 + 
 +  * <php>__construct()</php> 
 +  * <php>generate(): string</php> 
 + 
 +The sequence generated by Secure::generate() is not reproducible in any way. 
 + 
 +=== final class Random\Randomizer === 
 + 
 +A single class that performs random-number operations using the given engine.
 + 
 +The following methods are implemented: 
 + 
 +  * <php>__construct(Random\Engine $engine = new Random\Engine\Secure())</php> 
 +  * <php>getInt(): int // replaces mt_rand()</php> 
 +  * <php>getInt(int $min, int $max): int // replaces mt_rand() and random_int()</php>
 +  * <php>getBytes(int $length): string // replaces random_bytes()</php>
 +  * <php>shuffleArray(array $array): array // replaces shuffle()</php> 
 +  * <php>shuffleString(string $string): string // replaces str_shuffle()</php> 
 +  * <php>__serialize(): array</php> 
 +  * <php>__unserialize(array $data): void</php> 
 + 
 +The engine in the constructor is optional, and if it is omitted, Secure is automatically used. 
 + 
 +This class is serializable, but will fail if the given Engine is not serializable. 
 + 
 +The Randomizer calls the $this->engine->generate() method to generate values. However, natively implemented engines are not limited to this path, and the Randomizer uses faster internal methods to obtain values from them.
 + 
 +Randomizer methods are fully reproducible in any environment as long as the result of Engine::generate() is consistent. This means that it is independent of the endianness and bit depth of the execution environment, guaranteeing future compatibility. 
 + 
 +However, if Randomizer::getInt() with no arguments is executed in a 32-bit environment using an Engine that generates values above 32-bit, a RuntimeException will be thrown to avoid incompatibility. 
 + 
 +The engines implement a specific well-defined random number generator. 
 +For a given seed it is guaranteed that they return the same sequence as the reference implementation. 
 +For the Randomizer it is considered a breaking change if the observable behavior of the methods changes.  
 +For a given seeded engine and identical method parameters the following must hold: 
 + 
 +  * The number of calls to the Engine::generate() method remains the same. 
 +  * The return value remains the same for a given result retrieved from Engine::generate(). 
 + 
 +Any changes to the Randomizer that violate these guarantees require a separate RFC. 
 + 
 +===== PRNG shootout ===== 
 + 
 +Since MT19937 has the aforementioned problems, an alternative algorithm must be chosen. 
 + 
 +When introducing a new RNG algorithm, the selection of the algorithm is very important. The following table shows the RNG algorithms that I considered and their characteristics. 
 + 
 +|                                      ^ Generate size  ^ State size     ^ Performance                          ^ Issues                                   ^ Implemented applications   ^ 
 +^ MT19937                   | 32-bit                | 32-bit x 624  | Normal                                    | Fails some statistical tests  | PHP, Python, Ruby                 | 
 +^ XorShift128+             | 64-bit                | 64-bit x 2       | Excellent                                | Fails BigCrush                     | V8, SpiderMonkey, JavaScriptCore | 
 +^ Xoshiro256++            | 64-bit                | 64-bit x 4       | Excellent, SIMD-friendly        | Currently none                       | Rust                                        | 
 +^ Xoshiro256**             | 64-bit                | 64-bit x 4       | Excellent                                | Currently none                       | Rust, .NET 6.0                       | 
 +^ PCG64 (XSL-RR)        | 64-bit                 | 128-bit x 2    | Good                                       | Currently none                       | Rust, NumPy                          | 
 + 
 +MT19937 and XorShift128+ are already widely used, but they have failed several statistical tests and are not recommended for new use. 
 +So I adopted a more modern PRNG called PCG64 which does not have any statistical test problems. 
 + 
 +PCG64 is the only implementation of a reproducible RNG, except for MT19937 for compatibility and the special uses User and Secure. 
 + 
 +PCG64 (pcg_state_oneseq_128 XSL-RR) looked like a good fit since it uses 64-bit wide values. 
 +PCG64 uses 128-bit integers, which cannot be used natively in 32-bit environments and needs to be emulated,  
 +but I think this is not a problem since most environments are now using 64-bit architectures. 
 + 
 +I also considered Xoshiro256** but chose PCG because all of the issues raised against PCG64 appeared to have been resolved appropriately. 
 +The issues raised and the inventor's discussion of them can be found at 
 + 
 +  * https://pcg.di.unimi.it/pcg.php 
 +  * https://www.pcg-random.org/posts/on-vignas-pcg-critique.html 
 + 
 +It is interesting to note that the authors of these algorithms have heavily criticized each other's work.
 +Both opinions were respectable, which made the selection process very difficult. 
 + 
 +I considered implementing both, but adding unnecessary choices would have confused users, so the idea was dropped. If anyone thinks the other one is needed, it can be added through a PHP extension.
 + 
 +===== Internal Changes ===== 
 + 
 +As a side effect of this RFC, the following PHP functions have been moved to the new ext/random extension. 
 + 
 +  * lcg_value() 
 +  * srand() 
 +  * rand() 
 +  * mt_srand() 
 +  * mt_rand() 
 +  * random_int() 
 +  * random_bytes() 
 + 
 +The following internal APIs will also be moved to the ext/random extension: 
 + 
 +  * php_random_int_throw() 
 +  * php_random_int_silent() 
 +  * php_combined_lcg() 
 +  * php_mt_srand() 
 +  * php_mt_rand() 
 +  * php_mt_rand_range() 
 +  * php_mt_rand_common() 
 +  * php_srand() 
 +  * php_rand() 
 +  * php_random_bytes() 
 +  * php_random_int() 
 + 
 +This is necessary because ext/standard/random.c already uses the name RANDOM, so a separate extension could not use that name. In addition, all RNG-related implementations will be moved to the new random extension in order to standardize the RNG implementation.
 + 
 +The following header files are left in for extension compatibility.  
 + 
 +  * ext/standard/php_lcg.h 
 +  * ext/standard/php_rand.h 
 +  * ext/standard/php_mt_rand.h 
 +  * ext/standard/php_random.h 
 + 
 +Each of these headers now simply includes ext/random/php_random.h.
 + 
 +<code c> 
 +#include "ext/random/php_random.h" 
 +</code> 
 + 
 +===== Future Scope ===== 
 + 
 +These are not within the scope of this RFC, but are worth considering in the future: 
 + 
 +  * Remove old header files for compatibility (php_lcg.h, php_rand.h, php_mt_rand.h, php_random.h) 
 +  * Deprecate lcg_value(), mt_srand(), srand() 
 + 
 +===== Backward Incompatible Changes ===== 
 + 
 +The following names have been reserved and will no longer be available:
 + 
 +  * Random 
 +  * Random\Engine 
 +  * Random\CryptoSafeEngine 
 +  * Random\SerializableEngine 
 +  * Random\Engine\CombinedLCG 
 +  * Random\Engine\MersenneTwister 
 +  * Random\Engine\PCG64 
 +  * Random\Engine\Secure 
 +  * Random\Randomizer
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 236: Line 563:
  
 ==== To Existing Extensions ==== ==== To Existing Extensions ====
-none+In the future, it may be necessary to change the included header files to point to ext/random/php_random.h. However, compatibility will be maintained for now.
  
 ==== To Opcache ==== ==== To Opcache ====
Line 251: Line 578:
  
 ===== Vote ===== ===== Vote =====
-Voting opens 2021-MM-DD and closes 2021-MM-DD at 00:00:00 EDT. 2/3 required to accept. +Voting opens 2022-06-14 and closes 2022-06-28 at 00:00:00 UTC. 2/3 required to accept.
  
-<doodle title="Add Random class" auth="zeriyoshi" voteType="single" closed="true"> +<doodle title="Add Random extension" auth="zeriyoshi" voteType="single" closed="true">
    * Yes    * Yes
    * No    * No
Line 259: Line 586:
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
-  * https://github.com/php/php-src/pull/7453+  * https://github.com/php/php-src/pull/8094 
 + 
 +===== Errata ===== 
 + 
 +==== Follow Up RFC: Random Extension Improvement ==== 
 + 
 +The [[rfc:random_extension_improvement|Random Extension Improvement]] follow-up RFC made several adjustments before the initial implementation was merged.  
 + 
 +==== Split of Randomizer::getInt() into ::getInt() and ::nextInt() ==== 
 + 
 +The parameter-less variant of <php>Randomizer::getInt()</php> was split into <php>Randomizer::nextInt()</php>. See pull request GH-9057: https://github.com/php/php-src/pull/9057
rfc/rng_extension.1630677749.txt.gz · Last modified: 2021/09/03 14:02 by zeriyoshi