rfc:rng_extension

Differences

rfc:rng_extension [2021/09/03 14:28]
zeriyoshi
rfc:rng_extension [2022/02/15 11:37]
zeriyoshi add stub, user-land limitations, open issues
Line 1: Line 1:
-====== PHP RFC: Random Extension 3.0 ====== +====== PHP RFC: Random Extension 4.0 ====== 
-  * Version: 3.0 +  * Version: 4.0.1 
-  * Date: 2021-09-02 +  * Date: 2022-02-14 
-  * Author: Go Kudo <zeriyoshi@gmail.com>+  * Author: Go Kudo <zeriyoshi@gmail.com> <g-kudo@colopl.co.jp>
   * Status: Under Discussion   * Status: Under Discussion
-  * Implementation: https://github.com/php/php-src/pull/7453+  * Implementation: https://github.com/php/php-src/pull/8094
   * First Published at: http://wiki.php.net/rfc/object_scope_prng   * First Published at: http://wiki.php.net/rfc/object_scope_prng
  
 ===== Introduction ===== ===== Introduction =====
  
-Currently, PHP's random number implementation suffers from several problems.+PHP implements several useful RNGs. However, they are currently only available in the global scope.
  
-The first is that there are many different implementations. Historically, the random number implementations have been separated into lcg.c, rand.c, mt_rand.c, random.c respectively, and the header file dependencies are complex.+Mersenne Twister, PHP's default RNG, provides a function mt_srand() to initialize with a user-specified seed value, but the scope is global, which may cause unintended user behavior.
  
-Second, the pseudo-random number generator makes use of global state. If a random number is consumed at an unexpected time, the reproducibility of the result may be lost. Look at the following example.+When a user executes mt_srand(), one would expect it to affect only the result of mt_rand(). However, the following functions also implicitly affect the result of mt_rand():
  
-<code php> +  * shuffle() 
-echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472 +  * str_shuffle() 
-echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290+  * array_rand()
  
-function foo(int $seed, callable $bar): int { +For example, in the following code, the result of the second mt_rand() is not reproducible. This is because shuffle() uses a MT RNG internally, which changes the state.
-    mt_srand($seed); +
-    $result = mt_rand(); +
-    $bar(); +
-    $result += mt_rand(); +
-    return $result; +
-} +
-</code> +
- +
-Reproducibility of random numbers can easily be lost if additional code is added later. +
- +
-In addition, the fiber extension was introduced in PHP 8.1. This makes it more difficult to keep track of the execution order. However, this problem has existed since the introduction of Generator. +
- +
-There is also the problem of functions that implicitly use the state stored in PHP's global state. shuffle(), str_shuffle(), and array_rand() functions implicitly advance the state of a random number. This means that the following code is not reproducible, but it is difficult for the user to notice this.+
  
 <code php> <code php>
 mt_srand(1234); mt_srand(1234);
-echo mt_rand() . PHP_EOL; // Result: 411284887+$next = mt_rand();
  
 mt_srand(1234); mt_srand(1234);
-str_shuffle('foobar'); +$arr = range(0, 9); 
-echo mt_rand() . PHP_EOL; // Result: 1314500282+shuffle($arr); 
 +$next2 = mt_rand();
 + 
 +die("next: ${next}, next2: ${next2}"); // next: 411284887, next2: 1848274264
 </code> </code>
  
-===== Proposal =====+These behaviors were unintuitive and often led to unintended execution results, but were not that problematic for general web application use.
  
-Clean up the implementation, separate out the random number related functions as Random extension, and add an object scoped API.+However, in more complex and repeatable applications (such as games), this can be a problem.
  
-All of the following functions will be moved to the newly created Random extension.+There is also the issue of state management difficulties with Fiber, which was added in PHP 8.1. Nikita had this to say:
  
-  * lcg_value() +https://externals.io/message/115918#115959
-  * srand() +
-  * rand() +
-  * mt_srand() +
-  * mt_rand() +
-  * random_int() +
-  * random_bytes()+
  
-At the same time, the following internal APIs will also be relocated. If you want to use them, you can simply include ext/random/random.h.+In addition, the Mersenne Twister can only generate 32-bit values. 
 +In recent years, many of the environments where PHP runs have been migrating to 64-bit platforms. 
 +In order to generate more secure values, an RNG that can generate 64-bit wide values should be provided by the language.
  
-  * php_random_int_throw() +===== Proposal =====
-  * php_random_int_silent() +
-  * php_combined_lcg() +
-  * php_mt_srand() +
-  * php_mt_rand() +
-  * php_mt_rand_range() +
-  * php_mt_rand_common() +
-  * php_srand() +
-  * php_rand() +
-  * php_random_bytes() +
-  * php_random_int()+
  
-The following PHP constants will now be provided by the Random extension+Implement the XorShift128Plus algorithm for generating new 64-bit wide random numbers, along with a random extension that includes an object scope RNG, and bundle it with PHP. 
 +XorShift128Plus is a fast, high-quality RNG that is proven in major web browsers.  
 +Many of the major hardware architectures are now 64-bit, so it makes sense to use this RNG.
  
-  * MT_RAND_MT19937 +In addition to the new algorithm, the following classes will be added to fix the global scope issue.
-  * MT_RAND_PHP+
  
-To solve the scope problem, the following classes will be added+  * class Random\NumberGenerator\XorShift128Plus 
 +  * class Random\NumberGenerator\MersenneTwister 
 +  * class Random\NumberGenerator\CombinedLCG 
 +  * class Random\NumberGenerator\Secure (aka php_random_bytes())
  
-  * Random class +These classes will hold independent RNG state and will not affect the global scope.
-  * Random\NumberGenrator abstract class +
-  * Random\NumberGenerator\XorShift128Plus class +
-  * Random\NumberGenerator\MT19937 class +
-  * Random\NumberGenerator\Secure class+
  
-The Random class is a utility class that provides functionality using random numbers. It provides the following methods, but does not provide an alternative to array_rand because it is too complex.+RNGs other than XorShift128Plus are based on the RNGs currently implemented in PHP.
  
-  * getInt() +An interface Random\NumberGenerator is also added and are implmeneted by the classes above. 
-  * getBytes() +
-  * shuffleArray() +
-  * shuffleString()+
  
-This class can be used in the following way.+This interface has only a single generate() method, which makes it possible to switch between RNG implementations depending on the situation 
 +and allows alternative implementations to be written in userland PHP. This is useful, for example, when running tests.
  
 <code php> <code php>
-// functions +final class FixedForUnitTest implements \Random\NumberGenerator
-mt_srand(1234); +
-mt_rand(); // generate random number +
-mt_rand(1, 10); // generate random number in range +
-str_shuffle("foobar"); // shuffle string +
-$arr = range(1, 10); +
-shuffle($arr); // shuffle array items (pass by reference) +
- +
-// object +
-$mt = new Random\NumberGenerator\MT19937(1234); +
-$mt->generate(); // generate random number  +
-$random = new Random($mt); +
-$random->getInt(1, 10); // generate random number in range +
-$random->shuffleString("foobar"); // shuffle string +
-$random->shuffleArray(range(1, 10)); // shuffle array items (pass by value) +
-</code> +
- +
-The Random class accepts an instance that inherits from Random\NumberGenerator as a constructor argument. +
- +
-This class is final and cannot be cloned, but it can be serialized. +
-This is to prevent $rng from being copied by reference to a property and causing unintended behavior. +
- +
-The serializability depends on the serializability of the contained $rng. +
- +
-<code php> +
-final class Random+
 { {
-    private Random\NumberGenerator $randomNumberGenerator; +    private int $count = 0;
- +     
-    public function __construct(?Random\NumberGenerator $randomNumberGenerator = null) {} +    public function generate(): int 
-    public function getNumberGenerator(): Random\NumberGenerator {} +    { 
-    public function getInt(int $min, int $max): int {} +        return ++$this->count; 
-    public function getBytes(int $length): string {} +     }
-    public function shuffleArray(array $array): array {} +
-    public function shuffleString(string $string): string {} +
- +
-    public function __serialize(): array {} +
-    public function __unserialize(array $data): void {}+
 } }
 </code> </code>
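As a rough illustration of switching implementations behind the interface, the sketch below assumes a stand-in interface named NumberGeneratorSketch (hypothetical; the proposed Random\NumberGenerator would be provided by the extension) and a hypothetical consumer roll_die() that depends only on generate().

```php
<?php
// Stand-in for the proposed Random\NumberGenerator interface.
// Assumption: the real interface would come from the extension.
interface NumberGeneratorSketch
{
    public function generate(): int;
}

// Same idea as the fixed generator above: returns 1, 2, 3, ...
final class FixedForUnitTest implements NumberGeneratorSketch
{
    private int $count = 0;

    public function generate(): int
    {
        return ++$this->count;
    }
}

// Hypothetical consumer: it only knows about the interface.
function roll_die(NumberGeneratorSketch $rng): int
{
    // Maps a non-negative generated value onto 1..6.
    // Modulo bias and negative values are ignored for brevity.
    return ($rng->generate() % 6) + 1;
}

$rng = new FixedForUnitTest();
$rolls = [roll_die($rng), roll_die($rng), roll_die($rng)];
```

With the fixed generator, the three rolls are 2, 3 and 4 on every run, which is the property a unit test needs. Production code could pass a different implementation without changing roll_die().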
  
-The Random\NumberGenerator abstract class has single abstract method called generate()+However, the width of the random numbers that a userland implementation can generate depends on the size of int in PHP on that platform: up to 32 bits in a 32-bit environment, and up to 64 bits in a 64-bit environment. Because every generated value is treated as being that wide, a 32-bit RNG cannot be reproduced in userland in a 64-bit environment.
  
 <code php> <code php>
-namespace Random;+// on 64-bit machine
  
-abstract class NumberGenerator+final class UserMersenneTwister extends \Random\NumberGenerator\MersenneTwister
 { {
-        abstract public function generate(): int {}+    // uses 64-bit internally, if generated: 1 (zend_long), bits: 0000000000000000000000000000000000000000000000000000000000000001 (64-bit) 
 +    // normally MersenneTwister bits: 00000000000000000000000000000001 (32-bit) 
 +    public function generate(): int 
 +    { 
 +        return parent::generate() - 1; 
 +     }
 } }
 </code> </code>
  
-By defining a class that extends Random\NumberGenerator, the user can use their own random number generator. With the introduction of JIT in PHP 8.0, this can generate random numbers at realistic speed.+I don't think this is a problem, as most userland generators are likely to return a fixed value or reproduce a specific scenario.
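To make the width limitation concrete, here is a minimal sketch, assuming a 64-bit build of PHP. The helper low32() is hypothetical and not part of the RFC; it only shows that a value returned from generate() occupies a full PHP int, so narrowing it to 32 bits has to be done explicitly by whoever consumes it.

```php
<?php
// Width of a PHP int on this platform, in bits (32 or 64).
$intBits = PHP_INT_SIZE * 8;

// Hypothetical helper: keep only the low 32 bits of a value.
// Assumes a 64-bit build, where the mask 0xFFFFFFFF fits in an int.
function low32(int $value): int
{
    return $value & 0xFFFFFFFF;
}

if ($intBits === 64) {
    // 0x100000001 needs 33 bits; narrowing to 32 bits leaves 1.
    $narrowed = low32(0x100000001);
    echo "int width: {$intBits} bits, narrowed value: {$narrowed}", PHP_EOL;
}
```

Nothing in the int return type records which of the two widths the generator intended, which is the ambiguity described above.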
  
-<code php> +Random\Randomizer class will be added to manipulate data using these RNGs. 
-class UserDefinedRNG extends Random\NumberGenerator +
-+
-    protected int $current = 0;+
  
-    public function generate(): int +This class provides the following methods:
-    { +
-        return ++$this->current; +
-    } +
-}+
  
-function foobar(Random\NumberGenerator $numberGenerator): void { +  * __construct(\Random\NumberGenerator $generator = null) [defaults to XorShift128Plus if null] 
-    for ($i = 0; $i < 9; $i++) { +  * getInt(int $min, int $max): int [replacement for mt_rand() / rand()] 
-        echo $numberGenerator->generate(); +  * getBytes(int $length): string [replacement for random_bytes()] 
-    } +  * shuffleArray(array $array): array [replacement for shuffle()] 
-}+  * shuffleString(string $string): string [replacement for str_shuffle()]
  
-foobar(new UserDefinedRNG()); // Results: 123456789 +A method equivalent to array_rand() is not provided at this time, because its implementation is complex and an equivalent can easily be written in userland if necessary.
-</code>+
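A userland equivalent of array_rand() for a single key might look like the sketch below. The function name pick_random_key() is hypothetical, and the integer source is passed in as a plain callable so that the sketch runs without the proposed extension; with the proposed API, that callable would presumably wrap Randomizer::getInt().

```php
<?php
/**
 * Hypothetical userland stand-in for array_rand() (single key only).
 *
 * $getInt is any callable(int $min, int $max): int that returns a value
 * in the inclusive range [$min, $max].
 */
function pick_random_key(array $array, callable $getInt): int|string
{
    if ($array === []) {
        throw new ValueError('Array must not be empty');
    }

    // Pick a position in the key list, then return the key at that position.
    $keys = array_keys($array);

    return $keys[$getInt(0, count($keys) - 1)];
}

// Deterministic stand-in for a random source: always picks the last position.
$alwaysMax = fn (int $min, int $max): int => $max;

$key = pick_random_key(['a' => 1, 'b' => 2, 'c' => 3], $alwaysMax);
```

With the deterministic source, $key is 'c'. Selecting several keys at once, as array_rand() does with its second argument, is left out of this sketch.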
  
-It is also useful when you want to use a random number sequence with a fixed result, such as in testing.+The stubs of functionality provided by this extension are as follows:
  
-The Random class creates and uses an instance of the default random number generator, Random\NumberGenerator\XorShift128Plus, if the constructor argument is omitted.+https://github.com/colopl/php-src/blob/upstream_rfc/scoped_rng_for_pr/ext/random/random.stub.php
  
-XorShift128Plus is an efficient, high-quality algorithm used in modern browsers and other applications. This algorithm is capable of generating a wider range of random numbers in a 64-bit environment. In a 32-bit environment, the range beyond zend_long will simply be truncated. This indicates incompatibility between environments, but is acceptable for real-world use.+Examples of these uses are as follows:
  
-The Random\NumberGenerator\MT19937 class, which implements the MT19937 Mersenne twister, is also provided for backward compatibility or when a higher period is required. However, a 1-bit right shift is required to obtain exactly the same result as mt_rand(), as shown below. This is due to historical reasons.+<code php> 
 +// Use different RNGs for different environments. 
 +$rng = $is_production 
 +    ? new Random\NumberGenerator\Secure() 
 +    : new Random\NumberGenerator\XorShift128Plus(1234); 
 + 
 +$randomizer = new Random\Randomizer($rng); 
 +$randomizer->shuffleString('foobar');
 +</code>
  
 <code php> <code php>
-$seed = 1234;+// Safely migrate the existing mt_rand() state.
  
-$mt = new Random\NumberGenerator\MT19937($seed); +// before 
-mt_srand($seed); +mt_srand(1234, MT_RAND_PHP); 
-var_dump(mt_rand() === ($mt->generate() >> 1)); // true+foobar(); 
 +$result = str_shuffle('foobar'); 
 + 
 +// after 
 +$randomizer = new Random\Randomizer(new Random\NumberGenerator\MersenneTwister(1234, MT_RAND_PHP)); 
 +foobar(); 
 +$result = $randomizer->shuffleString('foobar');
 </code> </code>
  
-The following NumberGenerator class supports serialization. Secure is not serializable because it uses random_bytes internally and has no state.+As a side effect of this RFC, the following PHP functions have been moved to the new ext/random extension
  
-  * Random\NumberGenerator\XorShift128Plus +This is because ext/standard/random.c already reserves the name RANDOM, so that name cannot be used by a new extension. 
-  * Random\NumberGenerator\MT19937 +In addition, all RNG-related implementations will be moved to the new random extension in order to standardize the RNG implementation.
-  * Random\NumberGenerator extends user-defined classes.+
  
-Also, a new internal API will be implemented.+  * lcg_value() 
 +  * srand() 
 +  * rand() 
 +  * mt_srand() 
 +  * mt_rand() 
 +  * random_int() 
 +  * random_bytes()
  
-  * php_random_ng_next() +The following internal APIs will also be moved to the ext/random extension:
-  * php_random_ng_range() +
-  * php_random_ng_array_data_shuffle() +
-  * php_random_ng_string_shuffle()+
  
-A Stub showing these implementations can be found on the Pull-Request. It's probably easier to understand if you look at it.+  * php_random_int_throw() 
 +  * php_random_int_silent() 
 +  * php_combined_lcg() 
 +  * php_mt_srand() 
 +  * php_mt_rand() 
 +  * php_mt_rand_range() 
 +  * php_mt_rand_common() 
 +  * php_srand() 
 +  * php_rand() 
 +  * php_random_bytes() 
 +  * php_random_int()
  
-  * [[https://github.com/php/php-src/blob/7a4ef6ccfbf4a2cd48a4f261f2911ebb7b057d46/ext/random/random.stub.php|random.stub.php]]+All of these features are available from an extension by simply including a single header, ext/random/php_random.h.
 + 
 +The following header files are left in for extension compatibility. The contents all include ext/random/php_random.h. 
 + 
 +  * ext/standard/php_lcg.h 
 +  * ext/standard/php_rand.h 
 +  * ext/standard/php_mt_rand.h
 +  * ext/standard/php_random.h
  
 ===== Future Scope ===== ===== Future Scope =====
  
-This proposal is just a first step to improve the situation of PHP's random number implementation.+These are not within the scope of this RFC, but are worth considering in the future:
  
-If this proposal is approved, I will then propose the following changes +  * Remove old header files for compatibility (php_lcg.h, php_rand.h, php_mt_rand.h, php_random.h)
- +  * Deprecate lcg_value(), mt_srand(), srand()
-  * Replace the state of the existing implementation with php_random_ng. +
-  * Changes random source to php_random_int() at shuffle(), str_shuffle(), and array_rand(). +
-  * Deprecate srand() and mt_srand() (step by step)+
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
  
-The code that includes the following header file needs to be changed to ext/random/random.h +The following names have been reserved and will no longer be available
- +
-  * ext/standard/lcg.h +
-  * ext/standard/rand.h +
-  * ext/standard/mt_rand.h +
-  * ext/standard/random.h +
- +
-The following class names have been reserved and will no longer be available+
  
   * "Random"   * "Random"
   * "Random\NumberGenerator"   * "Random\NumberGenerator"
   * "Random\NumberGenerator\XorShift128Plus"   * "Random\NumberGenerator\XorShift128Plus"
-  * "Random\NumberGenerator\MT19937"+  * "Random\NumberGenerator\MersenneTwister" 
 +  * "Random\NumberGenerator\CombinedLCG"
   * "Random\NumberGenerator\Secure"   * "Random\NumberGenerator\Secure"
  
Line 236: Line 201:
  
 ==== To Existing Extensions ==== ==== To Existing Extensions ====
-none+In the future, it may be necessary to change the included header files to point to ext/random/php_random.h. However, compatibility will be maintained for now.
  
 ==== To Opcache ==== ==== To Opcache ====
Line 248: Line 213:
  
 ===== Open Issues ===== ===== Open Issues =====
-none+ 
 +=== It is not possible to reproduce 32-bit Mersenne Twister in userland in a 64-bit environment === 
 + 
 +https://github.com/php/php-src/pull/8094#pullrequestreview-881660425 
 + 
 +Currently, the width of the value returned by NumberGenerator::generate() is implicitly assumed to be that of zend_long (the size of an int in PHP). 
 +This means that, in a 64-bit environment, it is not possible to implement an RNG with a generation width other than 64 bits. 
 + 
 +However, this RFC assumes that userland RNG implementations are often only used to reproduce certain scenarios in tests, and I personally think that this is sufficient. 
 + 
 +=== Should generate() really return a number? === 
 + 
 +Tim says NumberGenerator::generate() should return a string instead of an int. 
 + 
 +https://externals.io/message/117026#117032 
 + 
 +While it is true that returning a string allows for more flexibility in the range of RNG generation, it poses a convenience problem. In particular, it makes it difficult to implement a userland RNG to reproduce a particular scenario. 
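For comparison, a string-returning generator could look like the following sketch. The interface name ByteGenerator and the class FixedBytesForUnitTest are hypothetical and appear in neither the RFC nor the mailing list thread. The sketch only illustrates the trade-off discussed above: the byte length carries the generation width, but a fixed test sequence now needs pack() and unpack().

```php
<?php
// Hypothetical interface for illustration only; not part of the RFC.
interface ByteGenerator
{
    public function generate(): string;
}

final class FixedBytesForUnitTest implements ByteGenerator
{
    private int $count = 0;

    public function generate(): string
    {
        // Encode the counter as 4 little-endian bytes, i.e. a 32-bit value.
        return pack('V', ++$this->count);
    }
}

$rng = new FixedBytesForUnitTest();
$bytes = $rng->generate();

// The generation width is visible from the string length.
$width = strlen($bytes) * 8;

// Decoding back to an int is an extra step compared with returning int.
$value = unpack('V', $bytes)[1];
```

Here $width is 32 and $value is 1, even on a 64-bit build. The int-returning FixedForUnitTest shown earlier needs none of the encoding steps, which is the convenience argument made above.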
 + 
 +=== Is adopting XorShift128Plus a good choice? === 
 + 
 +As mentioned in the Internals ML, there are a few known issues with XorShift128+. 
 + 
 +https://prng.di.unimi.it/ 
 + 
 +https://externals.io/message/117026#117030 
 + 
 +A better candidate RNG may need to be considered.
  
 ===== Vote ===== ===== Vote =====
-Voting opens 2021-MM-DD and 2021-MM-DD at 00:00:00 EDT. 2/3 required to accept.+Voting opens 2022-MM-DD and 2021-MM-DD at 00:00:00 EDT. 2/3 required to accept.
  
-<doodle title="Add Random class" auth="zeriyoshi" voteType="single" closed="true"> +<doodle title="Add Random extension" auth="zeriyoshi" voteType="single" closed="true"> 
    * Yes    * Yes
    * No    * No
Line 259: Line 250:
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
-  * https://github.com/php/php-src/pull/7453+  * https://github.com/php/php-src/pull/8094
rfc/rng_extension.txt · Last modified: 2022/08/01 16:52 by timwolla