rfc:rng_extension
Differences
This shows you the differences between two versions of the page.
rfc:rng_extension [2021/05/25 15:40] zeriyoshi str_shuffle has drop-in replacement API |
rfc:rng_extension [2022/02/14 10:42] zeriyoshi update 4.0 |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== PHP RFC: Add Random | + | ====== PHP RFC: Random |
- | * Version: | + | * Version: |
- | * Date: 2021-05-18 | + | * Date: 2022-02-14 |
- | * Author: Go Kudo < | + | * Author: Go Kudo < |
- | * Status: | + | * Status: |
+ | * Implementation: | ||
* First Published at: http:// | * First Published at: http:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | PHP is currently having problems with RNG reproducibility. | ||
- | PHP's RNG has been unified into an implementation using the Mersenne twister, with the rand() and srand() functions becoming aliases for mt_rand() and mt_srand() respectively in PHP 7.1. | + | PHP implements several useful RNGs. However, they are currently only available in the global scope. |
- | But, these functions still store the state in the global state of PHP and are not easily reproducible. Look at the following example. | + | Mersenne Twister, PHP's default RNG, provides the function mt_srand() to initialize it with a user-specified seed value, but its state is global, which may cause behavior the user did not intend.
- | <code php> | + | When a user calls mt_srand(), one would expect it to affect only the result of mt_rand(). However, the following functions draw from the same global state and therefore implicitly affect the result of mt_rand():
- | echo foo(1234, function | + | |
- | echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290 | + | |
- | function foo(int $seed, callable $bar): int { | + | * shuffle() |
- | | + | * str_shuffle() |
- | | + | * array_rand() |
- | $bar(); | + | |
- | $result += mt_rand(); | + | |
- | return $result; | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | As mentioned above, the reproducibility of random numbers can easily be lost if additional processing is added later. | + | |
- | + | ||
- | In addition, the fiber extension was introduced in PHP 8.1. This makes it more difficult to keep track of the execution order. However, this problem has existed since the inception of Generator. | + | |
- | There is also the problem | + | For example, in the following code, the result |
<code php> | <code php> | ||
mt_srand(1234); | mt_srand(1234); | ||
- | echo mt_rand() | + | $next = mt_rand(); |
mt_srand(1234); | mt_srand(1234); | ||
- | str_shuffle(' | + | $arr = range(0, 9); |
- | echo mt_rand() | + | shuffle($arr); |
+ | $next2 = mt_rand(); | ||
+ | |||
+ | die(" | ||
</ | </ | ||
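The interference described above can be checked for all three listed functions at once. The following is a minimal sketch; the helper name `next_after()` is hypothetical and exists only for this illustration. It seeds the global Mersenne Twister, runs an arbitrary step, and returns the next mt_rand() value, so any step that consumes global state changes the result.

```php
<?php
// Hypothetical helper: returns the first mt_rand() value produced
// after seeding the global generator and running $step.
function next_after(int $seed, callable $step): int
{
    mt_srand($seed);
    $step();
    return mt_rand();
}

// Baseline: nothing runs between seeding and mt_rand().
$baseline = next_after(1234, function (): void {});

// Each step below calls a function that draws from the same global state.
$afterShuffle = next_after(1234, function (): void {
    $arr = range(0, 9);
    shuffle($arr);
});
$afterStrShuffle = next_after(1234, function (): void {
    $unused = str_shuffle('abcdefghij');
});
$afterArrayRand = next_after(1234, function (): void {
    $unused = array_rand(range(0, 9));
});

var_dump($baseline === next_after(1234, function (): void {})); // same seed, no interference
var_dump($baseline === $afterShuffle);
var_dump($baseline === $afterStrShuffle);
var_dump($baseline === $afterArrayRand);
```

Only the first comparison is expected to hold; the other three steps advance the shared state before mt_rand() is called.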
+ | |||
+ | These behaviors were unintuitive and often led to unintended execution results, but were not that problematic for general web application use. | ||
+ | |||
+ | However, in applications that are more complex and must produce reproducible results (such as games), this can be a problem.
+ | |||
+ | There is also the issue of state management difficulties with Fiber, which was added in PHP 8.1. Nikita had this to say: | ||
+ | |||
+ | https:// | ||
+ | |||
+ | In addition, the Mersenne Twister can only generate 32-bit values.
+ | In recent years, many of the environments where PHP runs have been migrating to 64-bit platforms. | ||
+ | In order to generate more secure values, an RNG that can generate 64-bit wide values should be provided by the language. | ||
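The width gap mentioned above can be observed from userland. This is a small sketch using only existing functions and constants: it compares the largest value mt_rand() can return with the native integer range of the running PHP build.

```php
<?php
// Largest value mt_rand() returns when called without arguments.
$mtMax = mt_getrandmax();

printf("mt_getrandmax(): %d\n", $mtMax);
printf("PHP_INT_MAX:     %d\n", PHP_INT_MAX);
printf("native int size: %d bits\n", PHP_INT_SIZE * 8);
```

On a 64-bit build, the native integer range is far wider than what a single mt_rand() call can cover.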
===== Proposal ===== | ===== Proposal ===== | ||
- | Implements Random class. | ||
- | This class implement to ext/ | + | Implement the XorShift128Plus algorithm, which generates 64-bit wide random numbers, along with a random extension that includes object-scoped RNGs, and bundle it with PHP.
+ | XorShift128Plus is a fast, high-quality RNG that is proven in major web browsers. | ||
+ | Many of the major hardware architectures are now 64-bit, so it makes sense to use this RNG. | ||
- | The PHP code that represents | + | In addition to the new algorithm, the following classes will be added to fix the global scope issue. |
- | <code php> | + | * class Random\NumberGenerator\XorShift128Plus |
- | const RANDOM_XORSHIFT128PLUS = ' | + | * class Random\NumberGenerator\MersenneTwister |
- | const RANDOM_MT19937 = ' | + | * class Random\NumberGenerator\CombinedLCG |
- | const RANDOM_SECURE = ' | + | * class Random\NumberGenerator\Secure |
- | const RANDOM_USER = ' | + | |
- | class Random | + | These classes will hold independent RNG state and will not affect the global scope. |
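What "independent RNG state" means in practice can be sketched in userland. The class below is hypothetical and is not one of the proposed classes: it uses a small 31-bit linear congruential generator chosen only for brevity, and it assumes 64-bit PHP integers so that the multiplication does not overflow.

```php
<?php
// Hypothetical object-scoped generator (not part of the RFC).
// The LCG constants are a common textbook choice, used here only for brevity.
final class SketchLcg
{
    /** @var int Current state, kept inside the object. */
    private $state;

    public function __construct(int $seed)
    {
        $this->state = $seed & 0x7FFFFFFF;
    }

    public function generate(): int
    {
        // Assumes 64-bit integers: the product stays below 2^62.
        $this->state = ($this->state * 1103515245 + 12345) & 0x7FFFFFFF;
        return $this->state;
    }
}

$a = new SketchLcg(1234);
$b = new SketchLcg(1234);

$fromA = [$a->generate(), $a->generate()];

// Functions that use the global state run in between...
mt_srand(42);
$arr = range(0, 9);
shuffle($arr);

// ...but $b still replays the same sequence as $a.
$fromB = [$b->generate(), $b->generate()];
var_dump($fromA === $fromB); // bool(true)
```

Because the state lives in the object, neither the global seed nor calls to shuffle() influence it, and the object does not influence mt_rand() either.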
- | { | + | |
- | public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, | + | |
- | // For user. | + | An interface Random\NumberGenerator is also added and is implemented by the classes above.
- | | + | This interface has a single generate() method, which makes it possible to switch between RNG implementations depending on the situation
- | | + | and allows alternative implementations to be written in userland PHP. This is useful, for example, for running tests.
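The testing use case can be sketched as follows. All names here (`SketchNumberGenerator`, `SketchFixedSequence`, `SketchSecure`, `sketch_roll_die()`) are hypothetical stand-ins written for this illustration; only the idea of a single generate() method comes from the text above.

```php
<?php
// Hypothetical stand-in for the proposed interface.
interface SketchNumberGenerator
{
    public function generate(): int;
}

// Deterministic generator for tests: replays a fixed list of values.
final class SketchFixedSequence implements SketchNumberGenerator
{
    /** @var int[] */
    private $values;
    /** @var int */
    private $position = 0;

    public function __construct(int ...$values)
    {
        if ($values === []) {
            throw new InvalidArgumentException('At least one value is required.');
        }
        $this->values = $values;
    }

    public function generate(): int
    {
        $value = $this->values[$this->position];
        // Wrap around so the sequence repeats.
        $this->position = ($this->position + 1) % count($this->values);
        return $value;
    }
}

// Generator backed by PHP's CSPRNG, e.g. for production use.
final class SketchSecure implements SketchNumberGenerator
{
    public function generate(): int
    {
        return random_int(0, PHP_INT_MAX);
    }
}

// The code under test depends only on the interface.
function sketch_roll_die(SketchNumberGenerator $generator): int
{
    // Modulo keeps the sketch short; it introduces a small bias.
    return $generator->generate() % 6 + 1;
}

$fixed = new SketchFixedSequence(0, 7, 5);
echo sketch_roll_die($fixed), PHP_EOL; // 0 % 6 + 1 = 1
echo sketch_roll_die($fixed), PHP_EOL; // 7 % 6 + 1 = 2
echo sketch_roll_die($fixed), PHP_EOL; // 5 % 6 + 1 = 6
```

Swapping `SketchFixedSequence` for `SketchSecure` changes the source of randomness without touching `sketch_roll_die()`.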
- | public function getBytes(int $length): string; | + | |
- | public function shuffleArray(array $array): array; | + | |
- | public function shuffleString(string $string): string; | + | |
- | // For serialize / unserialize. (but, NOT always available.) | + | RNGs other than XorShift128Plus are based on the RNGs currently implemented in PHP. |
- | public function __serialize(): | + | |
- | public function __unserialize(array $data): void; | + | |
- | // MUST override in RANDOM_USER. | + | The Random\Randomizer class will be added to manipulate data using these RNGs. |
- | protected function next(): int; | + | |
- | } | + | |
- | </ | + | |
- | This single | + | This class provides |
- | This class switches the PRNG implementation | + | * __construct(\Random\NumberGenerator $generator = null) [defaults
+ | * getInt(int | ||
+ | * getBytes(int $length): string [replacement for random_bytes()] | ||
+ | * shuffleArray(array $array): array [replacement for shuffle()] | ||
+ | * shuffleString(string $string): string [replacement for str_shuffle()] | ||
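The by-value shape of shuffleArray() in the list above (array in, array out, unlike the pass-by-reference shuffle()) can be sketched in userland. The helper `sketch_shuffle_array()` is hypothetical and is not the extension's implementation; it assumes the supplied callable returns a non-negative integer on every call.

```php
<?php
// Hypothetical helper: Fisher-Yates shuffle that returns a new array
// and leaves the input untouched.
// $next must return a non-negative int on each call.
function sketch_shuffle_array(array $array, callable $next): array
{
    $values = array_values($array); // work on a re-indexed copy
    for ($i = count($values) - 1; $i > 0; $i--) {
        // Modulo keeps the sketch short; it is slightly biased.
        $j = $next() % ($i + 1);
        [$values[$i], $values[$j]] = [$values[$j], $values[$i]];
    }
    return $values;
}

$input = ['a', 'b', 'c', 'd'];
$shuffled = sketch_shuffle_array($input, function (): int {
    return random_int(0, PHP_INT_MAX);
});

var_dump($input === ['a', 'b', 'c', 'd']); // bool(true): the input is unchanged
var_dump(count($shuffled) === count($input)); // bool(true)
```

Passing a deterministic callable instead of random_int() makes the result reproducible, which is the same property the object-scoped generators are meant to provide.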
- | Also, the static method getNonBiasedMax() allows the user to get the non-biased RNG range. | + | Method equivalent to array_rand() was not implemented at this time because |
- | This allows us to rewrite the first example | + | Examples of these uses are as follows: |
<code php> | <code php> | ||
- | // example 1 | + | // Use different RNGs for different environments. |
- | echo foo(1234, function | + | $rng = $is_production |
- | echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1480009472 | + | ? new Random\NumberGenerator\Secure() |
+ | | ||
- | function foo(int $seed, callable $bar): int { | + | $randomizer |
- | | + | $randomizer->shuffleString(' |
- | $max = Random:: | + | </ |
- | $result = $random->getInt(0, $max); | + | |
- | | + | <code php> |
- | | + | // Safely migrate the existing mt_rand() state. |
- | return $result; | + | |
- | } | + | |
- | // example 2 | + | // before |
- | $random = new Random(RANDOM_MT19937, 1234); | + | mt_srand(1234, MT_RAND_PHP); |
- | $max = Random:: | + | foobar(); |
- | echo $random-> | + | $result = str_shuffle(' |
- | $random | + | // after |
- | $max = Random:: | + | $randomizer |
- | str_shuffle(' | + | foobar(); |
- | echo $random->getInt(0, $max) . PHP_EOL; // Result: 411284887 | + | $result = $randomizer->shuffleString(' |
</ | </ | ||
- | Similarly, several C APIs have been added to the PHP core. This can be used to add non-standard PRNGs. | + | As a side effect of this RFC, the following PHP functions |
- | <code c> | + | This is because ext/standard/random.c reserves the name RANDOM, so the extension cannot use it.
- | // Note: The detailed implementation | + | In addition, all RNG-related implementations will be moved to the new random extension in order to standardize the RNG implementation. |
- | typedef struct _php_random_class_algo { | + | |
- | int64_t max; | + | |
- | int64_t (*next)(void *state); | + | |
- | void (*seed)(void *state, const zend_long *seed); | + | |
- | int (*serialize)(void *state, zval *data); // allows NULL. | + | |
- | int (*unserialize)(void *state, zval *data); // allows NULL. | + | |
- | void *state; | + | |
- | } php_random_class_algo; | + | |
- | int php_random_class_algo_register(const char *ident, const php_random_class_algo | + | * lcg_value() |
- | void php_random_class_algo_unregister(const char *ident); | + | |
- | </ | + | |
+ | * mt_srand() | ||
+ | | ||
+ | * random_int() | ||
+ | * random_bytes() | ||
- | In php_random_class_algo, | + | The following internal APIs will also be moved to the ext/random extension: |
- | Also, for RNGs that do not (or cannot) use seed values, the function pointer for seed is optional. If this is passed to a null RNG, an exception will be thrown if a seed value is passed. | + | * php_random_int_throw() |
+ | * php_random_int_silent() | ||
+ | * php_combined_lcg() | ||
+ | * php_mt_srand() | ||
+ | * php_mt_rand() | ||
+ | * php_mt_rand_range() | ||
+ | * php_mt_rand_common() | ||
+ | * php_srand() | ||
+ | * php_rand() | ||
+ | * php_random_bytes() | ||
+ | * php_random_int() | ||
- | This class also supports the RNG implementation | + | All of these features are available from the extension by simply including a single ext/ |
- | <code php> | + | The following header files are left in for extension compatibility. The contents all include ext/ |
- | class FixedNumberForTest extends Random | + | |
- | { | + | |
- | protected int $current = 0; | + | |
- | public function __construct() | + | * ext/ |
- | { | + | * ext/ |
- | | + | * ext/ |
- | } | + | * ext/ |
- | protected function next(): int | + | ===== Future Scope ===== |
- | { | + | |
- | | + | These are not within the scope of this RFC, but are worth considering in the future: |
- | } | + | |
- | } | + | * Remove old header files for compatibility (php_lcg.h, php_rand.h, php_mt_rand.h, |
- | </ | + | * Deprecate lcg_value(), |
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | The class name Random is reserved and will not be available | + | |
+ | The following names have been reserved and will no longer | ||
+ | |||
+ | * " | ||
+ | * " | ||
+ | * " | ||
+ | * " | ||
+ | * " | ||
+ | * " | ||
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
- | 8.1 | + | 8.2 |
===== RFC Impact ===== | ===== RFC Impact ===== | ||
Line 158: | Line 166: | ||
==== To Existing Extensions ==== | ==== To Existing Extensions ==== | ||
- | none | + | In the future, it may be necessary to change the included header files to point to ext/ |
==== To Opcache ==== | ==== To Opcache ==== | ||
Line 164: | Line 172: | ||
==== New Constants ==== | ==== New Constants ==== | ||
- | * RANDOM_XORSHIFT128PLUS | + | none |
- | * RANDOM_MT19937 | + | |
- | * RANDOM_SECURE | + | |
- | * RANDOM_USER | + | |
==== php.ini Defaults ==== | ==== php.ini Defaults ==== | ||
Line 173: | Line 178: | ||
===== Open Issues ===== | ===== Open Issues ===== | ||
- | + | none | |
- | === When $seed is null, what is used for the seed value? === | + | |
- | Depends on the implementation of algo, but basically it is using internal php_random_int(). | + | |
- | It is similar to mt_srand() from PHP 8.1. | + | |
- | + | ||
- | - https:// | + | |
- | + | ||
- | === Why cancelled RNG Extension? === | + | |
- | As a result of discussions during the draft, the functions became a single class and no longer need to be separated. | + | |
- | The functionality for random numbers is now included in ext/ | + | |
- | + | ||
- | === Why not take an object oriented approach? === | + | |
- | This is because it is overly complex and difficult to use, See my previous proposal and the discussion in the internals ML for more details. | + | |
- | + | ||
- | * https:// | + | |
- | * https:// | + | |
- | * https:// | + | |
- | + | ||
- | === Why XorShift128+ as the default algorithm? === | + | |
- | This algorithm is capable of generating 64-bit random numbers, is used by major browsers, and is well validated. On the other hand, MT19937, currently used by PHP, can only generate 32-bit random numbers. | + | |
- | + | ||
- | === Why keep the MT19937 implementation? | + | |
- | This is for compatibility. It facilitates quick and easy migration. | + | |
- | + | ||
- | === What algorithm does RANDOM_SECURE use exactly? === | + | |
- | It uses php_random_bytes() internally. This API is guaranteed to be a CSPRNG under any circumstances. | + | |
- | + | ||
- | === Why support CSPRNG? Isn't random_int() good enough? === | + | |
- | The goal is to be able to migrate all RNG provided functions to this class in the future. | + | |
- | In other words, to be able to write code without using any of the following functions: | + | |
- | + | ||
- | * srand() | + | |
- | * rand() | + | |
- | * mt_srand() | + | |
- | * mt_rand() | + | |
- | * shuffle() | + | |
- | * str_shuffle() | + | |
- | * array_rand() | + | |
- | * random_int() | + | |
- | * random_bytes() | + | |
- | + | ||
- | In order to use these functions properly, you need to understand PHP core. For many users, this can be difficult. | + | |
- | + | ||
- | === Why isn't there a drop-in replacement API? === | + | |
- | There is no API that can simply replace the following functions: | + | |
- | + | ||
- | * shuffle() | + | |
- | * array_rand() | + | |
- | + | ||
- | The approach of these functions is not compatible with recent implementations. | + | |
- | shuffle() uses pass-by-reference, | + | |
- | + | ||
- | === Why stop deprecation for some functions? === | + | |
- | The following functions have been removed from deprecation: | + | |
- | + | ||
- | * srand() | + | |
- | * rand() | + | |
- | * mt_srand() | + | |
- | * mt_rand() | + | |
- | + | ||
- | This is because it is still too early and inappropriate include it in one RFC. | + | |
- | + | ||
- | === What will be the concrete C implementation? | + | |
- | Please wait. If the discussion in ML is good, I' ll start the implementation. | + | |
===== Vote ===== | ===== Vote ===== | ||
- | Voting opens 2021-MM-DD and 2021-MM-DD at 00:00:00 EDT. 2/3 required to accept. | + | Voting opens 2022-MM-DD and closes 2022-MM-DD at 00:00:00 EDT. 2/3 required to accept.
- | <doodle title=" | + | <doodle title=" |
* Yes | * Yes | ||
* No | * No | ||
Line 247: | Line 189: | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
- | TBD | + | * https:// |
rfc/rng_extension.txt · Last modified: 2022/06/14 00:00 by zeriyoshi