rfc:rng_extension

This is an old revision of the document!


PHP RFC: Add Random class

Introduction

PHP is currently having problems with RNG reproducibility.

PHP's RNG has been unified into an implementation using the Mersenne twister, with the rand() and srand() functions becoming aliases for mt_rand() and mt_srand() respectively in PHP 7.1.

But, these functions still store the state in the global state of PHP and are not easily reproducible. Look at the following example.

echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290
 
function foo(int $seed, callable $bar): int {
    mt_srand($seed);
    $result = mt_rand();
    $bar();
    $result += mt_rand();
    return $result;
}

As mentioned above, the reproducibility of random numbers can easily be lost if additional processing is added later.

In addition, the fiber extension was introduced in PHP 8.1. This makes it more difficult to keep track of the execution order. However, this problem has existed since the inception of Generator.

There is also the problem of functions that implicitly use the state stored in PHP's global state. shuffle(), str_shuffle(), and array_rand() functions implicitly advance the state of a random number. This means that the following code is not reproducible, but it is difficult for the user to notice this.

mt_srand(1234);
echo mt_rand() . PHP_EOL; // Result: 411284887
 
mt_srand(1234);
str_shuffle('foobar');
echo mt_rand() . PHP_EOL; // Result: 1314500282

Proposal

Implements Random class.

This class implement to ext/standard, always bundled PHP core.

The PHP code that represents the implementation is as follows:

const RANDOM_XORSHIFT128PLUS = 'xorshift128plus'; // fast, lightweight, default
const RANDOM_MT19937 = 'mt19937'; // for compatibility
const RANDOM_SECURE = 'secure'; // required Cryptographically-Secure PRNG
const RANDOM_USER = 'user'; // userland implementation
 
class Random
{
    public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, ?int $seed = null);
 
    // For user.
    public static function getNonBiasedMax(string $algo): int;
    public function getInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int;
    public function getBytes(int $length): string;
    public function shuffleArray(array $array): array;
    public function shuffleString(string $string): string;
 
    // For serialize / unserialize. (but, NOT always available.)
    public function __serialize(): array;
    public function __unserialize(array $data): void;
 
    // MUST override in RANDOM_USER.
    protected function next(): int;
}

This single class is used to provide the processing using PRNG.

This class switches the PRNG implementation to be used by the constructor argument $algo. It is just like the password_hash() function.

Also, the static method getNonBiasedMax() allows the user to get the non-biased RNG range.

This allows us to rewrite the first example as follows:

// example 1
echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1480009472
 
function foo(int $seed, callable $bar): int {
    $random = new Random(RANDOM_MT19937, $seed);
    $max = Random::getNonBiasedMax(RANDOM_MT19937);
    $result = $random->getInt(0, $max);
    $bar();
    $result += $random->getInt(0, $max);
    return $result;
}
 
// example 2
$random = new Random(RANDOM_MT19937, 1234);
$max = Random::getNonBiasedMax(RANDOM_MT19937);
echo $random->getInt(0, $max) . PHP_EOL; // Result: 411284887
 
$random = new Random(RANDOM_MT19937, 1234);
$max = Random::getNonBiasedMax(RANDOM_MT19937);
str_shuffle('foobar');
echo $random->getInt(0, $max) . PHP_EOL; // Result: 411284887

Similarly, several C APIs have been added to the PHP core. This can be used to add non-standard PRNGs.

// Note: The detailed implementation is tentative.
typedef struct _php_random_class_algo {
    int64_t max;
    int64_t (*next)(void *state);
    void (*seed)(void *state, const zend_long *seed); // allows NULL.
    int (*serialize)(void *state, zval *data); // allows NULL.
    int (*unserialize)(void *state, zval *data); // allows NULL.
    void *state;
} php_random_class_algo;
 
int php_random_class_algo_register(const char *ident, const php_random_class_algo *algo);
void php_random_class_algo_unregister(const char *ident);

In php_random_class_algo, the implementation of serialize / unserialize is optional. If the RNG you implement does not support it, you can specify NULL as a member of the structure, In that case, an exception will be thrown during serialization.

Also, for RNGs that do not (or cannot) use seed values, the function pointer for seed is optional. If this is passed to a null RNG, an exception will be thrown if a seed value is passed.

This class also supports the RNG implementation of userland. This is useful for tests that want to fix the expected calculation cost.

class FixedNumberForTest extends Random
{
    protected int $current = 0;
 
    public function __construct()
    {
        parent::__construct(RANDOM_USER, null);
    }
 
    protected function next(): int
    {
        return ++$this->current;
    }
}

Backward Incompatible Changes

The class name Random is reserved and will not be available in userland.

Proposed PHP Version(s)

8.1

RFC Impact

To SAPIs

none

To Existing Extensions

none

To Opcache

none

New Constants

  • RANDOM_XORSHIFT128PLUS
  • RANDOM_MT19937
  • RANDOM_SECURE
  • RANDOM_USER

php.ini Defaults

none

Open Issues

When $seed is null, what is used for the seed value?

Depends on the implementation of algo, but basically it is using internal php_random_int(). It is similar to mt_srand() from PHP 8.1.

- https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13

Why cancelled RNG Extension?

As a result of discussions during the draft, the functions became a single class and no longer need to be separated. The functionality for random numbers is now included in ext/standard and will conform to this. (e.g. rand.c random.c)

Why not take an object oriented approach?

This is because it is overly complex and difficult to use, See my previous proposal and the discussion in the internals ML for more details.

Why XorShift128+ as the default algorithm?

This algorithm is capable of generating 64-bit random numbers, is used by major browsers, and is well validated. On the other hand, MT19937, currently used by PHP, can only generate 32-bit random numbers.

Why keep the MT19937 implementation? 

This is for compatibility. It facilitates quick and easy migration.

What algorithm does RANDOM_SECURE use exactly?

It uses php_random_bytes() internally. This API is guaranteed to be a CSPRNG under any circumstances.

Why support CSPRNG? Isn't random_int() good enough?

The goal is to be able to migrate all RNG provided functions to this class in the future. In other words, to be able to write code without using any of the following functions:

  • srand()
  • rand()
  • mt_srand()
  • mt_rand()
  • shuffle()
  • str_shuffle()
  • array_rand()
  • random_int()
  • random_bytes()

In order to use these functions properly, you need to understand PHP core. For many users, this can be difficult.

Why isn't there a drop-in replacement API?

There is no API that can simply replace the following functions:

  • shuffle()
  • array_rand()

The approach of these functions is not compatible with recent implementations. shuffle() uses pass-by-reference, and array_rand() is too complex.

Why stop deprecation for some functions?

The following functions have been removed from deprecation:

  • srand()
  • rand()
  • mt_srand()
  • mt_rand()

This is because it is still too early and inappropriate include it in one RFC.

What will be the concrete C implementation?

Please wait. If the discussion in ML is good, I' ll start the implementation.

Vote

Voting opens 2021-MM-DD and 2021-MM-DD at 00:00:00 EDT. 2/3 required to accept.

Add Random class
Real name Yes No
Final result: 0 0
This poll has been closed.

Patches and Tests

TBD

rfc/rng_extension.1621957202.txt.gz · Last modified: 2021/05/25 15:40 by zeriyoshi