PHP RFC: Object scoped RNG Implementations.

Introduction

PHP currently provides the mt_srand() and mt_rand() functions, which are PRNGs based on the Mersenne Twister. However, because these functions keep their state in global scope, an unrelated function call can change the generated sequence even when the same seed is used.

mt_srand(1234);
foo();
mt_rand() === 411284887; // false
 
function foo() {
    mt_rand(); // code added
}

This is a problem for applications that require reproducible values (game logic, test code, etc.). The global state is also inherited by forked child processes. In fact, EasySwoole, a framework based on the Swoole extension, has drawn attention to this behavior. [5]

In other languages, RNGs are implemented as objects, so this problem does not exist. [3] [4]
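As a rough illustration of what object scope buys, the sketch below keeps the generator state inside an object. ScopedXorShift32 is a hypothetical userland class written for this example (a plain xorshift32, assuming a 64-bit PHP build); it is not one of the classes proposed in this RFC.

```php
<?php
// Hypothetical userland class for illustration only; not part of this RFC.
// Assumes a 64-bit PHP build so that 32-bit values fit in a native int.
final class ScopedXorShift32
{
    private int $state;

    public function __construct(int $seed)
    {
        // xorshift must not start from an all-zero state.
        $this->state = ($seed & 0xFFFFFFFF) ?: 1;
    }

    public function next(): int
    {
        $x = $this->state;
        $x ^= ($x << 13) & 0xFFFFFFFF;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & 0xFFFFFFFF;
        return $this->state = $x;
    }
}

function foo(): void
{
    mt_rand(); // code added later; it only advances the global MT state
}

$a = new ScopedXorShift32(1234);
$first = $a->next();

$b = new ScopedXorShift32(1234);
foo(); // has no effect on $b
var_dump($b->next() === $first); // bool(true)
```

Because the state lives in $b, the call to foo() cannot change the value that $b produces next.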

The global MT state is also shared by other functions that consume random numbers, such as shuffle(). This makes the problem worse when code relies on seeding with a specific value to obtain reproducible results. (Admittedly, relying on seeded output in this way is itself questionable practice.)

mt_srand(1234);
$arr = [1, 2, 3, 4, 5];
shuffle($arr);
echo print_r($arr, true); // This result is always consistent.
 
/*
Array
(
    [0] => 3
    [1] => 2
    [2] => 5
    [3] => 4
    [4] => 1
)
*/

Therefore, it may be a good idea to consider deprecating global state-dependent RNG functions in the future.

One userland solution available today is to implement the PRNG in pure PHP. Such a library exists [1], but it is not fast enough at the moment, even with JIT (benchmark results are given under Open Issues).

An object-scoped implementation would also remain usable as PHP's programming model becomes more complex in the future.

I have created a PHP extension that addresses these problems [2]. It can serve as a reference for the specific goals of this RFC.

Proposal

Implement an object-scoped PRNG in PHP and provide methods equivalent to the existing functions that consume RNG output.

Two approaches have been proposed in the discussion.

Type I

First, the following interface is provided.

namespace RNG;
 
interface RNGInterface
{
    public function next(): int;
    /** @throws ValueError */
    public function next64(): int;
}

The next64() method throws a ValueError when called on a non-64-bit architecture or on an RNG class that does not support 64-bit output.
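The next64() contract could look roughly like this from userland. SketchRNGInterface and SketchLcg16 are hypothetical stand-ins declared only for this example, and building the 64-bit value from four 16-bit draws is an assumption of the sketch, not something the RFC specifies.

```php
<?php
// Hypothetical stand-in for the proposed RNG\RNGInterface.
interface SketchRNGInterface
{
    public function next(): int;
    /** @throws ValueError */
    public function next64(): int;
}

// Hypothetical generator; a tiny LCG chosen because it also fits in 32-bit ints.
final class SketchLcg16 implements SketchRNGInterface
{
    private int $state;

    public function __construct(int $seed)
    {
        $this->state = $seed & 0xFFFF;
    }

    public function next(): int
    {
        // LCG with multiplier 75, increment 74, modulus 65537.
        $this->state = ($this->state * 75 + 74) % 65537;
        return $this->state & 0xFFFF;
    }

    public function next64(): int
    {
        if (PHP_INT_SIZE < 8) {
            throw new ValueError('next64() requires a 64-bit build of PHP');
        }
        // Assumption of this sketch: concatenate four 16-bit draws.
        $value = 0;
        for ($i = 0; $i < 4; $i++) {
            $value = ($value << 16) | $this->next();
        }
        return $value;
    }
}

$rng = new SketchLcg16(1234);
try {
    $value = $rng->next64();
} catch (ValueError $e) {
    $value = $rng->next(); // fall back to the narrower output
}
var_dump(is_int($value)); // bool(true)
```

A caller that needs to run on 32-bit builds has to be prepared for the exception, as the try/catch shows.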

Next, define an interface that provides methods equivalent to PHP functions that use the output of the RNG.

namespace RNG;
 
interface RandomInterface extends RNGInterface
{
    public function arrayShuffle(array &$array): bool; // substitute for shuffle()
    public function stringShuffle(string $string): string; // substitute for str_shuffle()
    public function arrayRandom(array $array, int $num = 1): int|string|array; // substitute for array_rand()
}

These methods use only the internal state of the RNG object they are called on; they neither read nor modify the global state.
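A minimal userland sketch of the Type I idea follows, with arrayShuffle() written as a Fisher-Yates shuffle driven by the object's own state. SketchRandom is a hypothetical class; the xorshift32 generator, the private below() helper and the 64-bit build requirement are assumptions of this example, not the proposed implementation.

```php
<?php
// Hypothetical userland sketch, not the RFC's implementation.
// Assumes a 64-bit PHP build so that 32-bit values fit in a native int.
final class SketchRandom
{
    private int $state;

    public function __construct(int $seed)
    {
        $this->state = ($seed & 0xFFFFFFFF) ?: 1; // xorshift state must be non-zero
    }

    /** xorshift32 step; returns a value in [1, 2^32 - 1]. */
    public function next(): int
    {
        $x = $this->state;
        $x ^= ($x << 13) & 0xFFFFFFFF;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & 0xFFFFFFFF;
        return $this->state = $x;
    }

    /** Integer in [0, $n) for $n >= 1; rejection sampling avoids modulo bias. */
    private function below(int $n): int
    {
        // next() - 1 takes 2^32 - 1 distinct values; accept only whole buckets.
        $limit = intdiv(0xFFFFFFFF, $n) * $n;
        do {
            $x = $this->next() - 1;
        } while ($x >= $limit);
        return $x % $n;
    }

    /** Fisher-Yates shuffle that depends only on this object's state. */
    public function arrayShuffle(array &$array): bool
    {
        $values = array_values($array); // like shuffle(), keys are discarded
        for ($i = count($values) - 1; $i > 0; $i--) {
            $j = $this->below($i + 1);
            [$values[$i], $values[$j]] = [$values[$j], $values[$i]];
        }
        $array = $values;
        return true;
    }
}

$a = [1, 2, 3, 4, 5];
$b = [1, 2, 3, 4, 5];
(new SketchRandom(1234))->arrayShuffle($a);
mt_rand(); // unrelated global call between the two shuffles
(new SketchRandom(1234))->arrayShuffle($b);
var_dump($a === $b); // bool(true)
```

Two objects seeded identically produce the same permutation regardless of what happens to the global Mersenne Twister state in between.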

Finally, RNG classes that implement these interfaces are provided. Support for XorShift128+, MT19937 and OSRNG is currently being considered.

namespace RNG;
 
class XorShift128Plus implements RandomInterface {} // use XorShift128+ algorithm
class MT19937 implements RandomInterface {} // use MT19937 algorithm
class OSRNG implements RandomInterface {} // use php_random_bytes() internal API

This implementation is superior to the Type II implementation in the following ways:

  • Userland can provide an arbitrary RNGInterface / RandomInterface implementation.
  • Providing methods based on inheritance is user-friendly.
  • The implementation will be simple.

However, it is inferior in the following areas:

  • Implementation will be redundant.
  • Inheritance patterns are not modern.
  • Unable to recognize incorrect usage of existing functions.

Type II

First, the following interface is provided.

namespace RNG;
 
interface RNGInterface
{
    public function next(): int;
    /** @throws ValueError */
    public function next64(): int;
}

The next64() method throws a ValueError when called on a non-64-bit architecture or on an RNG class that does not support 64-bit output.

Next, make some changes to the existing functions and add some new ones.

function shuffle(array &$array, ?RNGInterface $rng = null): bool {}
function str_shuffle(string $string, ?RNGInterface $rng = null): string {}
function array_rand(array $array, int $num = 1, ?RNGInterface $rng = null): int|string|array {}
/** Generates a random number for the specified range using RNG. */
function rng_range(RNGInterface $rng, int $min, int $max): int {}
/** Generates a sequence of bytes for a specified range using RNG. */
function rng_bytes(RNGInterface $rng, int $length): string {}

Existing functions that depend on an RNG will accept an explicit RNG source. This makes it clear that these functions use an RNG internally.
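The Type II shape can be approximated in userland as follows. Everything prefixed with Sketch or sketch_ is hypothetical, and the sketch assumes a 64-bit PHP build and a generator whose next() returns a uniform value in [0, 2^32); the RFC does not fix the output range of next().

```php
<?php
// Hypothetical stand-ins; nothing here is part of the RFC.
// Assumes 64-bit PHP and that next() is uniform over [0, 2^32).
interface SketchRng
{
    public function next(): int;
}

final class SketchLcg32 implements SketchRng
{
    private int $state;

    public function __construct(int $seed)
    {
        $this->state = $seed & 0xFFFFFFFF;
    }

    public function next(): int
    {
        // LCG modulo 2^32; constants chosen for brevity, not quality.
        return $this->state = ($this->state * 1664525 + 1013904223) & 0xFFFFFFFF;
    }
}

/** Uniform integer in [$min, $max]; the range may span at most 2^32 values. */
function sketch_rng_range(SketchRng $rng, int $min, int $max): int
{
    $span = $max - $min + 1;
    if ($min > $max || $span > 0x100000000) {
        throw new ValueError('unsupported range');
    }
    // Rejection sampling: discard draws from the incomplete top bucket.
    $limit = intdiv(0x100000000, $span) * $span;
    do {
        $x = $rng->next();
    } while ($x >= $limit);
    return $min + $x % $span;
}

/** shuffle() with an optional explicit RNG; null keeps the global behavior. */
function sketch_shuffle(array &$array, ?SketchRng $rng = null): bool
{
    if ($rng === null) {
        return shuffle($array); // existing path, uses the global MT state
    }
    $values = array_values($array);
    for ($i = count($values) - 1; $i > 0; $i--) {
        $j = sketch_rng_range($rng, 0, $i);
        [$values[$i], $values[$j]] = [$values[$j], $values[$i]];
    }
    $array = $values;
    return true;
}

$a = [1, 2, 3, 4, 5];
$b = [1, 2, 3, 4, 5];
sketch_shuffle($a, new SketchLcg32(1234));
mt_rand(); // unrelated global call
sketch_shuffle($b, new SketchLcg32(1234));
var_dump($a === $b); // bool(true)
```

Passing null keeps today's behavior, which is what makes the extra parameter backward compatible.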

Finally, RNG classes that implement this interface are provided. Support for XorShift128+, MT19937 and OSRNG is currently being considered.

namespace RNG;
 
final class XorShift128Plus implements RNGInterface {} // use XorShift128+ algorithm
final class MT19937 implements RNGInterface {} // use MT19937 algorithm
final class OSRNG implements RNGInterface {} // use php_random_bytes() internal API

This implementation is superior to the Type I implementation in the following ways:

  • Users can see in advance that a function uses an RNG.
  • RandomInterface is not required.
  • The internal implementation of the classes is simpler.

However, it is inferior in the following areas:

  • Passing an RNG instance as an argument may confuse users.
  • Handling userland RNGInterface implementations can be complex.
    • Supporting them may not be necessary.

Backward Incompatible Changes

Because new classes are provided, some class names (or namespaces) will no longer be available in userland.

Proposed PHP Version(s)

8.1

RFC Impact

To SAPIs

none

To Existing Extensions

orng (https://pecl.php.net/package/orng): a PECL extension that provides almost the same functionality. If the interface is provided by the core in the future, orng will need to support it. I am the maintainer of orng.

To Opcache

none

New Constants

none

php.ini Defaults

none

Open Issues

Why implement a method that is almost identical to a traditional function?

It is intended to improve interoperability with existing code. Users can make their code independent of global state simply by replacing the existing function calls with the equivalent methods.

With JIT, won't the userland implementation reach a useful speed?

The following compares the speed of a userland implementation of XorShift128+ with the orng extension.

PHP 8.0

$ time php -r 'require __DIR__ . "/vendor/savvot/random/src/AbstractRand.php"; require __DIR__ . "/vendor/savvot/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }'
 
real	0m0.441s
user	0m0.429s
sys	0m0.010s

PHP 8.0 + OPcache JIT

$ time php -dopcache.jit_buffer_size=100M -dopcache.enable_cli=1 -r 'require __DIR__ . "/vendor/savvot/random/src/AbstractRand.php"; require __DIR__ . "/vendor/savvot/random/src/XorShiftRand.php"; $r = new Savvot\Random\XorShiftRand(1234); for ($i = 0; $i < 1000000; $i++) { $r->random(); }'
 
real	0m0.155s
user	0m0.139s
sys	0m0.015s

PHP 8.0 + orng

$ time php -r '$r = new \ORNG\XorShift128Plus(1234); for ($i = 0; $i < 1000000; $i++) { $r->next(); }'
 
real	0m0.056s
user	0m0.048s
sys	0m0.008s

JIT provides a significant improvement, but the userland implementation is still slower than the C implementation (0.155s versus 0.056s of real time for one million calls).
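The same kind of measurement can also be taken in-process, which excludes interpreter start-up time from the result. The sketch below is only an assumption about how one might do that: BenchXorShift32 is a hypothetical 32-bit workload (not the library benchmarked above), it assumes a 64-bit PHP build, and the timing it prints will vary by machine.

```php
<?php
// Hypothetical workload class; not the savvot/random or orng implementation.
// Assumes a 64-bit PHP build so that 32-bit values fit in a native int.
final class BenchXorShift32
{
    private int $state;

    public function __construct(int $seed)
    {
        $this->state = ($seed & 0xFFFFFFFF) ?: 1;
    }

    public function next(): int
    {
        $x = $this->state;
        $x ^= ($x << 13) & 0xFFFFFFFF;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & 0xFFFFFFFF;
        return $this->state = $x;
    }
}

$rng = new BenchXorShift32(1234);
$iterations = 1000000;

$start = hrtime(true); // monotonic clock, in nanoseconds
for ($i = 0; $i < $iterations; $i++) {
    $rng->next();
}
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d calls: %.1f ms\n", $iterations, $elapsedMs);
```

Running it with and without -dopcache.enable_cli=1 -dopcache.jit_buffer_size=100M gives a rough view of the JIT's effect on the loop itself.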

Why do we need this feature in the core and not in the extension?

Using the pseudo-random number features that PHP currently provides requires an understanding of the core. If this proposal is implemented, users will be able to work with pseudo-random numbers through the easy-to-understand concept of objects. This is a useful improvement to the overall functionality of the language.

Unaffected PHP Functionality

The behavior of the following existing functions is not changed. (With Type II, however, optional arguments are added to some of them in a backward-compatible way.)

  • mt_srand()
  • mt_rand()
  • shuffle()
  • str_shuffle()
  • array_rand()

Proposed Voting Choices

Yes/No, requiring 2/3 majority

There are a few additional options for implementation.

Which implementation looks good?

  • Type I: 0
  • Type II: 0

This poll has been closed.

To which namespace should these classes and interfaces belong?

  • Top level (\RNGInterface): 0
  • "RNG" namespace (\RNG\RNGInterface): 0
  • "PHP\RNG" namespace (\PHP\RNG\RNGInterface): 0

This poll has been closed.

Patches and Tests

References

rfc/object_scope_prng.1610312124.txt.gz · Last modified: 2021/01/10 20:55 by zeriyoshi