PHP RFC: Preloading

Introduction

PHP has used opcode caches for a long time (APC, Turck MMCache, Zend OPcache). They achieve a significant performance boost by ALMOST completely eliminating the overhead of PHP code recompilation. With an opcode cache, files are compiled once (on the first request that uses them) and are then stored in shared memory. All subsequent HTTP requests use the representation cached in shared memory.

This proposal is about the “ALMOST” mentioned above. While storing files in an opcode cache eliminates the compilation overhead, there is still a cost associated with fetching a file from the cache into a specific request's context. We still have to check whether the source file was modified, copy certain parts of classes and functions from the shared memory cache to process memory, etc. Notably, since each PHP file is compiled and cached completely independently of any other file, we can't resolve dependencies between classes stored in different files when we store the files in the opcode cache, and have to re-link the class dependencies at run-time on each request.

This proposal is inspired by the “Class Data Sharing” technology designed for the Java HotSpot VM. It aims to give users the ability to trade some of the flexibility of the conventional PHP model for increased performance. On server startup, before any application code is run, we may load a certain set of PHP files into memory and make their contents “permanently available” to all subsequent requests served by that server. All the functions and classes defined in these files will be available to requests out of the box, exactly like internal entities (e.g. strlen() or Exception). In this way, we may preload entire or partial frameworks, and even the entire application class library. It will also allow introducing “built-in” functions written in PHP (similar to HHVM's systemlib). The traded-in flexibility includes the inability to update these files once the server has been started: updating them on the filesystem will have no effect, and a server restart will be required to apply the changes. This approach is also not compatible with servers that host multiple applications, or multiple versions of an application, that have different implementations of classes with the same name. If such a class is preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).

Proposal

Preloading is controlled by a single new php.ini directive, opcache.preload. This directive specifies a single PHP file that performs the preloading task. Once loaded, this file is fully executed and may preload other files, either by including them or by using the opcache_compile_file() function. Previously, I tried to implement a rich DSL to specify which files to load and which to ignore, using pattern matching etc., but then realized that writing the preloading scenarios in PHP itself was much simpler and much more flexible.

For example, the following script introduces a helper function and uses it to preload the whole Zend Framework.

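<?php
// Recursively preload every file under $preload whose path matches $pattern.
// $preload may be a single path or an array of paths; paths listed in $ignore are skipped.
function _preload($preload, string $pattern = "/\.php$/", array $ignore = []) {
  if (is_array($preload)) {
    // A list of paths: handle each entry in turn.
    foreach ($preload as $path) {
      _preload($path, $pattern, $ignore);
    }
  } else if (is_string($preload)) {
    $path = $preload;
    if (!in_array($path, $ignore)) {
      if (is_dir($path)) {
        // A directory: descend into every entry except "." and "..".
        if ($dh = opendir($path)) {
          while (($file = readdir($dh)) !== false) {
            if ($file !== "." && $file !== "..") {
              _preload($path . "/" . $file, $pattern, $ignore);
            }
          }
          closedir($dh);
        }
      } else if (is_file($path) && preg_match($pattern, $path)) {
        // A matching file: compile it into the opcode cache without executing it.
        if (!opcache_compile_file($path)) {
          trigger_error("Preloading Failed", E_USER_ERROR);
        }
      }
    }
  }
}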
 
set_include_path(get_include_path() . PATH_SEPARATOR . realpath("/var/www/ZendFramework/library"));
_preload(["/var/www/ZendFramework/library"]);
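
For a small application, a recursive helper may be more than is needed. The following is a minimal sketch of a preload script that works from an explicit file list. The function name preload_list(), the php.ini path and the application paths are all hypothetical and are used for illustration only; opcache_compile_file() is the function named in this proposal.

```php
<?php
// Assumed php.ini entry (the path is illustrative only):
//   opcache.preload=/var/www/preload.php

/**
 * Compile every file in $files into the opcode cache.
 * Returns the paths that could not be preloaded.
 * (Hypothetical helper, not part of the proposal.)
 */
function preload_list(array $files): array {
    $failed = [];
    foreach ($files as $file) {
        // Record unreadable paths instead of aborting server start-up.
        if (!is_file($file) || !is_readable($file)) {
            $failed[] = $file;
            continue;
        }
        // opcache_compile_file() is only available when opcache is loaded.
        if (!function_exists('opcache_compile_file') || !@opcache_compile_file($file)) {
            $failed[] = $file;
        }
    }
    return $failed;
}

// Hypothetical application files; replace with real paths.
$failed = preload_list([
    '/var/www/app/src/Kernel.php',
    '/var/www/app/src/Router.php',
]);
```

Returning the failed paths rather than raising an error is a design choice of this sketch; the script above instead calls trigger_error() on the first failure.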

As mentioned above, preloaded files remain cached in opcache memory forever. Modifications to their source files won't take effect until the server is restarted. All functions and most classes defined in these files will be permanently loaded into PHP's function and class tables and become permanently available in the context of any future request. During preloading, PHP also resolves class dependencies and links classes with their parents, interfaces and traits. It also removes unnecessary includes and performs some other optimizations.

Preloading Limitation

Only classes without unresolved parents, interfaces, traits and constant values may be preloaded. If a class doesn't satisfy this condition, it is stored in opcache SHM as part of the corresponding PHP script, in the same way as without preloading. Also, only top-level entities that are not nested within control structures (e.g. if()…) may be preloaded.
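
As a rough illustration of the limitation above, the sketch below contrasts a top-level class with one declared inside a control structure. The class names Logger and ModernLogger are hypothetical, and the comments restate the rule from this section rather than describe verified engine behavior.

```php
<?php
// Top-level class with no parent, interfaces or traits:
// per the rule above, a candidate for preloading.
class Logger {
    public function log(string $msg): string {
        return "[log] " . $msg;
    }
}

// Declared inside a control structure: per the rule above, this class is
// not preloaded. It stays part of the cached script and is declared only
// when this branch actually runs.
if (!class_exists('ModernLogger', false)) {
    class ModernLogger extends Logger {
    }
}

// A class whose parent cannot be resolved at preload time would likewise
// not be preloaded, e.g.:
//   class Child extends ParentDefinedInAnotherUnloadedFile {}
```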

Implementation Details

Preloading is implemented as part of opcache, on top of another (already committed) patch that introduces “immutable” classes and functions. The immutable part is assumed to be stored in shared memory once (for all processes) and never copied to process memory, while the variable part is specific to each process. The patch introduced the MAP_PTR pointer data structure, which allows pointers from SHM to process memory.

Backward Incompatible Changes

Preloading does not affect any functionality unless it is explicitly used. However, if used, it may break some application behavior, because preloaded classes and functions are always available, so function_exists() or class_exists() checks return TRUE and prevent the expected code paths from executing. As mentioned above, incorrect usage on a server with more than one app could also result in failures: different apps (or different versions of the same app) may have the same class/function names in different files, and if one version of a class is preloaded, it will prevent loading of any other version of that class defined in a different file.
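
The function_exists() case can be sketched as follows. Preloading itself cannot be shown in a standalone script, so the first declaration merely stands in for a function that a preload script would have made permanently available; the name app_version() is hypothetical.

```php
<?php
// Stand-in for a preloaded function (declared here only to simulate
// the state a request would see after preloading).
function app_version(): string {
    return 'preloaded';
}

// Typical guard found in application or polyfill code. Because the
// function already exists, this branch is skipped and the application's
// own definition is never declared.
if (!function_exists('app_version')) {
    function app_version(): string {
        return 'fallback';
    }
}

echo app_version(), PHP_EOL; // prints "preloaded"
```

Code that relies on the guarded branch running, for example to register side effects alongside the declaration, would behave differently once the function is preloaded.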

Proposed PHP Version(s)

PHP 7.4

RFC Impact

To Opcache

Preloading is implemented as a part of opcache.

php.ini Defaults

  • opcache.preload - specifies a PHP script that is going to be compiled and executed at server start-up.

Open Issues

  • Preloading in ZTS builds is not supported yet

Performance

Using preloading without any code modification, I got a ~30% speed-up on ZF1_HelloWorld (3620 req/sec vs 2650 req/sec) and ~50% on ZF2Test (1300 req/sec vs 670 req/sec) reference applications. However, real-world gains will depend on the ratio between the bootstrap overhead of the code and the runtime of the code, and will likely be lower. Preloading will likely provide the most noticeable gains for requests with very short runtimes, such as microservices.
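
A speed-up quoted from requests-per-second figures can be expressed either as a throughput gain or as a reduction in time per request. The following sketch computes both for the figures above; the helper name compare_throughput() is hypothetical, and which of the two readings the rounded percentages correspond to is not stated in this document.

```php
<?php
/**
 * Compare two throughput figures given in requests per second.
 * Returns [throughput gain in %, per-request time reduction in %],
 * each rounded to one decimal place.
 * (Hypothetical helper, not part of the proposal.)
 */
function compare_throughput(float $before, float $after): array {
    // Relative increase in requests served per second.
    $gain = ($after / $before - 1.0) * 100.0;
    // Relative decrease in average time spent per request (time = 1 / throughput).
    $timeSaved = (1.0 - $before / $after) * 100.0;
    return [round($gain, 1), round($timeSaved, 1)];
}

// Figures quoted above: req/sec without and with preloading.
foreach (['ZF1_HelloWorld' => [2650, 3620], 'ZF2Test' => [670, 1300]] as $app => [$before, $after]) {
    [$gain, $saved] = compare_throughput($before, $after);
    printf("%s: +%.1f%% throughput, -%.1f%% time per request\n", $app, $gain, $saved);
}
```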

Future Scope

  • Preloading may be used, like systemlib in HHVM, to define “standard” functions/classes in PHP
  • It might be possible to pre-compile the preload script and use a binary form (maybe even a native .so or .dll) to speed up server start-up.
  • In conjunction with ext/FFI (a dangerous extension), we may allow FFI functionality only in preloaded PHP files, but not in regular ones
  • It's possible to perform more aggressive optimizations and generate better JIT code for preloaded functions and classes (similar to Repo Authoritative mode in HHVM)

Proposed Voting Choices

The RFC requires a 50%+1 majority.

Patches and Tests

The pull request for the RFC is at: https://github.com/php/php-src/pull/3538

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/preload.1539937104.txt.gz · Last modified: 2018/10/19 08:18 by dmitry