Integrating Zend Optimizer+ into the PHP distribution
- Version: 1.01
- Date: 2013-01-28
- Author: Zeev Suraski zeev@php.net
- Status: Implemented in PHP 5.5
- First Published at: http://wiki.php.net/rfc/optimizerplus
Introduction
This RFC proposes integrating the Zend Optimizer+ component into the Open Source PHP distribution. Optimizer+ is the fastest opcode cache available for PHP, and presently supports PHP 5.2 through 5.5, with public builds available for PHP 5.2 through 5.4. It was originally developed in 1998 and was the first opcode cache available for PHP.
Presently, Optimizer+ is a closed-source, yet free-for-use component. As a part of implementing this RFC - Zend will make the source code of Optimizer+ available under the PHP License, so that it can become an integrated part of PHP with no strings attached. Once that happens, community contribution would be welcome exactly like it is with any other PHP component, and the component will be governed by the exact same processes (RFC et. al) that are employed by the PHP community.
What is an Opcode Cache?
An opcode cache is a component that is designed to speed the performance of PHP without altering the behavior of applications in any way.
Without an opcode cache, every time PHP executes a .php file, it invokes the runtime compiler, generates an in-memory representation of the file (called intermediate code), and then invokes the executor on it.
Since on a given version of PHP, compiling the same .php file will always result in the exact same intermediate code - this creates an excellent use case for caching.
An opcode cache performs this exact task - it overrides PHP's default compiler callback; When invoked - it will check if a compiled intermediate-code version of the code is already available in-memory. If one exists - it will use it without invoking PHP's actual compiler, saving the overhead of compilation. If not - it will invoke PHP's internal compiler, generate the code, persist it in memory for future use (saving the need for subsequent compilations of the same file) - and then execute it.
Modern opcode caches (Optimizer+, APC 2.0+, others) use shared memory for storage, and can execute files directly off of it - without having to 'unserialize' the code before execution. This results in dramatic performance speedups, typically reduces the overall server memory consumption, and usually with very few downsides.
Interaction with other extensions and plugins
For the most part, the existence of an opcode cache should have no influence on extensions, except for ones that do 'brain surgey' on the Zend engine.
Debuggers
One such class of plugins are debugger extensions like Xdebug or Zend Debugger. Since debugger plugins instruct the engine to generate slightly different code - they actually can influence the behavior of an opcode cache in a negative manner. However, it should be easy to implement some minimal level of 'awareness' so that when Xdebug or Zend Debugger detect the presence of an opcode cache - they'll take measures to ensure that they don't clash with each other. Zend Debugger today implements it by simply overriding the compiler callback on its own, prior to Optimizer+ - and invokes the Optimizer+ compiler callback only for requests that don't have debugging enabled. It should be possible to implement the very same mechanism in Xdebug.
To simplify interaction between such modules, and in order to avoid the need for a strict loading order of modules - we may want to create a new mechanism in the engine that allows zend_extensions to specify the priority of overriding the compiler/executor callbacks. This is something we can look more deeply into if & when we integrate Optimizer+ into core.
Finalizing the mechanism is outside the scope of this RFC.
Other Components
The Optimizer+ component will not include any non-generic special support for any external components, Zend or otherwise. Generic support for persisting non-default data structures in shared memory might be added in the future, but is outside the scope of this RFC.
Alternatives
There’s one key alternative available for Optimizer+ - namely, APC. While architecturally similar, there are several pros and cons for choosing each of these components.
Advantages of Optimizer+ over APC
- Performance. Zend Optimizer+ has a consistent performance edge over APC, which, depending on the code, can range between 5 and 20% in terms of requests/second. See Benchmarks section below.
- Availability for new PHP versions. Optimizer+ is typically fully compatible with PHP releases even before they come out; While this advantage was rarely realized because of the closed-source nature of the component, once open-source, both Zend and the community will help ensure that it’s always fully compatible with every element of the PHP language, avoiding any lags.
- Reliability. Optimizer+ has optional corruption detection capabilities that can prevent a server-wide crash in case of data corruption (e.g. from a faulty implementation of a PHP function in C). This handles one of the very few downsides of using a shared-memory-based-opcode-cache - introducing a shared resource that - if corrupted - could bring down an entire server.
- Better compatibility. We strived to make Optimizer+ work with any and all constructs supported by PHP, in exactly the same way they’d behave without it.
Advantages of APC over Optimizer+
- Has a data caching API. APC has a data caching API which Optimizer+ does not have.
- APC can reclaim memory of old invalidated scripts. APC uses a memory manager and can reclaim memory associated with a script that is no longer in use; Optimizer+ works differently, and marks such memory as ‘dirty’, but never actually reclaims it. Once the dirty percentage climbs above a configurable threshold - Optimizer+ restarts itself. Note that this behavior has both stability advantages and disadvantages.
Benchmarks
All tests were done with the latest source tree of PHP 5.5.0 as of Jan 28 2013. We've tested plain PHP, APC 3.1.5-dev, Optimizer+ vanilla and Optimizer+ configured for extreme performance. Note that tuning for extreme performance may result in certain workflows and/or code structures to no longer work properly, so there's no plan to make these settings default. All tests were done on the same hardware, using PHP in FastCGI mode with 4 worker processes. We've tested numerous applications, both procedural and object oriented.
The results are available as a Google Spreadsheet:
Source Code
The Zend Optimizer+ source code has been made available under the PHP license, and can be found on github at http://bit.ly/VSsqx3
Naming
If the Optimizer+ components becomes embedded in PHP, it's likely that a name change will be in order. Finalizing the name is outside the scope of the RFC, but it's agreed that the name will be agreed upon by the internals@ community, either through consensus or a vote.
Recommendation
We can relatively easily work out the disadvantages of Optimizer+ and go with it. We could cooperate with the community to implement a different memory-reclaiming strategy if we ever choose to. In terms of a data caching API, there appears to be consensus that a userland data-caching API should be a separate component independently of any other decision.
Suggested Roadmap
- Make the source code available [DONE]
- Once the cleanup / initial improvements are done and everything is working & stable - bundle in PHP and move to ext/.
- Decide (on internals, or using a separate RFC/vote) whether to enable by default.
- Long term (beyond PHP 5.5), evaluate whether it makes sense to further integrate, and create tighter coupling with the Zend Engine.
PHP 5.5.0
If the RFC gets approved, one open question is whether or not we should aim for integrating Optimizer+ into the PHP 5.5.0 release. While integrating Optimizer+ could probably be done fairly quickly and without greatly delaying PHP 5.5.0’s timeline, it may require a 1-2 month delay. The question on the table is whether most users would prefer a slightly later release with an out-of-the-box working opcode cache, or a slightly earlier release without a compatible opcode cache available for several additional months. It should be noted that if we don’t integrate it in 5.5.0, based on the current timelines and versioning rules, the integration won’t happen before mid-late 2014.
The integration proposed for PHP 5.5.0 is mostly 'soft' integration. That means that there'll be no tight coupling between Optimizer+ and PHP; Those who wish to use another opcode cache will be able to do so, by not loading Optimizer+ and loading another opcode cache instead. As per the Suggested Roadmap above, we might want to review this decision in the future; There might be room for further performance or functionality gains from tighter integration; None are known at this point, and they're beyond the scope of this RFC.
Vote
Vote starts Feb 27th, and ends March 7th
Changelog
- 0.5 - Initial draft
- 0.6 - Added benchmarks
- 0.61 - Added clarification regarding ZTS
- 0.62 - Fixed alignment of this ChangeLog list
- 0.7 - Removed ZTS difference, now that ZTS is supported in the codebase
- 0.75 - Added 'What is an Opcode Cache?' and 'Interaction with other extensions and plugins' sections
- 0.80 - Source code now available! Added link
- 0.81 - Clarify debugger & other components support is outside scope of RFC, clarify 'extreme' settings'
- 0.82 - Added Naming section
- 0.83 - Clarify 5.5 integration
- 1.00 - Vote
- 1.01 - Clarify third voting option