Request for Comments: Better benchmarks for PHP
- Version: 1.0
- Date: 2009-02-01
- Author: Paul Biggar email@example.com, Nuno Lopes firstname.lastname@example.org
- Status: Started
- Contributors: add your name to the bottom of the page
Replace PHP's current bench.php with a benchmark suite from which meaningful performance measurements can be taken.
PHP's current bench.php is a micro-benchmark which tests a number of language features. Since it does not behave like a 'real' application, it cannot provide meaningful information about the performance of PHP in general. This RFC will attempt to replace bench.php with a better suite of benchmarks, upon which meaningful performance measurements can be taken.
What are we trying to measure?
Performance impact of changes to PHP or its libraries
We need to be able to see how changes to the Zend engine affect real programs, not synthetic benchmarks.
We need realistic applications and workloads in order to get useful profiling information, which tells us which parts of the VM need optimization.
Performance improvement of other PHP implementations relative to PHP
There is a need for other PHP implementations to be able to measure their performance against the Zend engine.
What do we want in a benchmark?
Benchmark applications should be straightforward to deploy, preferably with no manual setup, even when multiple machines are required.
- The benchmarks must be varied: we are trying to measure the performance of applications, not individual PHP features.
- Preferably few or no external dependencies
- It must be possible to separate results for the language, the database, the webserver and the benchmarking client.
- DBs could possibly be changed to use a file-based engine like SQLite.
- Web-apps benchmarks must have (real) data in order to simulate real work.
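As a sketch of the SQLite idea above, a benchmark fixture could seed a file-based or in-memory database so that no separate DB server needs to be deployed. This assumes the pdo_sqlite extension is available; the table name and data are purely illustrative:

```php
<?php
// Seed an in-memory SQLite database for a self-contained benchmark run.
// Switch 'sqlite::memory:' to 'sqlite:/path/to/bench.db' for a file-based DB.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

$insert = $db->prepare('INSERT INTO users (name) VALUES (?)');
foreach (['alice', 'bob', 'carol'] as $name) {
    $insert->execute([$name]);
}

$count = (int) $db->query('SELECT COUNT(*) FROM users')->fetchColumn();
echo "seeded $count rows\n";
```

Seeding from a fixture like this keeps the benchmark deterministic and removes the DB-server machine from the deployment requirements, at the cost of no longer exercising a client/server database.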
Basically, we would like to see real-world apps tweaked so that automated deployment and testing is simple. Each benchmark should also have a set of representative workload inputs (e.g. from real-world log files).
Benchmarks should be licensed under a liberal open-source license, so that they can be “safely” customized and redistributed for our needs.
THIS IS NOT SET IN STONE - FEEDBACK ESPECIALLY NEEDED
- Start with something simple
- Develop simple CLI benchmarks with the criteria “better than bench.php”
- Tools to analyse and compare test runs.
- Build a larger suite based on the infrastructure created from the first part
- A suite of about 10 web-apps, which can be deployed and run in a straightforward manner
- based on real world apps, with real world data
- Integrate with build infrastructure
- Automated summaries
- Graphs over time
What benchmarks currently exist?
- The bench.php script just performs a few standard synthetic tests (e.g. Ackermann, Fibonacci, etc.). It also performs some tests on the areas that were optimized in Zend Engine 2 (so it's a little biased). However, it doesn't perform any 'real' work.
- There are many language features that it does not use, but that might be used (perhaps sporadically) in a larger application:
- Variable-function or -method calls (in fact, the call graph is entirely static and very shallow, although it does at least contain cycles).
- call-time pass-by-ref
- dynamic function/class definition
- include (etc)
- It seems that this test really only exercises language features, and doesn't actually do any work.
- Only performs a few synthetic micro-tests that compare different ways of doing the same thing, e.g. benchmarking '$a += $b' vs '$a = $a + $b'
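A minimal sketch of this kind of micro-test, assuming nothing beyond standard PHP; the timing helper and loop bodies are illustrative, not taken from any existing suite. Such a test can tell us which opcode sequence is faster, but nothing about real application performance:

```php
<?php
// Time a callable over a fixed number of iterations (wall-clock seconds).
function time_it(callable $f, int $iters): float {
    $start = microtime(true);
    $f($iters);
    return microtime(true) - $start;
}

// Variant 1: compound assignment.
$compound = function (int $iters) {
    $a = 0; $b = 3;
    for ($i = 0; $i < $iters; $i++) { $a += $b; }
    return $a;
};

// Variant 2: plain assignment.
$plain = function (int $iters) {
    $a = 0; $b = 3;
    for ($i = 0; $i < $iters; $i++) { $a = $a + $b; }
    return $a;
};

$iters = 1000000;
printf("compound: %.4fs\n", time_it($compound, $iters));
printf("plain:    %.4fs\n", time_it($plain, $iters));
```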
- Provides a large number of small benchmarks
- Tests OO, and references
- Most of the benchmarks are very short (program length, not run-time) - the longest is 481 lines, and only 8 are longer than 100 lines.
- Some tests come with input data
- Short; doesn't take command-line parameters; varied but simple.
- Many benchmarks are copied from the language shootout (an old version thereof, apparently)
- Some tests come with input data
- Very complete, but very expensive
- Contains three applications: a banking website, an e-commerce website, and a vendor-support website
- Workload is based on real data; more details at http://www.spec.org/web2005/docs/1.20/design/SPECweb2005_Design.html
- Requires a number of machines: for the DB, the front end, and the clients. Difficult to deploy, and the results are difficult for other developers to replicate.
- This benchmarks the entire stack, not just PHP. If PHP is not the bottleneck, then the 'improvements' being tested will not appear. This can be an advantage - small improvements may not increase the speed of a web application in real life. However, it obscures what we are trying to measure - the speed of the PHP implementation.
- More information in the paper SPECweb2005 in the Real World: Using Internet Information Server (IIS) and PHP by Warner and Worley.
- Written in PHP 4, with no OO. Only minor changes are required to make it run in PHP 5.
- Difficult to deploy. Requires at least 3 machines (it can be deployed on a single machine, but the results would be worthless).
- Like SPECweb2005, this benchmarks the entire stack, not just PHP.
- Written in PHP 5. Has DB backends for MySQL, Oracle and SQL Server
- TODO: take a closer look to see if it can be used
- An auction site application modeled after eBay.
- Implementation variations using Java servlets and EJB are also available and compared in a Middleware 2003 paper [Cecchet et al. 2003].
Applications which could make good web-app benchmarks
Not many so far.
- A fairly large, production-quality, open-source CRM application.
- Extensive benchmark experiments have been published as a blog entry at Sun Microsystems.
- A Wiki engine famous for its use in Wikipedia.
- See the existing benchmarking of it.
- One of the most popular BBS applications written in PHP.
- Phalanger uses it for demonstrating its performance advantage.
Applications which could make good CLI benchmarks
- A tool for creating UML diagrams from PHP source
- Not very useful if it is I/O-bound
- Lots of data sets: can use any PHP package
- NOTE: Seems to require graphviz
- An odd language implementation
- Uses OO
- A small number of data sets (programs written in Whirl)
Porting other benchmark suites
- These will probably take a day each to port
- The data sets are built into the applications
- The SunSpider ports are probably easier to work with
- The data sets are (I think) built into the applications
- The most useful benchmarks, in order, appear to be (includes V8 benchmarks):
- Richards (V8)
- Deltablue (V8)
- V8-Crypto (V8)
- 3d-raytrace (ignoring the actual drawing from JS)
Benchmarking in other languages
* Python: http://code.google.com/p/unladen-swallow/wiki/ProjectPlan#Performance
Desired Benchmark Features
For command-line applications
- Memory usage
- Hardware performance counters (if available), using PAPI
- Simulated hardware statistics, using cachegrind
- These should be combined into a single representative number, using some hardware model.
- Ability to compare all of these over two runs
- Support for other PHP implementations
- Benchmark characterisation:
- This is hard to do properly, so it is best to do it roughly and give coarse-grained information.
For web applications
- Requests per second
- Memory usage
- Bottleneck (is it scripting, DB, or network)
- Total time for request
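A minimal sketch of a CLI driver reporting two of the metrics listed above, wall-clock time and peak memory. The helper name and the workload are illustrative only; hardware-counter (PAPI) and cachegrind runs would wrap the same entry point externally:

```php
<?php
// Run a benchmark callable and report elapsed time and peak memory usage.
function run_benchmark(callable $bench): array {
    $t0 = microtime(true);
    $bench();
    return [
        'seconds'    => microtime(true) - $t0,
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}

// Illustrative workload: build and sort a large array.
$stats = run_benchmark(function () {
    $data = range(1, 100000);
    sort($data);
});

printf("time: %.4fs, peak memory: %d bytes\n",
       $stats['seconds'], $stats['peak_bytes']);
```

A real driver would additionally repeat each run several times and report a summary (e.g. the minimum or median) to reduce noise, and emit machine-readable output so runs can be compared automatically.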
- raytracer (28.04.2009)
- deltablue (04.05.2009)
- crypto (07.06.2009)
- whirl & j4p5
- crypto-md5 (29.06.2009)
- richards (29.06.2009)
The benchmarks are hosted in CVS: http://cvs.php.net/viewvc.cgi/php-benchmarks/. Check out with 'cvs -d :pserver:email@example.com:/repository co php-benchmarks'. More info at http://php.net/anoncvs.php
If you don't have karma, you won't be able to edit the wiki or commit benchmarks to the suite. Since getting karma takes time, please do not wait for it before contacting us. If you have contributions to make, we'd like to hear about them, and we can make changes to the wiki on your behalf.
To get karma, please fill in the form on http://php.net/cvs-php.php. In the text box, fill in 'Contributing benchmarks'. For 'Type of initial karma' enter 'PHP Group'. Please email Nuno (email below) when you have submitted the form. Once you get karma, please add your username (and other details, if missing), to the contributors section below.
Discussion takes place on the QA mailing list (firstname.lastname@example.org). Please include [benchmarks] in the subject.
Add your name here if you want to help.
- Paul Biggar - paul.biggar [at] gmail.com
- Nuno Lopes - nlopess [at] php.net
- Ólafur Waage - olafurw [at] gmail.com
- Michiaki Tatsubori - mich [at] acm.org
- Alexander Hjalmarsson - hjalle [at] sgh.se
- Davide Mendolia - idaf1er [at] gmail.com