Ideas for PHP 7.1
This is just a brainstorm (not a plan).
- Core Data Structures (the best performance improvement may be achieved by reducing the data set and improving data locality)
- Try to reduce zval size
- Maybe pack zval into 8 bytes using NaN tagging (probably possible only on 32-bit systems: a 64-bit integer can't fit into a NaN-tagged double; what to do with zval.u2?) [dmitry] [POSTPONED] (A PoC showed up to 20% lower memory consumption but some slowdown, and the loss of 64-bit integers.)
- Maybe pack a few zvals into a single tuple (e.g. 8 bytes for 8 types, then 8*8 bytes for 8 values; looks complex). [dmitry]
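To make the NaN tagging idea concrete, here is a minimal sketch in C. The type `boxed_t`, the tag constants and all function names are hypothetical and are not the real zval layout. It assumes IEEE 754 doubles and shows why only a 32-bit integer fits next to the tag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte value, NOT the real zval layout. */
typedef uint64_t boxed_t;

#define BOX_TAG_MASK  UINT64_C(0xFFFF000000000000)
#define BOX_TAG_INT32 UINT64_C(0xFFF9000000000000) /* assumed tag: a quiet-NaN bit pattern */

/* Doubles are stored as their raw IEEE 754 bits. */
boxed_t box_double(double d) {
    boxed_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

double unbox_double(boxed_t b) {
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}

/* A 32-bit integer fits into the low bits of the NaN payload. */
boxed_t box_int32(int32_t i) {
    uint32_t u;
    memcpy(&u, &i, sizeof u);
    return BOX_TAG_INT32 | u;
}

int is_int32(boxed_t b) {
    return (b & BOX_TAG_MASK) == BOX_TAG_INT32;
}

int32_t unbox_int32(boxed_t b) {
    uint32_t u = (uint32_t)b; /* keep the low 32 payload bits */
    int32_t i;
    assert(is_int32(b));
    memcpy(&i, &u, sizeof i);
    return i;
}
```

The tag takes the upper 16 bits, so at most 48 payload bits remain. That is enough for a 32-bit integer or a 32-bit pointer, but not for a 64-bit integer. A real implementation would also have to canonicalize NaNs produced by arithmetic so that they never collide with a tag.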
- Hashtables:
- Try to combine HashTable.arData allocation together with zend_array allocation.
- Make a special EG(empty_array) constant similar to EG(empty_string). [dmitry] [DONE] (in PHP-7.3)
- Try to use different hash array representations. (e.g. uchar[] for nTableSize < 256 and uint32_t for bigger). [laruence]
- Try to avoid keeping keys for packed arrays. [laruence]
- Try different HashTable load factors and growing strategies (a 1/2 load factor reduces the number of hash collisions but increases the hash-array size). [dmitry] [DONE] (in PHP-7.3)
- Try switching to an open addressing implementation. If we reduce hashes to 32bit, this would save 8 bytes per bucket.
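The trade-off between hash slot width and load factor in the Hashtables items above can be put into numbers. This is a rough sketch only; `hash_slot_width` and `hash_array_bytes` are hypothetical helpers, not functions from the Zend sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one hash slot. Each slot holds a bucket index; with fewer
 * than 256 buckets the indexes 0..254 fit in one byte and 0xFF can mark "empty". */
size_t hash_slot_width(uint32_t table_size) {
    return table_size < 256 ? sizeof(uint8_t) : sizeof(uint32_t);
}

/* Size of the hash array for a table with table_size buckets.
 * slots_per_bucket = 1 models a load factor of 1, 2 models a load factor of 1/2. */
size_t hash_array_bytes(uint32_t table_size, uint32_t slots_per_bucket) {
    assert(slots_per_bucket >= 1);
    return (size_t)table_size * slots_per_bucket * hash_slot_width(table_size);
}
```

For an 8-element table, one-byte slots keep the hash array at 8 bytes even before halving the load factor, where `uint32_t` slots would need 32. For large tables, a 1/2 load factor simply doubles the hash array.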
- Objects:
- Improve guards implementation: Specialize the case of single guard active. [dmitry] DONE commit
- Improve guards implementation: Don't keep around unused guard structures (currently leaks memory). Move guards outside of individual objects into a global HT.
- Improve iteration of objects: Avoid creating a properties HT if we don't need it.
- Use 32bit low pointers for rare structures like object handlers and class entries? Not familiar with how to make the necessary allocator guarantees (MAP_32BIT is not portable and does not always work; for opline->opcode we may try to use a relative address). The idea is similar to HotSpot compressed oops. [dmitry]
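One portable way to get 32-bit references without MAP_32BIT is to store offsets relative to a known base. The sketch below illustrates only that variant. The arena, the `ref32` type and all function names are hypothetical, and alignment is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena holding the "rare" structures (handlers, class entries). */
static unsigned char rare_arena[1 << 16];
static size_t rare_used = 0;

typedef uint32_t ref32; /* 32-bit offset from the arena base */

/* Trivial bump allocator so that every rare structure lives inside the arena. */
void *rare_alloc(size_t n) {
    void *p;
    assert(n <= sizeof rare_arena - rare_used);
    p = rare_arena + rare_used;
    rare_used += n;
    return p;
}

/* Compress a pointer into a 4-byte offset. */
ref32 ref32_from_ptr(const void *p) {
    uintptr_t base = (uintptr_t)rare_arena;
    uintptr_t addr = (uintptr_t)p;
    assert(addr >= base && addr - base < sizeof rare_arena);
    return (ref32)(addr - base);
}

/* Expand the offset back into a full pointer: one add per access. */
void *ref32_to_ptr(ref32 r) {
    assert(r < sizeof rare_arena);
    return rare_arena + r;
}
```

The cost is an extra addition on every dereference, which is why the idea is only attractive for rarely accessed structures.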
- Strings:
- Store capacity to avoid reallocs. Integrate with allocator to choose next-largest allocation size.
- Split interned strings into a few generations (PERSISTENT - SHM - emalloc-ed) [dmitry]
- Packed strings. For short strings (3/7 chars on 32/64-bit systems), characters may be stored in zval itself. [andrea, dmitry] POSTPONED. A "PoC" disclosed troubles with hash collisions and dangling char* pointers.
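The packed strings item can be sketched as follows. `str_slot`, `PACKED_MAX` and the functions are hypothetical names and do not reflect the real zval. The sketch assumes that one byte of the pointer-sized slot is kept for the length, which gives the 3/7 character limits quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inline capacity: one byte of the pointer-sized slot is kept for the length,
 * leaving 3 chars on 32-bit and 7 chars on 64-bit systems. */
#define PACKED_MAX (sizeof(char *) - 1)

/* Hypothetical value slot, NOT the real zval: either a heap pointer or the
 * characters themselves. A separate type tag (not shown) would say which. */
typedef union {
    char *ptr;
    unsigned char inl[sizeof(char *)];
} str_slot;

/* Try to store the string inline. Returns 1 on success, 0 if it is too long. */
int str_pack(str_slot *slot, const char *s, size_t len) {
    assert(slot != NULL && s != NULL);
    if (len > PACKED_MAX) {
        return 0; /* caller falls back to a heap-allocated string */
    }
    memset(slot->inl, 0, sizeof slot->inl);
    memcpy(slot->inl, s, len);
    slot->inl[PACKED_MAX] = (unsigned char)len; /* last byte holds the length */
    return 1;
}

size_t str_packed_len(const str_slot *slot) {
    return slot->inl[PACKED_MAX];
}
```

A `char *` taken from `slot->inl` is only valid while the slot itself stays where it is. That is presumably the kind of dangling-pointer trouble the PoC ran into.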
- Oparrays:
- Introduce zend_compilation_unit, combine op_array.literals of all op_arrays from compilation_unit into a single table. [laruence]
- Think about a more compact instruction format (moving zend_op.lineno into separate debug_info data, using a 32-bit handler offset and reducing operand sizes to 16-bit). [dmitry] POSTPONED. A "dirty PoC" showed an insignificant and inconsistent 1% speed improvement; it also gives about a 10% reduction in the number of L1 cache misses.
- Try to make access to run-time-cache cheaper (see usage of private area in Code Sharing among Virtual Machines) [dmitry]
- Change zend_ast_ref structure to use only one allocation, removing dichotomy between heap/arena ASTs. With that might also move away from our special IS_CONSTANT zvals and just use ASTs for everything lazy-evaluated. [dmitry] [DONE] (in PHP-7.3)
- PHP Byte-Code Analysis and Optimization
- Refactor Opcache optimizer code and move it into Zend. [dmitry, laruence]
- Reuse CFG, SSA construction from zend-jit project. [dmitry, laruence] [DONE]
- Reuse type and range inference from zend-jit project. [dmitry, laruence] [DONE]
- Introduce type specialized opcode handlers e.g. ADD_INT_NUMBER, ADD_INT_INT, ADD_INT_INT_NO_OVERFLOW, etc. [dmitry, nikic] [DONE]
- Try additional specialization for frequently executed opcodes (e.g. ASSIGN_NO_RET, DO_FCALL_NO_RET, etc). (VM code size increase doesn't affect performance in negative way) [bwoebi, dmitry] DONE
- Escape analysis [dmitry] [DONE] (in PHP-7.3)
- Narrowing non-escaping HashTables to plain stack allocated arrays: zend_uchar[], zend_long[], double[], zend_string*[]. Implement corresponding specialized opcode handlers: FETCH_DIM_LONG, ASSIGN_DIM_LONG, etc. [nikic]
- Function inlining [nikic, dmitry]
- Try marking functions as compile-time evaluable where possible [instead of individual specialized handlers in compiler]
- Dataflow optimizations [nikic, dmitry]
- Sparse conditional constant propagation [DONE] (in PHP-7.3)
- Dead code elimination (control dependent?) [DONE] (in PHP-7.3)
- Global value numbering (RPO?)
- Copy propagation
- Scalar Replacement of Aggregates [dmitry]
- Tail call elimination? (Should this be only where explicit - could cause debugging pain?)
- Bytecode Executor
- Liveness construction and usage to avoid memory leaks (Bob's idea, the general algorithm requires CFG construction) [dmitry] PR 1634 DONE
- Use main stack for fcalls inside Generators instead of growing a separate stack for each Generator. [bwoebi, dmitry] DONE
- Try to find a way for cheaper exception handling. E.g. reuse C++ exception mechanisms for stack unwinding instead of endless checks for EG(exception) - HHVM already does this. [dmitry]
- Find an efficient replacement for the ZEND_TICK instruction (ideally a single mechanism for interrupt handling; the same mechanism may be reused for signals, coroutine switching, asynchronous functions, etc). [dmitry] DONE
- Eliminate ZEND_FETCH_CLASS instructions accessing self:: parent:: static:: (Bob's idea) [dmitry] PR 1604 DONE
- Think about what we can do to improve the call/return sequence: INIT_FCALL/DO_FCALL/RETURN/leave_helper. Now these opcodes are the main CPU time consumers in real-life apps. [dmitry]
- Pass arguments in reverse order. [bwoebi] PoC POSTPONED. This is a significant complication, but without a large speed improvement.
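The "single mechanism for interrupt handling" idea from the Bytecode Executor items above can be illustrated with a flag that is polled at a few well-chosen points. This is a minimal sketch of the general technique, not the mechanism PHP adopted. `vm_interrupt`, `vm_run` and the other names are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical VM-wide flag. sig_atomic_t makes it safe to set from a signal handler. */
static volatile sig_atomic_t vm_interrupt = 0;
static int interrupts_handled = 0;

/* May be called by a signal handler, a timer, or a scheduler. */
void vm_request_interrupt(void) {
    vm_interrupt = 1;
}

/* Stand-in for the executor loop: each iteration models one jump or call,
 * the only places where the flag is polled. Returns the instruction count. */
int vm_run(int instructions) {
    int executed = 0;
    assert(instructions >= 0);
    while (executed < instructions) {
        executed++;               /* "execute" one instruction */
        if (vm_interrupt) {       /* cheap check instead of a ZEND_TICK opcode */
            vm_interrupt = 0;
            interrupts_handled++; /* dispatch: signal, tick, coroutine switch, ... */
        }
    }
    return executed;
}

int vm_interrupts_handled(void) {
    return interrupts_handled;
}
```

Because the producer only sets a flag, the same check can serve signals, ticks and coroutine switches without emitting extra opcodes into the compiled code.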
- JIT (it doesn't make sense to invest into JIT, if/before significant changes in executor are planned/implemented)? [dmitry]
- Remove resource type in favor of (method-less to start?) objects holding the information
- This would present an opportunity to add new, friendlier method-based APIs for some things (e.g. cURL)
- New Features
- Annotations: <<Name>> or <<Name(Value)>> where Value is a PHP constant or a PHP expression AST. [dmitry] PR 1878 REJECTED
- Asynchronous Functions (need more research)? This isn't really doable at the current point, at least not before more code has non-blocking facilities... Also, streams layer refactor should come first.
- Optional typing of object properties and variables (only if this makes improvement)? [joe, dmitry] REJECTED
- Property accessors? (defeated before, but still a common request)
- Zend autoloader, supporting functions and constants
- Expose AST through extension. (php-ast, astkit) [nikic]
- UString class for Unicode strings, existing work by krakjoe, Andrea, others. Was intended for PHP 7.0 but didn't make it in. (RFC) [Andrea]
- Streams refactoring, a libuv-like implementation could be a worthy goal [weltling]
- cleanup streams with duplicated and complicated implementations across the core
- isolate the actual concrete implementations by platforms (e.g. don't directly use file descriptors, SOCKETs, HANDLEs, etc.)
- create IO wrappers instead of using IO functions directly
- where possible and sensible, move away from POSIX in favour of more up to date techniques (e.g. use epoll syscalls instead of select, use native Windows APIs)
- The aim of this is to create a modern portable IO layer, libuv-like. This serves to improve the quality of the streams-related code, improve IO performance, and prepare better for possible (even pseudo-)concurrency improvements in the core. This improvement can most likely be done incrementally, starting with the generic streams structs, continuing with VCWD and the plain wrapper, then spreading over the remaining code base.
- Continue on datatype cleanups in the core
- use unified datatypes as in 7.0, e.g. move php_int32 to be int32_t, etc. [weltling]
- implement better range checks, partially done in 7.0 Zend/zend_range_check.h [weltling]
- more checks are still needed for ZPP validations
- the continuation of the range checks would be to extend them to the math operations in the pure core, e.g. would x*y overflow the zend_long range? [weltling]
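The question "would x*y overflow zend_long range?" can be answered without performing the multiplication. Below is a portable sketch, assuming a 64-bit build where `int64_t` stands in for zend_long. `long_t`, `mul_overflows` and `checked_mul` are hypothetical names, not part of Zend/zend_range_check.h.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t long_t; /* stand-in for zend_long on a 64-bit build (assumption) */

/* Returns 1 if x * y would fall outside the long_t range, 0 otherwise.
 * Only divisions are used, so the check itself can never overflow. */
int mul_overflows(long_t x, long_t y) {
    if (x == 0 || y == 0) {
        return 0;
    }
    if (x > 0) {
        if (y > 0) {
            return x > INT64_MAX / y; /* positive result too large */
        }
        return y < INT64_MIN / x;     /* negative result too small */
    }
    if (y > 0) {
        return x < INT64_MIN / y;     /* negative result too small */
    }
    return x < INT64_MAX / y;         /* both negative: positive result too large */
}

/* Multiply only when the range check passes. */
long_t checked_mul(long_t x, long_t y) {
    assert(!mul_overflows(x, y));
    return x * y;
}
```

GCC and Clang also offer `__builtin_mul_overflow`, which is usually faster. The division-based form is shown because it works with any C99 compiler.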
- PCRE2
- Switch ext/pcre to use pcre2lib [weltling] [DONE] (in PHP-7.3)
- Build chain
- Deploy transparent PGO trained builds based on trainings with synthetic workloads relevant for real life scenarios. The synthetic workloads will be integrated in the delivered source package. They should reproduce the complexity of WordPress, Drupal or MediaWiki in terms of used extensions and database access. [Intel]
- Investigate possible ways of LTO deployment for getting a better code arrangement in memory based on execution hotness. [Intel]
php-7.1-ideas.txt · Last modified: 2018/05/08 10:38 by dmitry