VM extension API Scratchpad

This is an area where a number of us are brainstorming and developing ideas around the proposal for an improved interface between the PHP Virtual Machine and its extensions. Some of us have worked on the Project Zero implementation of PHP, so we refer to our experiences there.

Assumptions

List of problems to solve

Direct manipulation of Zend Engine internal data structures without using macros

In most cases macros exist to access the ZE data structures such as zvals. By using these macros it is easy to implement the programming interface without matching the layout of the data structures byte for byte. Unfortunately there is also plenty of extension code that accesses data structures such as zvals directly, without using macros.
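The difference between the two access styles can be sketched as follows. This is only an illustration: `demo_value`, `DEMO_LVAL_P` and `DEMO_TYPE_P` are hypothetical stand-ins, not the real zval or Zend macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value layout standing in for a real zval (assumption). */
typedef struct demo_value {
    long lval;   /* integer payload */
    int  type;   /* type tag */
} demo_value;

#define DEMO_TYPE_LONG 1

/* Accessor macros: the only place that knows the field names. */
#define DEMO_LVAL_P(v) ((v)->lval)
#define DEMO_TYPE_P(v) ((v)->type)

/* Portable style: a VM with a different layout only has to redefine the macros. */
long read_long_portable(const demo_value *v)
{
    assert(v != NULL && DEMO_TYPE_P(v) == DEMO_TYPE_LONG);
    return DEMO_LVAL_P(v);
}

/* Non-portable style: the field name is hard-coded in extension code. */
long read_long_direct(const demo_value *v)
{
    return v->lval;
}
```

Both functions return the same value here, but only the first would keep compiling if the structure were replaced by an opaque handle and the macros by calls into the VM.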

Storage Allocation

This is the most pernicious problem of all.

The Zend Engine recognizes two types of memory: memory that is allocated “persistently” (pemalloc), that is to say it can persist from request to request, and memory that is allocated non-persistently (emalloc). Typically this memory is associated with a zval, which participates in the engine garbage collection (a reference counting scheme). If you think about the interaction with an extension, there are really three cases:

  1. Memory that is used only during a single extension function call. At the end of the call the extension retains no references to the memory. Anything that must persist beyond the function call has been stored away inside the VM.
  2. Memory that is allocated from the temporary heap on one extension function call but accessed on a later call, e.g. the extension caches a pointer to the memory across requests in extension global storage or a resource.
  3. Persistent memory that persists from request to request.

Case 2 causes a problem if we assume that we do not want the extensions to participate in the VM garbage collection scheme. There is an example of this in the XML extension function xml_set_object:

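ALLOC_ZVAL(parser->object);        /* allocate a new zval from the non-persistent heap */
*parser->object = **mythis;        /* shallow copy of the zval passed in */
zval_copy_ctor(parser->object);    /* run the copy constructor so the new zval owns its value */
INIT_PZVAL(parser->object);        /* reset the reference count to 1 and clear is_ref */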

The above code creates a new zval and sets a reference to it in an XML parser resource which is passed across requests.

Solution used in Project Zero

Project Zero assumes case 1 for all function calls, i.e. all non-persistent memory is freed at the end of the call. Project Zero had to modify the extension code specifically to remove instances of case 2.
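A minimal sketch of what "case 1 only" means in practice, assuming a per-call heap that tracks every non-persistent block and releases them all when the extension function returns. The names `call_heap`, `call_alloc` and `call_end` are hypothetical and are not taken from the Project Zero code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical block header; the extra members are crude alignment padding. */
typedef union call_block {
    union call_block *next;
    long double pad_ld;
    long long pad_ll;
} call_block;

/* Hypothetical per-call heap: a linked list of every block handed out. */
typedef struct call_heap {
    call_block *head;
    size_t live;   /* number of blocks not yet released */
} call_heap;

/* Allocate non-persistent memory that is valid only for the current call. */
void *call_alloc(call_heap *heap, size_t size)
{
    call_block *b;

    assert(heap != NULL);
    b = malloc(sizeof(call_block) + size);
    if (b == NULL) {
        return NULL;
    }
    b->next = heap->head;
    heap->head = b;
    heap->live++;
    return b + 1;   /* payload starts right after the header */
}

/* Called by the VM when the extension function returns: free everything. */
void call_end(call_heap *heap)
{
    while (heap->head != NULL) {
        call_block *b = heap->head;
        heap->head = b->next;
        free(b);
        heap->live--;
    }
}
```

Under this scheme a pointer cached by the extension between calls (case 2) would dangle after `call_end`, which is why such code has to be rewritten to use persistent memory or to store the value inside the VM.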

Use of HashTable to represent PHP arrays and also as a library utility function.

Solution used in Project Zero

Extension code which manipulates the contents of HashTables directly

The array extension contains a great deal of code that manipulates hash tables directly. The approach taken in Project Zero was to recode this as an “internal extension” in Java.

Pass by reference and return by reference

A simple scheme for dealing with pass by reference is to copy back any changes to the referenced value into the VM at the end of an extension function call. This simple scheme does not cope with scenarios where an extension function returns a reference to a parameter that was passed by reference. Nor does it deal with a situation where a passed reference must be inserted into an array by reference. However, Project Zero has yet to encounter any extension which actually does this (other than the array extension, which we coded internally in Project Zero).
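The copy-back scheme can be sketched roughly as below. All names (`vm_slot`, `ext_fn`, `call_with_copy_back`, `ext_double_all`) are hypothetical, and values are reduced to plain `long` to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ARGS 8

/* Hypothetical VM-owned variable (assumption: values are plain longs). */
typedef struct vm_slot {
    long value;
} vm_slot;

/* Hypothetical extension function signature: it only sees private copies. */
typedef void (*ext_fn)(long *args, size_t nargs);

/* Copy arguments out of the VM, call the extension, then copy back
 * only those arguments that were declared as passed by reference. */
void call_with_copy_back(ext_fn fn, vm_slot **slots, const int *by_ref, size_t nargs)
{
    long local[DEMO_MAX_ARGS];
    size_t i;

    assert(nargs <= DEMO_MAX_ARGS);
    for (i = 0; i < nargs; i++) {
        local[i] = slots[i]->value;
    }
    fn(local, nargs);
    for (i = 0; i < nargs; i++) {
        if (by_ref[i]) {
            slots[i]->value = local[i];
        }
    }
}

/* Sample extension function: doubles every argument it is given. */
void ext_double_all(long *args, size_t nargs)
{
    size_t i;
    for (i = 0; i < nargs; i++) {
        args[i] *= 2;
    }
}
```

Because the extension never holds the VM's own storage, it cannot return a reference to a by-reference parameter or insert one into an array by reference, which is exactly the limitation described above.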

Access to VM global values

Extensions contain a great deal of code which accesses the various globals (EG, CG etc.). Fortunately, so far this code has all been easy to deal with by using macro substitution to replace the accesses with getter/setter calls across the VM interface.
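As a rough illustration of the macro substitution, the sketch below redirects a global access to getter and setter functions. The macro and function names are hypothetical, and the use of a separate macro for writes is an assumption of this sketch, since an assignment through a plain getter would not compile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical VM-side state, hidden behind the interface. */
static int vm_error_reporting = 0;

/* Hypothetical getter/setter exported by the VM. */
int vm_get_error_reporting(void)
{
    return vm_error_reporting;
}

void vm_set_error_reporting(int level)
{
    vm_error_reporting = level;
}

/* Macro substitution: extension source keeps its global-access style,
 * but the preprocessor turns each access into a call across the interface. */
#define DEMO_EG(name)        vm_get_##name()
#define DEMO_EG_SET(name, v) vm_set_##name(v)

/* Sample extension code: silence errors and return the previous level. */
int ext_silence_errors(void)
{
    int old = DEMO_EG(error_reporting);

    DEMO_EG_SET(error_reporting, 0);
    assert(DEMO_EG(error_reporting) == 0);
    return old;
}
```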

Size of the interface and duplication

The existing ZE interface contains a great deal of duplication. There are several methods and macros to do almost everything. The approach taken by Project Zero has been to map these macros to a small set of core APIs which are structured around a table of function pointers similar to JNI.
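A function-pointer table of this kind might look like the sketch below. The table layout, the `api_value` type and all macro names are hypothetical; they only illustrate how several overlapping macros can collapse onto one core entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opaque-ish value type owned by the VM. */
typedef struct api_value {
    long lval;
} api_value;

/* Hypothetical core API table, handed to the extension by the VM. */
typedef struct demo_vm_api {
    long (*get_long)(const api_value *v);
    void (*set_long)(api_value *v, long n);
} demo_vm_api;

/* One possible VM-side implementation of the table entries. */
long impl_get_long(const api_value *v)
{
    return v->lval;
}

void impl_set_long(api_value *v, long n)
{
    v->lval = n;
}

const demo_vm_api demo_api = { impl_get_long, impl_set_long };

/* Two differently named legacy-style macros map to the same table entry. */
#define DEMO_RETVAL_LONG(api, rv, n) ((api)->set_long((rv), (n)))
#define DEMO_SET_LONG(api, v, n)     ((api)->set_long((v), (n)))
#define DEMO_LVAL(api, v)            ((api)->get_long((v)))

/* Sample extension function written only against the macros. */
void ext_add(const demo_vm_api *api, const api_value *a, const api_value *b, api_value *ret)
{
    assert(api != NULL);
    DEMO_RETVAL_LONG(api, ret, DEMO_LVAL(api, a) + DEMO_LVAL(api, b));
}
```

The extension never touches `api_value` fields itself, so a different VM can supply its own table without the extension being changed.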

Extensions which write to $php_errormsg

http://www.php.net/manual/en/reserved.variables.phperrormsg.php

Folks involved in the discussion

We do not want to spam internals with hundreds of emails about our musings as we bounce ideas around, but we do want to do this in the open, hence this page. The folks currently involved in the discussion are as follows. If you are interested in the discussion or feel we are going about things in the wrong way, please contact us: