Remove reliance on Zend API

A better way to provide a C API, with particular emphasis on decoupling extensions from the interpreter.

Introduction

Naturally, this seems insane. Please bear with me.

What's the problem?

Zend API

The Zend API is a large set of functions, macros and data structures which are used to interact with the Zend Engine. It serves three major purposes, roughly in order of importance:

  • Used to write PHP's standard libraries, 3rd party extensions, and much of PECL
  • Allows hot (performance-sensitive) code to be rewritten in C for speed
  • Used to embed PHP within C/C++ applications using the embed SAPI

Problems

The main problem with the Zend API is that it constrains the implementation of the Zend Engine. It creates a tight coupling between the Zend Engine and its users, which greatly restricts our ability to change the engine. By requiring backwards compatibility with the Zend API, we ensure that the Zend Engine can only be modified in minor ways (although ABI changes are allowed for major versions). This holds the Zend Engine to design decisions made nearly 10 years ago, and prevents PHP from getting much faster in the long term.

The Zend API also makes it difficult to write PHP extensions. Most of the API is not terribly difficult to work with, but concepts like copy-on-write, change-on-write sets and separation appear to be tricky for many people. The only documentation is Sara Golemon's book, and the actual code is not well commented. Although zend_parse_parameters has simplified parameter parsing somewhat, I believe that a simpler way of writing extensions would be welcome.

A number of other PHP implementations exist, such as IBM's Project Zero, Phalanger, Roadsend, Quercus and phc. Many of these projects find it very difficult to re-use PHP's standard libraries. Quercus and Roadsend have reimplemented popular standard libraries. Phalanger and Project Zero attempt to re-use the existing libraries by marshalling their data into the Zend API, which appears to be slow and error-prone. phc is designed around reusing the Zend API for compatibility with PHP. Because phc uses the Zend API nearly everywhere, this constrains many of the optimizations it would wish to perform.

What's the solution?

Design Criteria

  • Greatly reduce the coupling between the Zend Engine and its users
  • Support all major use cases of the Zend API
    • preferably simplifying each use case

Solution

Take the use case of wrapping a C library to expose its functionality in user space. The major idea is to “automatically” import C functions into a special namespace. The PHP library functions would then consist of PHP user space code which calls those C functions directly. That way it is possible to craft a user-facing API that is separate from the C implementation.

Let's take a simple example. Assume we have a C library XXX, with 3 functions, x, y and z. We'd like to expose this in user space as a class called MyXXX, with methods a and b. We create a file with the signatures of x, y and z:

extensions/xxx/sigs.h

int x (int, int);
void y (char*, int);
void z (char*, int);

We then write our user space code:

extensions/xxx/MyXXX.php

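class MyXXX
{
    // Name passed to the library calls below.
    private $username;

    function __construct ($username)
    {
        $this->username = $username;
    }

    // Calls x, then hands its result to y.
    function a ($w1, $w2)
    {
        $foo = \internals\XXX\x ($w1, $w2);
        \internals\XXX\y ($this->username, $foo);
    }

    // Calls x, hands its result to z, and returns that result.
    function b ($m1, $m2)
    {
        $foo = \internals\XXX\x ($m1, $m2);
        \internals\XXX\z ($this->username, $foo);
        return $foo;
    }
}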

In order to interface between these two, it will be necessary to have a tool to automatically wrap the C functions. SWIG could be used to create this tool.
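To make the idea of generated interface code more concrete, here is a minimal sketch in C of what such a tool might emit for x and y from sigs.h. Everything apart from the signatures is an assumption made for illustration: the engine-neutral value type, the wrap_x and wrap_y wrappers, the xxx_bindings table and call_internal are hypothetical names, and the bodies of x and y are invented stand-ins for the real library. It is not the output of SWIG or of any existing tool. A wrapper for z would have the same shape as wrap_y.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical engine-neutral value; a real engine would supply its own. */
typedef enum { VAL_NULL, VAL_INT, VAL_STRING } val_type;

typedef struct {
    val_type type;
    union {
        int   i;
        char *s;
    } u;
} value;

/* Stand-ins for library XXX: signatures from sigs.h, bodies invented. */
static int  y_last_code = 0;
static int  x(int a, int b)          { return a + b; }
static void y(char *name, int code)  { (void)name; y_last_code = code; }

/* Per-signature wrapper: check argument types, unpack, call, pack result. */
static int wrap_x(const value *args, size_t argc, value *ret)
{
    if (argc != 2 || args[0].type != VAL_INT || args[1].type != VAL_INT)
        return -1;                       /* argument mismatch */
    ret->type = VAL_INT;
    ret->u.i  = x(args[0].u.i, args[1].u.i);
    return 0;
}

static int wrap_y(const value *args, size_t argc, value *ret)
{
    if (argc != 2 || args[0].type != VAL_STRING || args[1].type != VAL_INT)
        return -1;
    y(args[0].u.s, args[1].u.i);
    ret->type = VAL_NULL;                /* a void function yields null */
    return 0;
}

typedef int (*wrapper_fn)(const value *, size_t, value *);

typedef struct {
    const char *name;                    /* name as seen from user space */
    wrapper_fn  fn;
} binding;

static const binding xxx_bindings[] = {
    { "internals\\XXX\\x", wrap_x },
    { "internals\\XXX\\y", wrap_y },
};

/* Look up a binding by name and dispatch to it.
   Returns 0 on success, -1 for an unknown name or mismatched arguments. */
int call_internal(const char *name, const value *args, size_t argc, value *ret)
{
    size_t n = sizeof xxx_bindings / sizeof xxx_bindings[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(xxx_bindings[i].name, name) == 0)
            return xxx_bindings[i].fn(args, argc, ret);
    }
    return -1;
}
```

In this sketch the interpreter only needs to know the binding table and the value type, so the library itself never sees engine internals. Whether a real engine could keep the value type this small is exactly the kind of question the prototype would have to answer.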

Zend Engine

Since the libraries would no longer use the Zend API, the tight coupling would be broken. It would then be possible to change major parts of the Zend Engine without affecting the operation of any other part of PHP.

Extensions/PECL

It would no longer be necessary to know the Zend API to write extensions. Instead, only knowledge of the C library's API is necessary, and the interface can be created in PHP user code.

Embed SAPI

The same interface used for libraries can be used to handle many of the use cases of the C API. However, it is likely that a means to call PHP user code from C/C++ code will be required.
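One possible shape for calling user code from C is a small table of function pointers that the engine hands to the host application. The sketch below is purely illustrative: the engine struct, call_user, host_sum_via_php and the mock engine are hypothetical names, the value type is reduced to a single integer, and the mock stands in for a user space function add($a, $b). No such interface exists in PHP today.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical engine-neutral value, integer-only to keep the sketch small. */
typedef struct { int i; } value;

/* Hypothetical interface an engine would hand to embedding C code. */
typedef struct engine engine;
struct engine {
    /* Call the user space function `name`; 0 on success, -1 on failure. */
    int (*call_user)(engine *self, const char *name,
                     const value *args, size_t argc, value *ret);
    void *state;                 /* engine-private data, opaque to the host */
};

/* Host code: uses only the interface above, never engine internals. */
int host_sum_via_php(engine *e, int a, int b, int *out)
{
    value args[2], ret;
    args[0].i = a;
    args[1].i = b;
    if (e->call_user(e, "add", args, 2, &ret) != 0)
        return -1;
    *out = ret.i;
    return 0;
}

/* Mock engine so the sketch can run; it pretends that user code
   defined `function add($a, $b) { return $a + $b; }`. */
static int mock_call_user(engine *self, const char *name,
                          const value *args, size_t argc, value *ret)
{
    (void)self;
    if (strcmp(name, "add") != 0 || argc != 2)
        return -1;
    ret->i = args[0].i + args[1].i;
    return 0;
}

engine make_mock_engine(void)
{
    engine e;
    e.call_user = mock_call_user;
    e.state = NULL;
    return e;
}
```

Because the host sees only function pointers and an opaque state pointer, a different interpreter could in principle supply its own implementation of the same table.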

Other PHP implementations

Since PHP extensions would no longer be written against the Zend API, other PHP implementations, such as Roadsend, Project Zero, Phalanger and Quercus, should be able to reuse the libraries without difficulty. In addition, if the coupling between the interpreter and its components is simple enough, it may be possible for other implementations to be slotted in directly. However, though this would be a nice side-effect, it should probably not be considered a priority.

Project Plan

This is a simple design. In reality, it would need to be prototyped to determine whether it makes sense for every use case, and whether little would have to be sacrificed to make it work. The work on it should probably progress in roughly the following order:

  • Prototype a single library
    • perhaps readline?
    • Manually write interface code between the header and the PHP code.
  • Discuss requirements with other PHP implementations
  • Write a utility to generate the interface code automatically
    • Using SWIG?
  • Test 5 or 6 libraries
  • Test more complicated functionality
  • Convert entire set of PHP extensions

Naturally, before the last step it will be necessary to get consensus from other internals developers that this is a good idea. It would be worthwhile to produce a document discussing the experience so far.

rfc/remove_zend_api.1238453128.txt.gz · Last modified: 2017/09/22 13:28 (external edit)