
This is a bottom-up approach to Zend Engine 2 objects.

Despite the wording, this document is not a specification; it results from analyzing the PHP implementation. Since it attempts to extract general rules from specific code snippets, it may contain wrong inferences. If you find errors, correct them (requires permission to edit the wiki) or send an e-mail to glopes ~at~ nebm.ist.utl.pt.

Definitions

We'll deal with three separate entities here: references, objects and classes. Classes represent a type and define the behavior of all the objects (the instances of that class) of that type. Each object will typically have one or more references to it. These are abstract concepts; in terms of implementation in PHP, references are mapped to object zvals, and the two terms are here used interchangeably.

Internally, the word object is overloaded to also mean “references” -- to avoid confusion, I shall not use the term object to refer to references, reserving it for the objects themselves (either the concept or the storage of their data in memory). Additionally, the term type may mean having a certain handler table. Two classes may have different behavior (different methods, etc.) and still share the same handler table. Finally, the term reference, when applied to a zval, may also mean the zval is part of a reference set (Z_ISREF) -- context should resolve the ambiguity.

The object zval

A zval that is an object reference will have the type IS_OBJECT. In this case, its value, which can be retrieved with Z_OBJVAL, will be of type zend_object_value, which is defined like this:

typedef struct _zend_object_value {
    zend_object_handle handle;        /* retrieve with Z_OBJ_HANDLE(zval) */
    zend_object_handlers *handlers;   /* retrieve with Z_OBJ_HT(zval) */
} zend_object_value;

The field handle (which can be retrieved with the Z_OBJ_HANDLE macro family) identifies the object to which the reference refers among those of the same type; zend_object_handle is actually just an integer -- that is, an object is uniquely identified by an integer and a zend_object_handlers structure. Consequently, two references are identical in the sense of the === operator if and only if they share both the handlers and the handle.
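
The identity rule above can be sketched in plain C. This is a minimal model, not engine code: the `model_*` types and the function `model_refs_identical` are hypothetical stand-ins for zend_object_value and the comparison the engine performs, introduced only so the sketch compiles without the Zend headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Zend types; not the real definitions. */
typedef unsigned int model_object_handle;
typedef struct model_object_handlers { int unused; } model_object_handlers;

typedef struct model_object_value {
    model_object_handle handle;             /* identifies the object within its type */
    const model_object_handlers *handlers;  /* identifies the type (handler table) */
} model_object_value;

/* Two references are identical only if both the handle and the
 * handler table pointer match. */
static bool model_refs_identical(model_object_value a, model_object_value b)
{
    assert(a.handlers != NULL && b.handlers != NULL);
    return a.handle == b.handle && a.handlers == b.handlers;
}
```

In this model, the same handle under two different handler tables names two different objects, which is why the handle alone is not enough to establish identity.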

The handlers field has another purpose besides identifying the referred object. The structure it points to (the handler table) defines, at a low level, the behavior of the objects of that type. A specific handler function can be accessed with Z_OBJ_HANDLER(zval, hf).

The lifecycle of the references is associated with that of the objects. We likely want the object to be destroyed once there are no more references to it. The handler table has two entries for that purpose: add_ref and del_ref. The first should be called when a new reference to that object is created; the second when one is deleted. Functions like zval_copy_ctor, zval_ptr_dtor and zval_dtor take care of calling these handlers. So there are two types of refcounts one should have in mind when dealing with objects -- those of references (i.e., the zvals) and those of the objects themselves.
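
To make the second kind of refcount concrete, here is a small self-contained model of an object whose lifetime is driven by add_ref and del_ref calls. Everything named `model_*` is hypothetical: `model_ref_copy` and `model_ref_dtor` only play the role that functions such as zval_copy_ctor and zval_ptr_dtor play in the engine, and the `destroyed` flag stands in for actually freeing the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an object with its own refcount. */
typedef struct model_object {
    int refcount;   /* number of references pointing at this object */
    int destroyed;  /* set once the last reference is gone */
} model_object;

/* Hypothetical model of a reference (an object zval). */
typedef struct model_ref {
    model_object *obj;
} model_ref;

static void model_add_ref(model_ref *ref)
{
    ref->obj->refcount++;
}

static void model_del_ref(model_ref *ref)
{
    assert(ref->obj->refcount > 0);
    if (--ref->obj->refcount == 0) {
        ref->obj->destroyed = 1;  /* a real store would free the object here */
    }
}

/* Copying a reference creates a new reference to the same object,
 * so the add_ref handler is called. */
static model_ref model_ref_copy(model_ref *src)
{
    model_ref dst = *src;
    model_add_ref(&dst);
    return dst;
}

/* Destroying a reference calls the del_ref handler. */
static void model_ref_dtor(model_ref *ref)
{
    model_del_ref(ref);
    ref->obj = NULL;
}
```

The object in this sketch outlives any single reference and is marked destroyed only when the last one is released.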

The handler table

Let's now explore the members of the handler table, which define the behavior of a class at a low level. Once we introduce the zend standard object, we'll see what their default values are (TODO).

typedef struct _zend_object_handlers {
    /* general object functions */
    zend_object_add_ref_t              add_ref;
    zend_object_del_ref_t              del_ref;
    zend_object_clone_obj_t            clone_obj;
    /* individual object functions */
    zend_object_read_property_t        read_property;
    zend_object_write_property_t       write_property;
    zend_object_read_dimension_t       read_dimension;
    zend_object_write_dimension_t      write_dimension;
    zend_object_get_property_ptr_ptr_t get_property_ptr_ptr;
    zend_object_get_t                  get;
    zend_object_set_t                  set;
    zend_object_has_property_t         has_property;
    zend_object_unset_property_t       unset_property;
    zend_object_has_dimension_t        has_dimension;
    zend_object_unset_dimension_t      unset_dimension;
    zend_object_get_properties_t       get_properties;
    zend_object_get_method_t           get_method;
    zend_object_call_method_t          call_method;
    zend_object_get_constructor_t      get_constructor;
    zend_object_get_class_entry_t      get_class_entry;
    zend_object_get_class_name_t       get_class_name;
    zend_object_compare_t              compare_objects;
    zend_object_cast_t                 cast_object;
    zend_object_count_elements_t       count_elements;
    zend_object_get_debug_info_t       get_debug_info;
    zend_object_get_closure_t          get_closure;
} zend_object_handlers;

Except where indicated, the arguments are guaranteed not to be null pointers.

add_ref

void (*add_ref)(zval *object TSRMLS_DC)

del_ref

void (*del_ref)(zval *object TSRMLS_DC)

clone_obj

zend_object_value (*clone_obj)(zval *object TSRMLS_DC)

read_property

zval *(*read_property)(zval *object, zval *member, int type TSRMLS_DC)
/* var status for backpatching */
#define BP_VAR_R          0  /* read */
#define BP_VAR_W          1  /* write */
#define BP_VAR_RW         2  /* read/write */
#define BP_VAR_IS         3  /* check for existence */
#define BP_VAR_NA         4  /* if not applicable; unused? */
#define BP_VAR_FUNC_ARG   5  /* function argument */
#define BP_VAR_UNSET      6  /* unset */
zval *read_property_empty_implementation(zval *object, zval *member, int type TSRMLS_DC)
{
	/* maybe raise an error/exception here */
	return EG(uninitialized_zval_ptr);
}
  1. For pre- and post-increments and -decrements on properties, there is a pair of read_property/write_property calls; the read_property has type BP_VAR_R. For the same operations on dimensions, there is only a read_dimension call with type BP_VAR_RW; the operation can succeed only if a reference (Z_ISREF) or a proxy object is returned. Note that compound assignments on dimensions (e.g. $obj['index'] += 1) are handled with a pair of operations, just like properties.
  2. If read_property returns a zval with refcount 1 not belonging to a reference set in the context of a write-like operation (see discussion of the type argument above), this zval will be turned into a reference. In the case of read_dimension, a notice would be emitted and a reference set with the left part of the assignment and the dimension would not be built. Note that this special “turn into ref” case would not work if the returned zval had a higher refcount. Consider the following implementations:
zval *z;
zval *read_property(zval *object, zval *offset, int type TSRMLS_DC)
{
    return z;
}
void write_property(zval *object, zval *offset, zval *value TSRMLS_DC)
{
    z = value;
    /* Z_SET_ISREF_P(z); -- if uncommented, both would work*/
    zval_add_ref(&value);
}

Then, these two scripts would have different results:

// Script 1
$obj['prop'] = "hhh";
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "bbb"

// Script 2
$str = "hhh";
$obj['prop'] = $str;
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "hhh"

write_property

void (*write_property)(zval *object, zval *member, zval *value TSRMLS_DC)

read_dimension

zval *(*read_dimension)(zval *object, zval *offset, int type TSRMLS_DC)

write_dimension

void (*write_dimension)(zval *object, zval *offset, zval *value TSRMLS_DC)

get_property_ptr_ptr

zval **(*get_property_ptr_ptr)(zval *object, zval *member TSRMLS_DC)

get

zval* (*get)(zval *object TSRMLS_DC)
$a = &$obj->prop;
$a++;
$a = 6;

set

void (*set)(zval **object, zval *value TSRMLS_DC)

has_property

int (*has_property)(zval *object, zval *member, int has_set_exists TSRMLS_DC)

unset_property

void (*unset_property)(zval *object, zval *member TSRMLS_DC)

has_dimension

int (*has_dimension)(zval *object, zval *member, int check_empty TSRMLS_DC)

unset_dimension

void (*unset_dimension)(zval *object, zval *offset TSRMLS_DC)

get_properties

HashTable *(*get_properties)(zval *object TSRMLS_DC)

get_method

zend_function *(*get_method)(zval **object_ptr, char *method, int method_len TSRMLS_DC)

call_method

int (*call_method)(char *method, INTERNAL_FUNCTION_PARAMETERS)

get_constructor

zend_function *(*get_constructor)(zval *object TSRMLS_DC)

get_class_entry

zend_class_entry *(*get_class_entry)(const zval *object TSRMLS_DC)

get_class_name

int (*get_class_name)(const zval *object, char **class_name, zend_uint *class_name_len, int parent TSRMLS_DC)

compare_objects

int (*compare_objects)(zval *object1, zval *object2 TSRMLS_DC)

cast_object

int (*cast_object)(zval *readobj, zval *retval, int type TSRMLS_DC)

count_elements

int (*count_elements)(zval *object, long *count TSRMLS_DC)

get_debug_info

HashTable *(*get_debug_info)(zval *object, int *is_temp TSRMLS_DC)

get_closure

int (*get_closure)(zval *obj, zend_class_entry **ce_ptr, zend_function **fptr_ptr, zval **zobj_ptr TSRMLS_DC)

The zend_class_entry

TODO

Default handlers

TODO

PHP internal class declaration

Let's assume all the internal functions and custom object handlers are written. A PHP class declaration can then be divided into these tasks (some may be omitted):

We'll cover these items and then address the question of how to properly define a class so that it can be extended in userspace.

zend_function_entry array

The zend_function_entry structure contains the name of the method, a pointer to the (native) function that implements it, arginfo (describing arginfo structures is out of the scope of this text), and some flags for the method. The array is traditionally declared as a static global variable. Its purpose is to group and qualify the functions so that they can be converted to zend_function structures.

The array is terminated with a zeroed structure. Several macros exist for declaring the zend_function_entry structures. The most important are:

PHP_ME(classname, name, arg_info, flags)
PHP_MALIAS(classname, name, alias, arg_info, flags)
PHP_NAMED_ME(zend_name, name, arg_info, flags)
PHP_ME_MAPPING(name, func_name, arg_types, flags)

The standard way to declare a method is to use PHP_ME. It takes, in this order:

The macro PHP_ME can be used when the method implementation was declared in a standard way, i.e., with PHP_METHOD(classname, name).

The bitmask is built with the ZEND_ACC family of macros. Let's see the relevant part of the family. Some are used only for classes or properties, not methods:

#define ZEND_ACC_STATIC                     0x01     /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_ABSTRACT                   0x02     /* fn_flags */
#define ZEND_ACC_FINAL                      0x04     /* fn_flags */
#define ZEND_ACC_IMPLEMENTED_ABSTRACT       0x08     /* fn_flags */
#define ZEND_ACC_IMPLICIT_ABSTRACT_CLASS    0x10     /* ce_flags */
#define ZEND_ACC_EXPLICIT_ABSTRACT_CLASS    0x20     /* ce_flags */
#define ZEND_ACC_FINAL_CLASS                0x40     /* ce_flags */
#define ZEND_ACC_INTERFACE                  0x80     /* ce_flags */
#define ZEND_ACC_INTERACTIVE                0x10     /* fn_flags */
#define ZEND_ACC_PUBLIC                     0x100    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PROTECTED                  0x200    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PRIVATE                    0x400    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PPP_MASK \
    (ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED | ZEND_ACC_PRIVATE)
#define ZEND_ACC_CHANGED                    0x800    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_IMPLICIT_PUBLIC            0x1000   /* zend_property_info.flags; unused (1) */
#define ZEND_ACC_CTOR                       0x2000   /* fn_flags */
#define ZEND_ACC_DTOR                       0x4000   /* fn_flags */
#define ZEND_ACC_CLONE                      0x8000   /* fn_flags */
#define ZEND_ACC_ALLOW_STATIC               0x10000  /* fn_flags */
#define ZEND_ACC_SHADOW                     0x20000  /* fn_flags */
#define ZEND_ACC_DEPRECATED                 0x40000  /* fn_flags */
#define ZEND_ACC_CLOSURE                    0x100000 /* fn_flags */
#define ZEND_ACC_CALL_VIA_HANDLER           0x200000 /* fn_flags */
 
/* (1) ZEND_ACC_IMPLICIT_PUBLIC is unused since zend_do_declare_implicit_property is ifdef'd out */
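
As a quick illustration of how such a bitmask is combined and inspected, the sketch below redefines a few of the values from the listing so that it compiles without the Zend headers. The helpers `flags_visibility` and `flags_is_static` are hypothetical names used only for this example; they are not part of the Zend API.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the listing above; redefined locally so the
 * sketch is self-contained. */
#define ZEND_ACC_STATIC     0x01
#define ZEND_ACC_FINAL      0x04
#define ZEND_ACC_PUBLIC     0x100
#define ZEND_ACC_PROTECTED  0x200
#define ZEND_ACC_PRIVATE    0x400
#define ZEND_ACC_PPP_MASK \
    (ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED | ZEND_ACC_PRIVATE)

/* Hypothetical helper: keep only the visibility bits of a flag set. */
static unsigned int flags_visibility(unsigned int flags)
{
    unsigned int v = flags & ZEND_ACC_PPP_MASK;
    /* at most one visibility bit is expected to be set */
    assert((v & (v - 1)) == 0);
    return v;
}

/* Hypothetical helper: test a single flag. */
static bool flags_is_static(unsigned int flags)
{
    return (flags & ZEND_ACC_STATIC) != 0;
}
```

A public static final method would carry ZEND_ACC_PUBLIC | ZEND_ACC_STATIC | ZEND_ACC_FINAL; masking with ZEND_ACC_PPP_MASK recovers just the visibility part.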

These apply to methods:

These also apply to methods, but you needn't include them in your function entries:

The macro PHP_MALIAS(classname, name, alias, arg_info, flags) allows you to declare a method with a name and expose another name in user space, e.g.:

PHP_METHOD(myclass, mymethod) { ... }
...
PHP_MALIAS(myclass, myMethodInUserspace, mymethod, NULL, 0) /* no trailing comma: the macro expansion already ends with one */
...

The macro PHP_NAMED_ME(zend_name, name, arg_info, flags) goes lower: zend_name is the name exposed in userspace and name is the actual name of the C function you implemented, e.g.:

ZEND_NAMED_FUNCTION(my_arbitrary_name) { ... }
/* this resolves to void my_arbitrary_name(INTERNAL_FUNCTION_PARAMETERS) { } */
...
/* first argument: userspace name (it is stringified); second: the C function */
PHP_NAMED_ME(myMethodInUserspace, my_arbitrary_name, NULL, 0)
...

Finally, PHP_ME_MAPPING(name, func_name, arg_types, flags) is usually used to expose methods that also have a non-OOP interface, e.g.:

PHP_FUNCTION(ext_func) {
    zval *this;
 
    /* Use zend_parse_method_parameters to parse parameters in double interface implementations */
    if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis(), "O",
            &this, ext_class_ce_ptr) == FAILURE) {
        return;
    }
 
    /* alternative with plain zend_parse_parameters */
    /* this = getThis();
     * if (this == NULL) {
     *     if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O",
     *             &this, ext_class_ce_ptr) == FAILURE) {
     *         return;
     *     }
     * }
     * else if (zend_parse_parameters_none() == FAILURE) {
     *     return;
     * }
     */
}
...
static zend_function_entry ext_functions[] = {
    PHP_FE(ext_func, NULL)
    ...
    {NULL, NULL, NULL, 0, 0}
};
...
static zend_function_entry ext_class_methods[] = {
    PHP_ME_MAPPING(myMethodName, ext_func, NULL, 0)
    ...
    {NULL, NULL, NULL, 0, 0}
};

Namespaced names can be built using ZEND_NS_NAME(namespace, name). Don't prefix the namespace with \.

Setup the Handlers Table

When one is implementing internal PHP classes, it is almost always undesirable to replace all of the standard handlers. The usual procedure for overriding the handlers follows these steps:

The handler table should not be initialized with initializer lists. Otherwise, when new handlers are added to the language, they will take the value NULL instead of the default value.

Example:

static zend_object_handlers myclass_handlers;
...
static HashTable *myclass_object_debug_info(zval *object, int *is_temp TSRMLS_DC)
{
    ...
}
static int myclass_object_compare_objects(zval *object1, zval *object2 TSRMLS_DC)
{
    ...
}
static zend_object_value myclass_object_clone(zval *object TSRMLS_DC)
{
    ...
}
...
ZEND_MODULE_STARTUP_D(myext)
{
    ...
    memcpy(&myclass_handlers, zend_get_std_object_handlers(),
        sizeof myclass_handlers);
 
    myclass_handlers.get_debug_info  = myclass_object_debug_info;
    myclass_handlers.compare_objects = myclass_object_compare_objects;
    myclass_handlers.clone_obj       = myclass_object_clone;
    ...
}

Beware of the clone_obj handler

Most likely, the default clone_obj handler will have to be replaced, because it assumes the create_object class entry handler has not been replaced and directly allocates a standard zend_object data structure. The custom fields added to the object structure are therefore not merely left uninitialized (the default clone handler only knows how to do a shallow copy of the standard properties); they do not even exist, because only a plain zend_object is allocated.

You may also choose not to support the clone operation by setting the clone_obj handler to NULL.
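
The allocation problem described in this section can be seen in a small stand-alone model. The `model_*` types below are hypothetical stand-ins (model_std_object for zend_object, model_custom_object for an extension's own structure), and the two clone functions only imitate the behavior described above; they are not the engine's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for zend_object. */
typedef struct model_std_object {
    int num_properties;
} model_std_object;

/* Custom object: the standard part comes first, followed by
 * extension-specific fields. */
typedef struct model_custom_object {
    model_std_object std;
    int custom_field;
} model_custom_object;

/* Models the default clone: it allocates and copies only the standard
 * part, so custom_field has no storage in the result. */
static model_std_object *model_default_clone(const model_std_object *src)
{
    model_std_object *dst;

    assert(src != NULL);
    dst = malloc(sizeof *dst);
    if (dst != NULL) {
        *dst = *src;
    }
    return dst;
}

/* Models a replacement clone handler that knows the real layout. */
static model_custom_object *model_custom_clone(const model_custom_object *src)
{
    model_custom_object *dst;

    assert(src != NULL);
    dst = malloc(sizeof *dst);
    if (dst != NULL) {
        memcpy(dst, src, sizeof *dst);
    }
    return dst;
}
```

Treating the result of model_default_clone as a model_custom_object would read past the end of the allocation, which is the model's counterpart of the missing custom fields.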

Initialization of the Class Entry

Before registering the class, it's necessary to define and initialize a class entry structure. This structure is temporary -- upon class registration, a new class entry structure is allocated. The initialization is done with one of these macros:

INIT_CLASS_ENTRY(class_container, class_name, functions)
INIT_NS_CLASS_ENTRY(class_container, ns, class_name, functions)
INIT_CLASS_ENTRY_EX(class_container, class_name, class_name_len, functions)
INIT_OVERLOADED_CLASS_ENTRY(class_container, class_name, functions, handle_fcall,
    handle_propget, handle_propset)
INIT_OVERLOADED_NS_CLASS_ENTRY(class_container, ns, class_name, functions,
    handle_fcall, handle_propget, handle_propset)
INIT_OVERLOADED_CLASS_ENTRY_EX(class_container, class_name, class_name_len,
    functions, handle_fcall, handle_propget, handle_propset, handle_propunset,
    handle_propisset)
INIT_OVERLOADED_NS_CLASS_ENTRY_EX(class_container, ns, class_name, functions,
    handle_fcall, handle_propget, handle_propset, handle_propunset,
    handle_propisset)

This is the meaning of the parameters:

Example:

static zend_function_entry myclass_functions[] = {
    PHP_ME(myclass, myMethod, NULL, 0)
    ...
    {NULL, NULL, NULL, 0, 0}
};
...
ZEND_MODULE_STARTUP_D(myext)
{
    zend_class_entry ce;
    ...
 
    INIT_CLASS_ENTRY(ce, "MyClass", myclass_functions);
    ...
}

Do not further modify the class entry. Other modifications should be made in the class entry returned upon class registration. In particular, setting the class flags (e.g. final) at this point will not work.

Class Registration

The registration step serves two purposes:

  1. Automates certain aspects of the class definition (the zend_class_entry structure)
  2. Exposes the class to userspace.

The following functions/macros are available:

/* Functions */
zend_class_entry *zend_register_internal_class(zend_class_entry *class_entry TSRMLS_DC)
zend_class_entry *zend_register_internal_class_ex(zend_class_entry *class_entry,
    zend_class_entry *parent_ce, char *parent_name TSRMLS_DC)
zend_class_entry *zend_register_internal_interface(
    zend_class_entry *orig_class_entry TSRMLS_DC)
int zend_register_class_alias_ex(const char *name, int name_len,
    zend_class_entry *ce TSRMLS_DC)
/* Macros (they expand to zend_register_class_alias_ex, so return an int) */
zend_register_class_alias(name, ce)
zend_register_ns_class_alias(ns, name, ce)

The functions/macros with “alias” in their name only expose the class to userspace; they do not change the class entry in any way. In general, these should be used only if the class entry pointed to by the argument was previously created with a zend_register_* function.

The function zend_register_internal_class_ex should be used when defining a subclass. If parent_ce is given, the corresponding class will be used as the parent. If it is NULL and parent_name is not NULL, the given superclass name will be resolved. If both are NULL, it will behave like zend_register_internal_class.

The zend_register_internal_* functions execute these steps:

  1. Allocate a new class entry structure.
  2. Copy (in a shallow fashion) the passed class entry into the allocated one.
  3. Set its type to ZEND_INTERNAL_CLASS.
  4. Initialize the new class entry structure by allocating and initializing its hash tables and resetting a few “scalar” fields (the magic methods set in the original class entry through INIT_OVERLOADED_CLASS_ENTRY_EX are not replaced).
  5. Set the flags of the class entry to none or to ZEND_ACC_INTERFACE (according to the function called).
  6. Convert the zend_function_entry structures into zend_function's of type ZEND_INTERNAL_FUNCTION. These functions are added to the class entry's function table. If methods are found whose names match those of magic methods, the corresponding class entry fields are set.
  7. If a parent is given, execute operations related to inheritance, e.g. copying inherited functions from the parent.

After class registration, the original zend_class_entry variable should not be used anymore.

After registration, it's also possible to retrieve the zend_class_entry variable through the class name. This is done with zend_lookup_class:

int zend_lookup_class(const char *name, int name_length, zend_class_entry ***ce TSRMLS_DC);

Notice the triple (not double) indirection. In practice, most extensions opt to use a global variable (and even export it for other extensions through the macro ZEND_API) so as to avoid the performance penalty associated with the function call/hash table lookup.
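
One way to picture the triple indirection is a table whose slots hold pointers to class entries, with the lookup handing back the address of the matching slot through an out parameter. The sketch below is only a model under that assumption: `model_class_entry`, `model_class_table` and `model_lookup_class` are hypothetical, the table is a plain array rather than a hash table, and names are compared with strcmp with no case folding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for zend_class_entry. */
typedef struct model_class_entry {
    const char *name;
} model_class_entry;

/* The table stores pointers to class entries (one '*'). */
#define MODEL_TABLE_SIZE 4
static model_class_entry *model_class_table[MODEL_TABLE_SIZE];

/* The lookup returns the address of the slot (two '*') through an
 * out parameter, which adds the third '*'. */
static int model_lookup_class(const char *name, model_class_entry ***ce)
{
    size_t i;

    assert(name != NULL && ce != NULL);
    for (i = 0; i < MODEL_TABLE_SIZE; i++) {
        if (model_class_table[i] != NULL &&
                strcmp(model_class_table[i]->name, name) == 0) {
            *ce = &model_class_table[i];
            return 0;   /* stands in for SUCCESS */
        }
    }
    return -1;          /* stands in for FAILURE */
}
```

A caller declares a `model_class_entry **pce`, passes `&pce`, and dereferences once (`*pce`) to reach the class entry itself.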

Properties and Constants

In the module startup, after the PHP class is registered, it is time to add constants and properties.

For constants, the Zend API exposes the following functions:

int zend_declare_class_constant(zend_class_entry *ce, const char *name,
    size_t name_length, zval *value TSRMLS_DC)
int zend_declare_class_constant_null(zend_class_entry *ce, const char *name,
    size_t name_length TSRMLS_DC)
int zend_declare_class_constant_long(zend_class_entry *ce, const char *name,
    size_t name_length, long value TSRMLS_DC)
int zend_declare_class_constant_bool(zend_class_entry *ce, const char *name,
    size_t name_length, zend_bool value TSRMLS_DC)
int zend_declare_class_constant_double(zend_class_entry *ce, const char *name,
    size_t name_length, double value TSRMLS_DC)
int zend_declare_class_constant_stringl(zend_class_entry *ce, const char *name,
    size_t name_length, const char *value, size_t value_length TSRMLS_DC)
int zend_declare_class_constant_string(zend_class_entry *ce, const char *name,
    size_t name_length, const char *value TSRMLS_DC)

These all return SUCCESS or FAILURE. They are all straightforward to use, with the exception of zend_declare_class_constant: the zval passed to it should be allocated with ALLOC_PERMANENT_ZVAL, then initialized and set to the intended value.

For properties, the following functions are available:

int zend_declare_property(zend_class_entry *ce, char *name, int name_length,
    zval *property, int access_type TSRMLS_DC)
int zend_declare_property_ex(zend_class_entry *ce, const char *name,
    int name_length, zval *property, int access_type, char *doc_comment,
    int doc_comment_len TSRMLS_DC)
int zend_declare_property_null(zend_class_entry *ce, char *name,
    int name_length, int access_type TSRMLS_DC)
int zend_declare_property_bool(zend_class_entry *ce, char *name,
    int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_long(zend_class_entry *ce, char *name,
    int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_double(zend_class_entry *ce, char *name,
    int name_length, double value, int access_type TSRMLS_DC);
int zend_declare_property_string(zend_class_entry *ce, char *name,
    int name_length, char *value, int access_type TSRMLS_DC);
int zend_declare_property_stringl(zend_class_entry *ce, char *name,
     int name_length, char *value, int value_len, int access_type TSRMLS_DC)

These are analogous to their zend_declare_class_constant* counterparts, with the following differences:

The access type flags are taken from the ZEND_ACC_* family. See the zend_function_entry array section.

These apply to properties and can be set by the extension programmer:

These are used internally and should not be passed to the functions above:

Note that zend_declare_property(_ex) also require a zval allocated with ALLOC_PERMANENT_ZVAL.

Note also that interfaces cannot have properties and access level cannot be decreased in subclasses.

Other class definition tweaks

The class entry structure can be changed in other ways after registration.

See also iterators and serialization callbacks.

Create object handler

Almost all internal classes will want to replace the class entry's create_object handler in order to be able to store arbitrary data in the object's data structure. See the section Data allocation and initialization for more on this.

Class flags

Class flags use the ZEND_ACC_* macro family. See the zend_function_entry array section. At this point, the class may already have the flag ZEND_ACC_INTERFACE if you called zend_register_internal_interface.

These can be set after class registration:

These are automatically set by the engine and should not be set by the programmer:

Implement interfaces

A class may declare it implements one or more interfaces by calling the function:

void zend_class_implements(zend_class_entry *class_entry TSRMLS_DC,
    int num_interfaces, ...)

The ellipsis represents one or more zend_class_entry * variables that point to the class entries of the interfaces to be implemented.

Designing subclassable classes

Designing internal classes so that they can be extended in userspace is simple. The subclass will have the same class entry create_object handler as the internal class, not the default one, which sets the standard object handlers and allocates a plain zend_object. Therefore, if the internal class has a different handler table or its storage is a different data structure, that will not be a problem.

The only thing about which one must be careful is constructors. Subclasses may define a new constructor that does not call the parent constructor. If the internal class relies on the constructor to set a consistent internal state, it can be changed in the following alternative ways:

Often, the internal constructor requires several arguments to be passed. The constructor for the subclass may be defined so that it takes fewer or different arguments. This is clearly a problem that cannot be handled by the first approach. The second one can at least fail if too few or bad arguments are given, but even that's not a very good idea, because the arguments, even if apparently correct, may have different semantics. In sum, if the construction requires arguments, there is no good solution except requiring the super constructor to be called. This can be accomplished as follows:

static zend_object_handlers object_handlers;
static zend_class_entry *ce_ptr;
static zend_function constr_wrapper_fun;
 
typedef struct test_object {
    zend_object std;
    zend_bool constructed; /* TestClass constructor was called? */
    /* more properties follow */
    ...
} test_object;
 
static zend_object_value ce_create_object(zend_class_entry *class_type TSRMLS_DC)
{
    zend_object_value zov;
    test_object       *tobj;
 
    tobj = emalloc(sizeof *tobj);
    zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);
    tobj->constructed = 0;
 
#if PHP_VERSION_ID < 50399
    zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
        (copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
    object_properties_init(&tobj->std, class_type);
#endif
 
    zov.handle = zend_objects_store_put(tobj,
        (zend_objects_store_dtor_t) zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) zend_objects_free_object_storage,
        NULL TSRMLS_CC);
    zov.handlers = &object_handlers;
    return zov;
}
 
PHP_METHOD(testclass, __construct)
{
    zval *this = getThis();
 
    test_object *tobj = zend_object_store_get_object(this TSRMLS_CC);
    assert(tobj != NULL);
 
    tobj->constructed = (zend_bool) 1;
 
    ...
 
    /* if there's an error that leaves the object in an invalid state and
     * you have to throw an exception, also destroy the $this reference.
     * The reason is that the exception may be caught in the constructor
     * of the child class that's calling this constructor. */
 
    if (bad_thing_happened()) {
        /* Destroying only the $this reference will cause the object to leak;
         * it will be destroyed on request shutdown, but you can prevent that
         * by also destroying the object with:
         * zval_dtor(this);
         * But beware this will call both the destroy_object and the
         * free_object handlers. If you want only the second to be called,
         * you can call zend_object_store_ctor_failed() before */
        ZVAL_NULL(this);
    }
 
}
 
static zend_function *get_constructor(zval *object TSRMLS_DC)
{
    /* Could always return constr_wrapper_fun, but it's unnecessary to call the
     * wrapper if instantiating the superclass */
    if (Z_OBJCE_P(object) == ce_ptr)
        return zend_get_std_object_handlers()->
            get_constructor(object TSRMLS_CC);
    else
        return &constr_wrapper_fun;
}
 
static void construction_wrapper(INTERNAL_FUNCTION_PARAMETERS) {
    zval *this = getThis();
    test_object *tobj;
    zend_class_entry *this_ce;
    zend_function *zf;
    zend_fcall_info fci = {0};
    zend_fcall_info_cache fci_cache = {0};
    zval *retval_ptr = NULL;
    unsigned i;
 
    tobj = zend_object_store_get_object(this TSRMLS_CC);
    zf = zend_get_std_object_handlers()->get_constructor(this TSRMLS_CC);
    this_ce = Z_OBJCE_P(this);
 
    fci.size = sizeof(fci);
    fci.function_table = &this_ce->function_table;
    fci.object_ptr = this;
    /* fci.function_name = ; not necessary to bother */
    fci.retval_ptr_ptr = &retval_ptr;
    fci.param_count = ZEND_NUM_ARGS();
    fci.params = emalloc(fci.param_count * sizeof *fci.params);
    /* Or use _zend_get_parameters_array_ex instead of loop: */
    for (i = 0; i < fci.param_count; i++) {
        fci.params[i] = (zval **) (zend_vm_stack_top(TSRMLS_C) - 1 -
            (fci.param_count - i));
    }
    fci.object_ptr = this;
    fci.no_separation = 0;
 
    fci_cache.initialized = 1;
    fci_cache.called_scope = EG(current_execute_data)->called_scope;
    fci_cache.calling_scope = EG(current_execute_data)->current_scope;
    fci_cache.function_handler = zf;
    fci_cache.object_ptr = this;
 
    zend_call_function(&fci, &fci_cache TSRMLS_CC);
    if (!EG(exception) && tobj->constructed == 0)
        zend_throw_exception(NULL, "parent::__construct() must be called in "
            "the constructor.", 0 TSRMLS_CC);
    efree(fci.params);
    /* retval_ptr may still be NULL, e.g. if the call failed or threw */
    if (retval_ptr != NULL)
        zval_ptr_dtor(&retval_ptr);
}
 
static zend_function_entry ext_class_methods[] = {
    PHP_ME(testclass, __construct, NULL, ZEND_ACC_PUBLIC)
    ...
    {NULL, NULL, NULL, 0, 0}
};
 
ZEND_MODULE_STARTUP_D(testext)
{
    zend_class_entry ce;
 
    memcpy(&object_handlers, zend_get_std_object_handlers(),
        sizeof object_handlers);
    object_handlers.get_constructor = get_constructor;
    object_handlers.clone_obj       = NULL;
 
    INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
    ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
    ce_ptr->create_object = ce_create_object;
 
    constr_wrapper_fun.type = ZEND_INTERNAL_FUNCTION;
    constr_wrapper_fun.common.function_name = "internal_construction_wrapper";
    constr_wrapper_fun.common.scope = ce_ptr;
    constr_wrapper_fun.common.fn_flags = ZEND_ACC_PROTECTED;
    constr_wrapper_fun.common.prototype = NULL;
    constr_wrapper_fun.common.required_num_args = 0;
    constr_wrapper_fun.common.arg_info = NULL;
#if PHP_VERSION_ID < 50399
    /* moved to common.fn_flags with rev 303381 */
    constr_wrapper_fun.common.pass_rest_by_reference = 0;
    constr_wrapper_fun.common.return_reference = 0;
#endif
    constr_wrapper_fun.internal_function.handler = construction_wrapper;
    constr_wrapper_fun.internal_function.module = EG(current_module);
 
    return SUCCESS;
}

Another option is to check on every internal method call whether the native structure has been properly initialized by the native constructor. Since most instance methods will need to fetch the object, this is a good opportunity to do the check. For instance, the cairo extension does this:

static inline cairo_surface_object* cairo_surface_object_get(zval *zobj TSRMLS_DC)
{
    cairo_surface_object *pobj = zend_object_store_get_object(zobj TSRMLS_CC);
    if (pobj->surface == NULL) {
        php_error(E_ERROR, "Internal surface object missing in %s wrapper, you must call parent::__construct in extended classes", Z_OBJCE_P(zobj)->name);
    }
    return pobj;
}

This has two disadvantages relative to the previous method:

  1. It defers the check until an instance method is called, instead of immediately when the problem occurs (when the user-land constructor doesn't call the parent native constructor).
  2. The check is made on every method call, instead of only once.

However, this is by far the more popular approach, since it's simple and portable -- it uses only stable parts of the API.
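The check-on-fetch idea can be sketched outside the Zend API. The following is a minimal standalone model, not extension code: model_wrapper, model_construct, model_fetch and model_method are hypothetical names, and returning NULL stands in for the fatal error a real extension would raise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a native wrapper such as cairo_surface_object. */
typedef struct model_wrapper {
    void *native;   /* stays NULL until the native constructor has run */
} model_wrapper;

/* Stand-in for the native constructor: attaches the native resource. */
void model_construct(model_wrapper *w, void *resource)
{
    w->native = resource;
}

/* Central fetch helper. Returns NULL (where an extension would raise an
 * error) if the native constructor was never called. */
model_wrapper *model_fetch(model_wrapper *w)
{
    if (w == NULL || w->native == NULL) {
        return NULL;
    }
    return w;
}

/* An "instance method": it only touches the wrapper after fetching it,
 * so the state check runs on every call. */
int model_method(model_wrapper *w)
{
    model_wrapper *checked = model_fetch(w);
    if (checked == NULL) {
        return -1;  /* object was not properly constructed */
    }
    return 0;
}
```

Because every method goes through model_fetch, the check cannot be forgotten in one method, which is what makes the approach simple despite its per-call cost.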

A variant of this strategy is to centralize the object state validation in the get_method handler and either throw a fatal error or return a method that throws an exception from the handler in case the object state is invalid. This makes it easier to fix current code without replacing the calls to zend_object_store_get_object in every method implementation.

Finally, another option, certainly less complex but more limiting, is to make the superclass constructor final.

Iterators

TODO

Serialization callbacks

TODO

Object creation and destruction

Object creation involves these steps:

  1. Allocate and initialize the underlying data structure
  2. Store the object
  3. Build a reference to the object
  4. (optional) Call the constructor

Calling the constructor is uncommon internally because there are easier ways to initialize the object (calling a zend_function is verbose). The initialization steps that are common to all the objects of a given type can be done in step 1. The initialization of a particular instance (which e.g. depends on some other data, the kind of data that would be passed to a constructor) can be done in a separate auxiliary C function. Every time an object is instantiated internally, the programmer must also call this function to do instance-specific initialization. A constructor is still necessary to properly support the new operator, but this strategy does not imply duplication of code -- the internal implementation of the constructor may rely on the same auxiliary function.
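As a rough illustration of this split, here is a standalone C model (it does not use the Zend API, and every model_point_* name is hypothetical). One auxiliary function holds the instance-specific initialization; both the internal instantiation path and the constructor exposed to new delegate to it.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct model_point {
    int x, y;
    int constructed;   /* set once instance-specific init has run */
} model_point;

/* Step 1: initialization common to all instances (the role played by
 * create_object). calloc zeroes every field. */
model_point *model_point_create(void)
{
    return calloc(1, sizeof(model_point));
}

/* The auxiliary function: instance-specific initialization lives here only. */
int model_point_init(model_point *p, int x, int y)
{
    if (p == NULL) {
        return -1;
    }
    p->x = x;
    p->y = y;
    p->constructed = 1;
    return 0;
}

/* Path 1: internal instantiation calls the auxiliary function directly. */
model_point *model_point_new_internal(int x, int y)
{
    model_point *p = model_point_create();
    if (model_point_init(p, x, y) != 0) {
        free(p);
        return NULL;
    }
    return p;
}

/* Path 2: the constructor that backs `new`. It only validates the
 * arguments and then delegates to the same auxiliary function. */
int model_point_constructor(model_point *self, const int *args, int argc)
{
    if (argc != 2) {
        return -1;
    }
    return model_point_init(self, args[0], args[1]);
}
```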

Data allocation and initialization

In general, this part is completely domain dependent. The programmer may allocate and initialize an object in whatever way suits the domain.

However, zend standard objects (those with a class entry) rely on the class entry's create_object handler. Typically, these have a data structure whose pointer can be passed to functions that expect zend_object*. Hence, the typical class entry create_object handler will look like test_create_object in the example below:

typedef struct test_object {
    zend_object std;
    /* more properties follow */
    ...
} test_object;
 
static zend_object_handlers object_handlers;
 
static zend_object_value test_create_object(zend_class_entry *class_type TSRMLS_DC)
{
    zend_object_value zov;
    test_object       *tobj;
 
    tobj = emalloc(sizeof *tobj);
    zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);
 
#if PHP_VERSION_ID < 50399
    zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
        (copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
    object_properties_init((zend_object*)tobj, class_type);
#endif
 
    /* The destroy and free callbacks should be replaced if necessary */
    zov.handle = zend_objects_store_put(tobj,
        (zend_objects_store_dtor_t) zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) zend_objects_free_object_storage,
        NULL TSRMLS_CC);
 
    /* other specific stuff */
    ...
 
    zov.handlers = &object_handlers;
    return zov;
}
 
ZEND_MODULE_STARTUP_D(testext)
{
    zend_class_entry ce;
    zend_class_entry *ce_ptr;
 
    memcpy(&object_handlers, zend_get_std_object_handlers(),
        sizeof object_handlers);
    /* replace necessary handlers */
    ...
 
    INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
    ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
    ce_ptr->create_object = test_create_object;
 
    /* Other startup stuff */
    ...
 
    return SUCCESS;
}

The create_object handler can also be NULL, in which case the general operations listed in test_create_object are executed, except that a vanilla zend_object structure is initialized (instead of a test_object).
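The reason a test_object pointer can be handed to functions expecting zend_object* is a plain C guarantee: a pointer to a structure, suitably converted, points to its first member. The sketch below shows this with hypothetical stand-in types (model_base for zend_object, model_derived for test_object); it is a standalone model and does not include any Zend header.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for zend_object. */
typedef struct model_base {
    int class_id;
    int prop_count;
} model_base;

/* Stand-in for test_object: the base structure MUST be the first member. */
typedef struct model_derived {
    model_base std;
    double     extra;   /* type-specific data follows the base */
} model_derived;

/* Plays the role of zend_object_std_init: only knows about the base part. */
void model_base_init(model_base *b, int class_id)
{
    b->class_id = class_id;
    b->prop_count = 0;
}

/* Plays the role of a create_object handler (without the store part). */
model_derived *model_derived_create(int class_id)
{
    model_derived *d = malloc(sizeof *d);
    if (d == NULL) {
        return NULL;
    }
    /* Valid because 'std' is the first member of model_derived. */
    model_base_init((model_base *) d, class_id);
    d->extra = 0.0;
    return d;
}
```

If the base structure were placed anywhere but first, the cast would silently point at the wrong bytes, which is why the layout requirement is not optional.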

Object storage

Objects are accessed through their references; the only thing linking the references to object instances is an integer (the object handle). This handle is a key that allows access to the object data structure. How this is done depends entirely on the type of the object.

Of particular relevance are, of course, zend standard objects. These are stored in the 'objects store'. The zend objects API exposes these functions:

/* Storage */
typedef void (*zend_objects_store_dtor_t)(void *object,
    zend_object_handle handle TSRMLS_DC);
typedef void (*zend_objects_free_object_storage_t)(void *object TSRMLS_DC);
typedef void (*zend_objects_store_clone_t)(void *object,
    void **object_clone TSRMLS_DC);
zend_object_handle zend_objects_store_put(void *object,
    zend_objects_store_dtor_t dtor, zend_objects_free_object_storage_t storage,
    zend_objects_store_clone_t clone TSRMLS_DC);
 
/* Retrieval */
void *zend_object_store_get_object(const zval *object TSRMLS_DC);
void *zend_object_store_get_object_by_handle(zend_object_handle handle TSRMLS_DC);
 
/* refcount related */
void zend_objects_store_add_ref(zval *object TSRMLS_DC);
void zend_objects_store_del_ref(zval *object TSRMLS_DC);
void zend_objects_store_add_ref_by_handle(zend_object_handle handle TSRMLS_DC);
void zend_objects_store_del_ref_by_handle_ex(zend_object_handle handle,
    const zend_object_handlers *handlers TSRMLS_DC);
void zend_objects_store_del_ref_by_handle(zend_object_handle handle TSRMLS_DC);
zend_uint zend_objects_store_get_refcount(zval *object TSRMLS_DC);
 
/* Misc */
zend_object_value zend_objects_store_clone_obj(zval *object TSRMLS_DC);
/* zend_object_store_set_object:
 * It is ONLY valid to call this function from within the constructor of an
 * overloaded object.  Its purpose is to set the object pointer for the object
 * when you can't possibly know its value until you have parsed the arguments
 * from the constructor function.  You MUST NOT use this function for any other
 * weird games, or call it at any other time after the object is constructed.
 * (This is rarely used)
 */
void zend_object_store_set_object(zval *zobject, void *object TSRMLS_DC);
 
/* Called when the constructor was terminated by an exception. Prevents the
 * "destroy object" store callback from being called */
void zend_object_store_ctor_failed(zval *zobject TSRMLS_DC);
 
/* Used to destroy all the objects in the store */
void zend_objects_store_free_object_storage(zend_objects_store *objects TSRMLS_DC);

The objects store can actually store any type of data structures; the data structure doesn't have to be an extension of zend_object. The header file zend_objects.h provides some functions to deal exclusively with zend standard objects:

/* To be used in the create_object class entry handler to initialize the
 * zend_object structure */
void zend_object_std_init(zend_object *object, zend_class_entry *ce TSRMLS_DC);
 
/* Despite the name, this is actually related to object freeing. It frees all
 * the memory used by the inner structures of zend_object */
void zend_object_std_dtor(zend_object *object TSRMLS_DC);
 
/* The default implementation of the create_object handler */
zend_object_value zend_objects_new(zend_object **object,
    zend_class_entry *class_type TSRMLS_DC);
 
/* The default implementation of the destroy object store callback. Calls
 * the PHP destructor, if any. */
void zend_objects_destroy_object(zend_object *object,
    zend_object_handle handle TSRMLS_DC);
 
/* Alias of zend_object_store_get_object, except it returns a zend_object
 * pointer instead of void* */
zend_object *zend_objects_get_address(const zval *object TSRMLS_DC);
 
/* Copies the properties of the old_object and calls the class entry
 * clone handler. Used in the implementation of zend_objects_clone_obj
 * In PHP > 5.3, it also initializes the properties beforehand. */
void zend_objects_clone_members(zend_object *new_object,
    zend_object_value new_obj_val, zend_object *old_object,
    zend_object_handle handle TSRMLS_DC);
 
/* Allocates a new object with zend_objects_new and clones the members.
 * It's the default implementation of the clone object handler. The fact
 * it uses zend_objects_new means you almost certainly will want to replace
 * the clone object handler when implementing internal classes. */
zend_object_value zend_objects_clone_obj(zval *object TSRMLS_DC);
 
/* default implementation of the free storage store callback.
 * Calls zend_object_std_dtor and then frees the object itself */
void zend_objects_free_object_storage(zend_object *object TSRMLS_DC);

The function zend_objects_store_put adds an object to the store. This is the function that must be called during the creation of the object, as exemplified in the listing of the previous section. Any of the last three arguments may be NULL.

After the call to zend_objects_store_put, the object will have reference count = 1 in the store.
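To make the role of the handle concrete, here is a deliberately simplified standalone model of a store. It is not the Zend implementation: the fixed-size table, the model_* names and the -1 return on a full table are all assumptions made for the sketch. It shows that put returns an integer key, that the entry starts with a reference count of 1, and that retrieval is a lookup by that key.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_STORE_CAP 16

typedef void (*model_dtor_t)(void *object);
typedef void (*model_free_t)(void *object);

typedef struct model_bucket {
    int          used;
    void        *object;        /* any data structure, not only objects */
    unsigned     refcount;
    model_dtor_t dtor;          /* "destroy object" callback, may be NULL */
    model_free_t free_storage;  /* "free object" callback, may be NULL */
} model_bucket;

typedef struct model_store {
    model_bucket buckets[MODEL_STORE_CAP];
} model_store;

/* Adds an object and returns its handle, or -1 if the table is full. */
int model_store_put(model_store *s, void *object,
    model_dtor_t dtor, model_free_t free_storage)
{
    for (int h = 0; h < MODEL_STORE_CAP; h++) {
        if (!s->buckets[h].used) {
            s->buckets[h].used = 1;
            s->buckets[h].object = object;
            s->buckets[h].refcount = 1;   /* one reference right after put */
            s->buckets[h].dtor = dtor;
            s->buckets[h].free_storage = free_storage;
            return h;
        }
    }
    return -1;
}

/* Looks an object up by handle; NULL for an invalid or unused handle. */
void *model_store_get(const model_store *s, int handle)
{
    if (handle < 0 || handle >= MODEL_STORE_CAP || !s->buckets[handle].used) {
        return NULL;
    }
    return s->buckets[handle].object;
}

/* Returns the reference count, or 0 for an invalid or unused handle. */
unsigned model_store_refcount(const model_store *s, int handle)
{
    if (handle < 0 || handle >= MODEL_STORE_CAP || !s->buckets[handle].used) {
        return 0;
    }
    return s->buckets[handle].refcount;
}
```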

Object reference creation

If we're creating a zend standard object, the create_object handler already returned a zend_object_value. The creation of an object reference zval is handled automatically by the new operator.

To instantiate new objects internally, the following macros are available:

int object_init(zval *arg);
int object_init_ex(zval *arg, zend_class_entry *ce);
 
/* This function requires 'properties' to contain all props declared in the
 * class and all props being public. If only a subset is given or the class
 * has protected members then you need to merge the properties separately by
 * calling zend_merge_properties(). */
int object_and_properties_init(zval *arg, zend_class_entry *ce,
     HashTable *properties);
 
/* This function should be called after the constructor has been called
 * because it may call __set from the uninitialized object otherwise. */
void zend_merge_properties(zval *obj, HashTable *properties,
    int destroy_ht TSRMLS_DC);

These all take an allocated and initialized (INIT_ZVAL) or partially initialized (INIT_PZVAL) zval pointer. object_init is not particularly useful, since it will instantiate a stdClass object. object_and_properties_init also allows efficient initialization of the object properties, but it has the limitations indicated in the comments. If all instances are to be initialized with the same property values, the default property values, defined when the class is registered, should be used instead.

Flow of the construction/destruction process

This is an overview of the process of object construction for zend standard object derivatives.

For internal instantiations:

  1. Allocate and (partially) initialize a zval *.
  2. Call object_init_ex. A pointer to the class entry should be available.
    1. Call the class entry's create_object handler.
      1. Allocate the object structure. This structure's first field should be a zend_object.
      2. Call zend_object_std_init to initialize the zend_object part of the object data.
      3. Copy the default properties from the class entry into the properties hash table of the new object. In PHP >= 5.3.99, object_properties_init should be called instead because non-dynamic properties are stored in C arrays instead of the properties hash table (though the hash table is still used when it's requested or when there are dynamic properties).
      4. Call zend_objects_store_put, passing a custom “destroy object” callback, which does cleanup specific to properly constructed objects, and a custom “free object” callback, which frees all the memory and other resources taken by the object (and which is always called).
      5. Assign the return value of zend_objects_store_put to the zend_object_value that is to be returned.
      6. Set the field handlers of zend_object_value that's to be returned to the appropriate object handlers table.
    2. Set the zval type to IS_OBJECT and the value to that returned by the create_object handler.
  3. Do post-creation initialization on the new object (the construction phase), typically through an auxiliary function.

For instantiations with new:

  1. Call the class entry's create_object handler.
    1. (see above)
    2. ...
  2. Call the PHP constructor, if any. Typically, the internal implementation of the constructor delegates the construction task to the same auxiliary function referred to in the last step of the list before.

Stored objects should not be destroyed explicitly; in fact, the store API doesn't even expose a function to destroy a particular object. Instead, the destruction should be managed through the refcount. When the reference count hits 0, the store will call the “destroy object” store callback (if the object construction didn't fail), then the “free object” callback, and finally remove the entry from its table. See also the add_ref and del_ref handlers.
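The refcount-driven teardown described above can be modelled in a few lines of standalone C. This is a simplified sketch under stated assumptions, not the engine's code: model_entry and the other model_* names are hypothetical, a single entry stands in for the whole store, and details such as a destructor creating new references to the object are ignored.

```c
#include <assert.h>
#include <stddef.h>

typedef struct model_entry {
    void    *object;
    unsigned refcount;
    int      ctor_failed;   /* set when the constructor threw */
    int      live;          /* 0 once the entry has been removed */
    void   (*destroy)(void *object);       /* "destroy object" callback */
    void   (*free_storage)(void *object);  /* "free object" callback */
} model_entry;

/* Counters used only to observe which callbacks ran. */
typedef struct model_counters {
    int destroyed;
    int freed;
} model_counters;

void model_count_destroy(void *object)
{
    ((model_counters *) object)->destroyed++;
}

void model_count_free(void *object)
{
    ((model_counters *) object)->freed++;
}

void model_add_ref(model_entry *e)
{
    e->refcount++;
}

/* Drops one reference; tears the entry down when the count reaches 0. */
void model_del_ref(model_entry *e)
{
    assert(e->live && e->refcount > 0);
    if (--e->refcount > 0) {
        return;                      /* still referenced elsewhere */
    }
    if (e->destroy != NULL && !e->ctor_failed) {
        e->destroy(e->object);       /* skipped for failed constructions */
    }
    if (e->free_storage != NULL) {
        e->free_storage(e->object);  /* always called */
    }
    e->object = NULL;
    e->live = 0;                     /* entry removed from the table */
}
```

Note the asymmetry: a failed construction skips only the destroy step, so the free callback must be able to release a partially initialized object.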