This is a bottom-up approach to Zend 2 objects.
Despite the wording, this document is not a specification; it results from analyzing the PHP implementation. Since it attempts to extract general rules from specific code snippets, it may contain wrong inferences. If you find errors, correct them (requires permission to edit the wiki) or send an e-mail to glopes ~at~ nebm.ist.utl.pt.
===== Definitions =====
We'll deal with three separate entities here: **references**, **objects** and **classes**. //Classes// represent a type and define the behavior of all the //objects// (the instances of that class) of that type. Each object will typically have one or more //references// to it. These are abstract concepts; in terms of implementation in PHP, references are mapped to object zvals, and the two terms are here used interchangeably.
Internally, the word //object// is overloaded to also mean "references" -- to avoid confusion, I shall not use the term //object// to refer to references, reserving it for the objects (either the concept or the storage of their data in memory). Additionally, the term //type// may mean //having a certain handler table//. Two classes may have different behavior (different methods, etc.) and share the same handler table. Finally, the term //reference//, when applied to a zval, may also mean the zval is part of a reference set (''Z_ISREF'') -- the ambiguity should be clear from the context.
===== The object zval =====
A zval that is an object reference will have the type ''IS_OBJECT''. In this case, its value, which can be retrieved with ''Z_OBJVAL'', will be of type ''zend_object_value'', which is defined like this:
typedef struct _zend_object_value {
zend_object_handle handle; /* retrieve with Z_OBJ_HANDLE(zval) */
zend_object_handlers *handlers; /* retrieve with Z_OBJ_HT(zval) */
} zend_object_value;
The field ''handle'' (which can be retrieved with the ''Z_OBJ_HANDLE'' macro family) identifies the object to which the reference refers among those of the same type; ''zend_object_handle'' is actually just an integer -- that is, an object is uniquely identified by an integer and a ''zend_object_handlers'' structure. Consequently, two references are identical in the sense of the ''==='' operator if and only if they share both the handlers and the handle.
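The identity rule above can be modelled outside the engine. The following is a minimal sketch in plain C, not engine code: ''model_handlers'', ''model_object_value'' and ''model_is_identical'' are hypothetical stand-ins for ''zend_object_handlers'', ''zend_object_value'' and the identity check, and the handle is assumed to be a plain unsigned integer.

```c
#include <stdbool.h>

/* Hypothetical stand-in for zend_object_handlers; only its address matters here. */
typedef struct model_handlers {
    const char *label;
} model_handlers;

/* Hypothetical stand-in for zend_object_value. */
typedef struct model_object_value {
    unsigned int handle;            /* plays the role of the object handle */
    const model_handlers *handlers; /* plays the role of the handler table pointer */
} model_object_value;

/* Two references are identical iff both the handle and the handler table match. */
bool model_is_identical(model_object_value a, model_object_value b)
{
    return a.handle == b.handle && a.handlers == b.handlers;
}
```

In this model, two values with the same handle but different handler tables refer to different objects, which is why the handle alone is not enough to identify an object.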
The ''handlers'' field has another purpose besides identifying the referred object. The structure it points to (the handler table) defines, at a low level, the behavior of the objects of that type. A specific handler function can be accessed with ''Z_OBJ_HANDLER(zval, hf)''.
The lifecycle of the references is associated with that of the objects. We likely want the object to be destroyed once there are no more references to it. The handler table has two entries for that purpose: ''add_ref'' and ''del_ref''. The first should be called when a new reference to that object is created; the second when one is deleted. Functions like ''zval_copy_ctor'', ''zval_ptr_dtor'' and ''zval_dtor'' take care of calling the handler. So there are two types of refcounts one should have in mind when dealing with objects -- those of references (i.e., the zvals) and those of the objects themselves.
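The object-side refcount described above can be illustrated with a small model. This is only a sketch in plain C under simplifying assumptions: ''model_object'', ''model_add_ref'' and ''model_del_ref'' are hypothetical names, and "destroying" the object is reduced to setting a flag. It models only the count kept for the object itself, not the refcount of the zvals.

```c
#include <stdbool.h>

/* Hypothetical model of an object as seen by its handlers. */
typedef struct model_object {
    unsigned int refcount; /* number of references (object zvals) pointing here */
    bool destroyed;        /* set once the last reference is gone */
} model_object;

/* Counterpart of add_ref: a new reference to the object was created. */
void model_add_ref(model_object *obj)
{
    obj->refcount++;
}

/* Counterpart of del_ref: a reference went away; destroy on the last one. */
void model_del_ref(model_object *obj)
{
    if (obj->refcount > 0 && --obj->refcount == 0) {
        obj->destroyed = true;
    }
}
```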
===== The handler table =====
Let's now explore the members of the handler table, which define the behavior of a class at a low level. Once we introduce the zend standard object, we'll see what their default values are (TODO).
typedef struct _zend_object_handlers {
/* general object functions */
zend_object_add_ref_t add_ref;
zend_object_del_ref_t del_ref;
zend_object_clone_obj_t clone_obj;
/* individual object functions */
zend_object_read_property_t read_property;
zend_object_write_property_t write_property;
zend_object_read_dimension_t read_dimension;
zend_object_write_dimension_t write_dimension;
zend_object_get_property_ptr_ptr_t get_property_ptr_ptr;
zend_object_get_t get;
zend_object_set_t set;
zend_object_has_property_t has_property;
zend_object_unset_property_t unset_property;
zend_object_has_dimension_t has_dimension;
zend_object_unset_dimension_t unset_dimension;
zend_object_get_properties_t get_properties;
zend_object_get_method_t get_method;
zend_object_call_method_t call_method;
zend_object_get_constructor_t get_constructor;
zend_object_get_class_entry_t get_class_entry;
zend_object_get_class_name_t get_class_name;
zend_object_compare_t compare_objects;
zend_object_cast_t cast_object;
zend_object_count_elements_t count_elements;
zend_object_get_debug_info_t get_debug_info;
zend_object_get_closure_t get_closure;
} zend_object_handlers;
Except where indicated, the arguments are guaranteed not to be null pointers.
==== add_ref ====
void (*add_ref)(zval *object TSRMLS_DC)
* Called when a new zval referring to the object is created. Called by ''zval_copy_ctor''.
* It may also be called when there is a need to hold some other kind of reference to the object. For instance, some instance method of an object //a// of class //A// may create and return an object //b// of class //B// that depends on data of the object //a// that spawned it. In that case, a possible strategy is to add a reference to //a// when //b// is created and store in //b// the handle of //a// so that the reference can be deleted when //b// is destroyed. Alternatively, //b// may store (e.g. as a property) an object zval referring to //a//.
* Should not be ''NULL''.
==== del_ref ====
void (*del_ref)(zval *object TSRMLS_DC)
* Called when a zval referring to the object is destroyed. Called by ''zval_dtor'' and therefore by functions that call ''zval_dtor'' such as ''zval_ptr_dtor'' and the ''convert_to'' family.
* See also [[#add_ref|add_ref]].
* Should not be ''NULL''.
==== clone_obj ====
zend_object_value (*clone_obj)(zval *object TSRMLS_DC)
* Called when an object is to be cloned (associated with usage of the ''clone'' operator in user space).
* Should return a ''zend_object_value'' that refers to a newly created object that is equal to the object referred to by the passed reference. The two objects should not be identical, i.e., ''=='' applied to the references should return ''true'' but ''==='' should return ''false''. The ''[[#compare_objects|compare_objects]]'' handlers must be the same since that is a requisite for ''=='' returning ''true'', but typically, as one would want the two objects to have identical behavior, they ought to share all the handlers.
* Zend standard object extensions can call ''zend_objects_clone_members'' to copy properties; this function also calls the class entry ''clone'' function (typically ''%%__clone%%'').
* The created object should be initialized as if it had one reference.
* May be ''NULL'' to forbid cloning.
==== read_property ====
zval *(*read_property)(zval *object, zval *member, int type TSRMLS_DC)
* Retrieves a property of an object as a pointer to a ''zval''; corresponds to ''$obj->prop'' in (mainly) a reading context in userspace.
* If the argument ''member'' is not of type ''IS_STRING'', you should convert it (after deep copying!). You may use the ''convert_to_string'' function for the conversion.
* The argument ''type'' is of the BP family, which consists of:
/* var status for backpatching */
#define BP_VAR_R 0 /* read */
#define BP_VAR_W 1 /* write */
#define BP_VAR_RW 2 /* read/write */
#define BP_VAR_IS 3 /* check for existence */
#define BP_VAR_NA 4 /* if not applicable; unused? */
#define BP_VAR_FUNC_ARG 5 /* function argument */
#define BP_VAR_UNSET 6 /* unset */
* The types ''BP_VAR_R'' and ''BP_VAR_IS'' are the most relevant here. Typically, this decides how chatty the implementation will be. Note that it's the ''[[#has_property|has_property]]'' that is called when checking the existence of the property; ''BP_VAR_IS'' is used when retrieving a property to check whether it has a (sub-)dimension or (sub-)property, as in ''empty($rarF->prop[7][8])''.
* However, ''BP_VAR_W'', ''BP_VAR_RW'' and ''BP_VAR_UNSET'' are also possible values if the ''[[#get_property_ptr_ptr|get_property_ptr_ptr]]'' handler is undefined or fails. These types are used in write-like operations wherein a (sub-)dimension or (sub-)property of the property value is being targeted (e.g. ''$obj->prop[32] = $h'' or ''unset($obj->prop[32])''); the type ''BP_VAR_W'' may also appear when assigning or passing the property (or a (sub-)property/dimension thereof) by reference. If these cases are to be supported, one should return either a reference (in the ''Z_ISREF'' sense) or a proxy object (see the ''[[#get|get]]'' handler for more information); otherwise one should warn the user. For instance, the default handler emits a warning when returning for write a zval that is not of type ''IS_OBJECT'' and is not referenced anywhere else, because the write would necessarily have no effect (a zval of type ''IS_OBJECT'' is permitted because it may be a proxy object).
* The reference count of a returned zval which is not otherwise referenced by the extension or the engine's symbol table should be 0. Likewise, the reference count of a zval being returned that exists elsewhere should not be incremented by this handler (it might be, but only on account of some side effect, for instance creating the property on the fly and storing it in a hash table for future retrieval, not owing to the call to the handler per se).
* Note this handler itself (and the other ones) has no notion of accessibility.
* Should return ''EG(uninitialized_zval_ptr)'' if the property is undefined.
* Should not be ''NULL'', even for classes that have no properties, though it's not strictly forbidden. An empty implementation can be:
zval *read_property_empty_implementation(zval *object, zval *member, int type TSRMLS_DC)
{
/* maybe raise an error/exception here */
return EG(uninitialized_zval_ptr);
}
* NOTE (as of PHP 5.3.2): You may think the behavior of ''read_property'' is the same as that of ''[[#read_dimension|read_dimension]]'' when ''[[#get_property_ptr_ptr|get_property_ptr_ptr]]'' is undefined/fails. There are, however, subtle differences.
- For pre- and post-increments and -decrements on properties, there is a pair of ''read_property''/''[[#write_property|write_property]]'' calls; the ''read_property'' has type ''BP_VAR_R''. For the same operations on dimensions, there is only a ''[[#read_dimension|read_dimension]]'' call with ''BP_VAR_RW'' type; the operation may succeed only if a reference (''Z_ISREF'') or a proxy object is returned. Note that compound assignments on dimensions (e.g. ''$obj['index'] += 1'') are handled with a pair of operations, just like properties.
- If ''read_property'' returns a zval with refcount 1 not belonging to a reference set in the context of a write-like operation (see discussion of the ''type'' argument above), this zval will be turned into a reference. In the case of ''[[#read_dimension|read_dimension]]'', a notice would be emitted and a reference set with the left part of the assignment and the dimension would not be built. Note that this special "turn into ref" case would not work if the returned zval had a higher refcount. Consider the following implementations:
zval *z;
zval *read_property(zval *object, zval *offset, int type TSRMLS_DC)
{
return z;
}
void write_property(zval *object, zval *offset, zval *value TSRMLS_DC)
{
z = value;
/* Z_SET_ISREF_P(z); -- if uncommented, both would work*/
zval_add_ref(&value);
}
Then, these two scripts would have different results:
$obj['prop'] = "hhh";
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "bbb"
$str = "hhh";
$obj['prop'] = $str;
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "hhh"
==== write_property ====
void (*write_property)(zval *object, zval *member, zval *value TSRMLS_DC)
* Writes the value of a property of an object; corresponds to ''$obj->prop'' in a writing context in userspace.
* If the ''member'' argument is not a string, it should be (deep) copied and converted into one.
* The calling function does not assume that the value will be accepted and stored. Hence, if the value is to be stored by the handler, its refcount should be incremented. No modifications to the value are allowed. If one is to modify the value before storing it, one must deep copy it first (i.e., include a call to ''zval_copy_ctor''). The only exception is if the value's refcount is 0 -- in that case, one may modify it at will and, if one wants to copy the value zval into another zval, one can make a shallow copy.
* Separate zvals that are references (in the sense of ''Z_ISREF''). Note that this handler is not meant to deal with reference assignments such as ''$obj->prop = &$var'' or ''$var = &$obj->prop'', which are the correct ways of making property values part of a reference set (the engine will call ''[[#get_property_ptr_ptr|get_property_ptr_ptr]]'' or, failing that, ''[[#read_property|read_property]]'').
* Should not be ''NULL'', even for classes that have no modifiable properties, though it's not strictly forbidden. An empty implementation will do nothing except maybe raise an error/exception.
==== read_dimension ====
zval *(*read_dimension)(zval *object, zval *offset, int type TSRMLS_DC)
* This is similar to ''[[#read_property|read_property]]'', except it's called in response to attempts to treat the object as an array, as in ''$obj['key']'' in (mainly) a reading context.
* You may not modify (except transiently) the ''offset'' argument.
* The argument ''offset'' can be a C ''NULL'' (when ''$obj[]'' is used). Despite the name, ''offset'' may be of any type of zval -- if it is an object reference with a ''[[#get|get]]'' handler, you may want to call it and use that result as an offset instead.
* The remarks made in ''[[#read_property|read_property]]'' with respect to the ''type'' argument also apply.
* Since there's no analogue of ''[[#get_property_ptr_ptr|get_property_ptr_ptr]]'', you ought to return a reference (in the sense of ''Z_ISREF'') or a proxy object (see the ''[[#get|get]]'' handler) in write-like contexts (types ''BP_VAR_W'', ''BP_VAR_RW'' and ''BP_VAR_UNSET''), though it's not mandatory. If ''read_dimension'' is being called in a write-like context such as in ''$val =& $obj['prop']'', and you return neither a reference nor an object, the engine emits a notice. Obviously, returning a reference is not enough for those operations to work correctly; modifying the returned zval must actually have some effect. Note that assignments such as ''$obj['key'] = &$a'' are still not possible -- for that one would need the dimensions to actually be storable as zvals (which may or may not be the case) and two levels of indirection.
* The remarks made relative to the refcount of the returned value in ''[[#read_property|read_property]]'' also apply. Should return a C ''NULL'' in case the offset does not exist (the engine will then use ''error_zval'' or ''uninitialized_zval'' depending on whether it's a read or write context).
* May be ''NULL'' when the object is not to be treated as an array.
==== write_dimension ====
void (*write_dimension)(zval *object, zval *offset, zval *value TSRMLS_DC)
* This is similar to ''write_property'', except it's called in response to attempts to treat the object as an array, as in ''$obj['key']'' in a writing context.
* You may not modify (except transiently) the ''offset'' argument.
* The argument ''offset'' can be a C ''NULL'' (when ''$obj[]'' is used). It can be any type of zval -- if it is an object reference, you may call its ''[[#get|get]]'' handler and use the result as the offset instead.
* The same remarks made in ''write_property'' apply -- should increment the refcount of value if storing it and should not change the value zval in most circumstances (a deep copy should be made first).
* If a reference is passed, it should be separated. While you may think you want to store a reference so that the value of the dimension may be changed indirectly (through another symbol), this is not the way (what you can do is ''$obj['key'] = $a; $a = &$obj['key']'' -- the first assignment is handled by ''write_dimension'', the second by ''[[#read_dimension|read_dimension]]'' returning a reference or a proxy object).
* May be ''NULL'' when the object is not to be treated as an array.
==== get_property_ptr_ptr ====
zval **(*get_property_ptr_ptr)(zval *object, zval *member TSRMLS_DC)
* Returns a property with double indirection so that the caller may directly replace the zval. This may be for efficiency reasons (a read/write pair of calls would otherwise be needed and would be unnecessary if the underlying storage of the properties consists in fact of zvals) or because the nature of the operation requires double indirection -- namely, send by reference and assign by reference of object properties.
* If the ''member'' argument is not a string, it should be converted.
* The rules about the refcount of the returned value given for the ''read_property'' handler also apply here, though it doesn't make much sense to return a pointer to memory that is not held by the extension or the engine here (the case with refcount 0).
* If the property does not exist, the handler may try to create it on the fly. If one initializes these properties to the same value, one should consider using a single initialization zval, for instance EG(uninitialized_zval) -- allocate a ''zval*'', give it the value of ''EG(uninitialized_zval_ptr)'', increment its refcount and return the address of the created pointer (see the behavior of ''zend_std_get_property_ptr_ptr()'').
* Returning a C ''NULL'' signifies failure and causes a fallback to the ''[[#read_property|read_property]]'' or ''[[#write_property|write_property]]'' handlers.
* Prefer an empty implementation (always returning a C ''NULL'') to a ''NULL'' in the handler table. A ''NULL'' in the handler table ought to have the same effect, but there are bugs.
==== get ====
zval* (*get)(zval *object TSRMLS_DC)
* This handler is called when attempting to treat the object as a scalar value in a read context. That situation arises when using the pre- and post-increment and -decrement operators and compound assignment operators (e.g. ''$obj++'' and ''$obj += 6'') -- it is then followed by a call to ''[[#set|set]]'' -- and as a fallback for type conversions in case there is no ''[[#cast_object|cast_object]]'' handler (as a preferred method relatively to ''[[#cast_object|cast_object]]'' in a few circumstances such as when [[http://www.php.net/define|defining]] a constant or comparing objects to scalars).
* A common application for ''get''/''[[#set|set]]'' handlers is when implementing **proxy objects**. These are used when the properties or dimensions are not stored as zvals (e.g. when the PHP object is an interface to an object in another language). In that case, it would otherwise be impossible to have those properties/dimensions be part of a reference set and operate in this manner:
$a = &$obj->prop;
$a++;
$a = 6;
* Proxy objects make this possible. If one returns from ''[[#read_property|read_property]]'' or ''[[#read_dimension|read_dimension]]'' an ''IS_OBJECT'' zval with ''[[#get|get]]'' and ''[[#set|set]]'' handlers, the read in the post-increment would be handled by the ''[[#get|get]]'' handler and the writes in the post-increment and the assignment would be handled by the ''[[#set|set]]'' handler.
* Should return a newly allocated zval with refcount 0.
* One should not expect calls to ''get'' to be followed by calls to ''set'' in the context of compound assignments and increments/decrements. In ''$obj['prop']++'', ''[[#read_dimension|read_dimension]]'' would be called with ''offset'' ''"prop"'' and ''type'' ''BP_VAR_R''. Suppose it then returns a proxy object. The ''get'' handler of this proxy object would be called in order to generate a zval; the zval would be separated if not a reference and incremented; then not ''[[#set|set]]'' but instead ''[[#write_dimension|write_dimension]]'' would be called so as to write the result.
* Should not return ''NULL''; if implemented must return a valid zval.
* May be ''NULL'', in which case the object cannot be treated as a scalar in the mentioned circumstances.
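The proxy-object idea (a value that forwards reads to ''get'' and writes to ''set'' because the real storage is not a zval) can be sketched in isolation. The code below is a simplified model in plain C, not engine code: ''model_native'', ''model_proxy'' and the functions around them are hypothetical, and the "value" is reduced to an ''int''. It models the ''$a = &$obj->prop; $a++; $a = 6;'' scenario described above, where the read goes through ''get'' and both writes go through ''set''.

```c
/* Hypothetical foreign storage that is not a zval, e.g. a field of a native struct. */
typedef struct model_native {
    int counter;
} model_native;

/* Hypothetical proxy: remembers where the real value lives. */
typedef struct model_proxy {
    model_native *target;
    int (*get)(const struct model_proxy *self);       /* read as a scalar */
    void (*set)(struct model_proxy *self, int value); /* assign a scalar */
} model_proxy;

int model_proxy_get(const model_proxy *self)
{
    return self->target->counter;
}

void model_proxy_set(model_proxy *self, int value)
{
    self->target->counter = value;
}

/* Builds a proxy bound to the given native storage. */
model_proxy model_proxy_for(model_native *target)
{
    model_proxy p = { target, model_proxy_get, model_proxy_set };
    return p;
}

/* Models a post-increment on the proxy: one get, then one set; returns the old value. */
int model_post_increment(model_proxy *p)
{
    int old = p->get(p);
    p->set(p, old + 1);
    return old;
}
```

The point of the design is that the caller never touches the native storage directly; every read and write is routed through the two function pointers.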
==== set ====
void (*set)(zval **object, zval *value TSRMLS_DC)
* This handler is called when attempting to make a (non-reference) assignment to an object zval, including when using the pre- and post-increment and -decrement operators and compound assignment operators (e.g. ''$obj++'' and ''$obj += 6'') -- in this case, preceded by a call to ''[[#get|get]]''.
* Note the double indirection on the ''object'' argument. The pointed zval may be changed or completely replaced by changing the value of ''*object''. Remember to adjust the refcounts and consider whether the zval is part of a reference set.
* A common application for ''[[#get|get]]''/''set'' handlers is when implementing **proxy objects**. See ''[[#get|get]]'' for more information.
* The remarks made in ''[[#write_property|write_property]]'' about the ''value'' argument also apply.
* See also the description of ''[[#get|get]]''.
* May be ''NULL'', in which case the object cannot be treated as a scalar in the mentioned circumstances.
==== has_property ====
int (*has_property)(zval *object, zval *member, int has_set_exists TSRMLS_DC)
* This handler is called whenever the engine needs to determine whether a property exists.
* If the ''member'' argument is not a string, it should be (deep) copied and converted.
* The parameter ''has_set_exists'' can take the following values:
* 0 -- check whether the property exists and is not ''NULL''; used by the ''[[http://www.php.net/isset|isset]]'' operator
* 1 -- check whether the property exists and is true; semantics of ''[[http://www.php.net/empty|empty]]''; one may want to use ''zend_is_true''
* 2 -- check whether the property exists, even if it is ''NULL''; used by the ''[[http://www.php.net/property_exists|property_exists]]'' function. Note that this parameter slightly differs from ''[[#has_dimension|has_dimension]]'''s ''check_empty'' in that the latter cannot take the value ''2''.
* An empty implementation ought not to emit an error/exception (or have any other side effects) even if the type does not admit properties and especially if ''has_set_exists'' is 2, so that ''[[http://www.php.net/property_exists|property_exists]]'' can be quiet.
* Read also the note on the usage of the ''BP_VAR_IS'' type for the ''[[#read_property|read_property]]'' handler.
* Should return either ''0'' (doesn't have the property) or ''1'' (has the property).
* Should not be ''NULL'', though it's not strictly forbidden by the engine.
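The three values of ''has_set_exists'' listed above can be summarized as a small decision function. This is a sketch in plain C over a hypothetical four-state model of a property (''model_prop_state''); a real handler would inspect an actual zval instead (for instance with ''zend_is_true'').

```c
/* Hypothetical states a modelled property can be in. */
typedef enum model_prop_state {
    MODEL_PROP_MISSING, /* the property does not exist */
    MODEL_PROP_NULL,    /* exists, value is NULL */
    MODEL_PROP_FALSY,   /* exists, not NULL, but converts to false (e.g. 0) */
    MODEL_PROP_TRUTHY   /* exists and converts to true */
} model_prop_state;

/* Returns 1 or 0 following the has_set_exists semantics listed above. */
int model_has_property(model_prop_state state, int has_set_exists)
{
    switch (has_set_exists) {
    case 0:  /* exists and is not NULL */
        return state == MODEL_PROP_FALSY || state == MODEL_PROP_TRUTHY;
    case 1:  /* exists and is true */
        return state == MODEL_PROP_TRUTHY;
    case 2:  /* exists, even if it is NULL */
        return state != MODEL_PROP_MISSING;
    default: /* unknown mode: report absence, no side effects */
        return 0;
    }
}
```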
==== unset_property ====
void (*unset_property)(zval *object, zval *member TSRMLS_DC)
* Called in order to unset an object property.
* If the ''member'' argument is not a string, it should be (deep) copied and converted.
* If ''member'' refers to a property that does not exist, this function should fail silently (no notices!). However, if the object type does not support properties, an error/exception may be emitted.
* Should not be ''NULL'', though it's not strictly forbidden by the engine.
==== has_dimension ====
int (*has_dimension)(zval *object, zval *member, int check_empty TSRMLS_DC)
* Determines whether an object has a certain dimension.
* The argument ''check_empty'' has the same meaning as ''[[#has_property|has_property]]'''s ''has_set_exists'' parameter, with the exception that it cannot take the value ''2''.
* This handler should have no side effects.
* Read also the note on the usage of the ''BP_VAR_IS'' type for the ''[[#read_property|read_property]]'' handler.
* Should return either ''0'' (doesn't have the dimension) or ''1'' (has the dimension).
* May be ''NULL'' when the object is not to be treated as an array.
==== unset_dimension ====
void (*unset_dimension)(zval *object, zval *offset TSRMLS_DC)
* Called in order to unset an object's dimension.
* The argument ''offset'' may be of any type; if it is an object with a ''[[#get|get]]'' handler, one may want to call it and use the result as the offset instead.
* It should fail silently (without notices) if the offset refers to a dimension that does not exist.
* May be ''NULL'' if the object is not to be treated as an array.
==== get_properties ====
HashTable *(*get_properties)(zval *object TSRMLS_DC)
* Retrieves the object as a hash table. This is usually a hash table containing the object instance properties and some code may (incorrectly) use this hash table to retrieve object properties. This function is used, even in preference to ''[[#cast_object|cast_object]]'', in explicit conversions (i.e. ''convert_to_array'' family and ''convert_to_explicit_type'', not ''convert_object_to_type'') to convert an object into an array.
* In practice, you may use this function to return other data, for instance dimensions. Since several array functions (such as ''[[http://www.php.net/end|end]]'', ''[[http://www.php.net/prev|prev]]'', ''[[http://www.php.net/next|next]]'', ''[[http://www.php.net/reset|reset]]'', ''[[http://www.php.net/current|current]]'', ''[[http://www.php.net/key|key]]'', ''[[http://www.php.net/array_walk|array_walk]]'', ''[[http://www.php.net/array_walk_recursive|array_walk_recursive]]'' and ''[[http://www.php.net/array_key_exists|array_key_exists]]'') call this handler when an object is passed, it can be used to provide a more array-like experience of the object (together with ''[[#read_dimension|read_dimension]]'', ''[[#write_dimension|write_dimension]]'', ''[[#has_dimension|has_dimension]]'', ''[[#unset_dimension|unset_dimension]]'' and ''[[#count_elements|count_elements]]'').
* The garbage collector uses this handler to reach the zvals that the object is holding. If the hash table is lazily generated (on the first call to the handler) and it hasn't been built yet (it's the first call), it may be appropriate to refuse to do so (and return ''NULL'') on calls by the garbage collector. This is especially true if such generation involves the creation of new zvals. The global ''GC_G(gc_active)'' tells whether the garbage collector is running.
* The ''zend_parse_parameters'' has the specifiers **H** and **A** which accept both arrays and objects.
* The ''Z_OBJPROP'' macro family are shortcuts to access this handler. They assume the handler exists! The ''Z_OBJDEBUG'' macro family fall back on this handler if ''[[#get_debug_info|get_debug_info]]'' doesn't exist.
* If the underlying storage of the hash table values consists in fact of zvals, you may return a hash table that stores the same ''zval *'' values. Depending on how the hash table is then exposed in userspace (whether reference sets are separated), this may allow indirect modification of the underlying storage. If those zvals are stored in a hash table, you can go further and return the hash table itself -- this will generally still not allow replacement/addition/deletion of the hash table's values in user space (e.g. turning an object into an array requires the hash table to be copied), yet may be faster and allow internal code to replace/add/delete entries directly in the hash table.
* The handler owns the hash table. Typically, this handler always returns the same hash table, which accompanies the life cycle of the object (is created when the object is created, etc.).
* The Zend engine does not forbid it to be ''NULL'', but several extensions (including the standard extension) assume it exists. The built-in function ''[[http://www.php.net/get_object_vars|get_object_vars]]'' assumes a Zend standard object if it exists. Prefer an implementation that returns an empty hash table.
* See also ''[[#get_debug_info|get_debug_info]]''.
==== get_method ====
zend_function *(*get_method)(zval **object_ptr, char *method, int method_len TSRMLS_DC)
* Called in order to fetch a method as ''zend_function''.
* Should return ''NULL'' if the method does not exist, otherwise should return a ''zend_function''. The rules for who owns the return value are as follows:
* If the type is ''ZEND_INTERNAL_FUNCTION'', then it's owned by the caller if and only if the subfield ''common.fn_flags'' has the flag ''ZEND_ACC_CALL_VIA_HANDLER''. However, note that if the caller then uses the return value to make a function call, it should not free it since it will already have been done.
* If the type is ''ZEND_OVERLOADED_FUNCTION'' or ''ZEND_OVERLOADED_FUNCTION_TEMPORARY'', then it's owned by the caller.
* For all other cases, the caller is not responsible for freeing the return.
* If the caller owns the result, and the type is not ''ZEND_OVERLOADED_FUNCTION'', it should also free (with ''efree'') the subfield ''common.function_name''.
* The argument ''object_ptr'' is given with double indirection. Altering ''*object_ptr'' allows one to change the //this// pointer passed to the method and the called scope into the class entry of the written value. The refcount of the original value should be decreased, the new value's should be increased (or set to one if created from scratch). (unconfirmed)
* One ought to convert the method name to lowercase to mimic the usual (half-assed) case insensitiveness of method names. See ''zend_str_tolower_copy''.
* May be ''NULL'' if the object is not to support method calls.
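The lowercasing step mentioned above can be illustrated without the engine. The following sketch in plain C uses the standard ''tolower'' and ''malloc'' rather than the engine's own helper and allocator; ''model_lowercase_copy'' is a hypothetical name, and the resulting copy is what one would then use as the lookup key in a method table.

```c
#include <ctype.h>
#include <stdlib.h>

/* Returns a newly malloc'ed, lowercased, NUL-terminated copy of
   method[0..method_len), or NULL on allocation failure.
   The caller releases the result with free(). */
char *model_lowercase_copy(const char *method, size_t method_len)
{
    char *copy = malloc(method_len + 1);
    size_t i;

    if (copy == NULL) {
        return NULL;
    }
    for (i = 0; i < method_len; i++) {
        /* cast to unsigned char: tolower is undefined for negative char values */
        copy[i] = (char)tolower((unsigned char)method[i]);
    }
    copy[method_len] = '\0';
    return copy;
}
```

Working on a copy matters here because the ''method'' argument belongs to the caller and should not be altered in place.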
==== call_method ====
int (*call_method)(char *method, INTERNAL_FUNCTION_PARAMETERS)
* This method is called whenever the engine tries to call a function with type ''ZEND_OVERLOADED_FUNCTION'' or ''ZEND_OVERLOADED_FUNCTION_TEMPORARY''. It's conceptually related to the ''%%__call%%'' magic method.
* The ''method'' argument is a string that should identify the function to be called.
* This method is unique in that the first parameter is not an object zval pointer. A zval pointer is included in the ''INTERNAL_FUNCTION_PARAMETERS'' and can be retrieved with ''getThis()''.
* May be ''NULL'', but in that case you should not return ''zend_function'' structures of type ''ZEND_OVERLOADED_FUNCTION'' or ''ZEND_OVERLOADED_FUNCTION_TEMPORARY'' from [[#get_method|get_method]] or [[#get_constructor|get_constructor]] (and neither should [[#get_closure|get_closure]] if ''object_ptr'' is filled with object zvals that do not have this handler).
==== get_constructor ====
zend_function *(*get_constructor)(zval *object TSRMLS_DC)
* This handler has the same semantics as [[#get_method|get_method]] and is called to retrieve a function that is to perform initialization operations on the object.
* May be ''NULL''.
==== get_class_entry ====
zend_class_entry *(*get_class_entry)(const zval *object TSRMLS_DC)
* Gives a pointer to a ''zend_class_entry''. This structure provides a scope for object operations and defines PHP classes. The default handlers defer to this structure for much of their behavior; additionally, much functionality, such as reflection and the ''instanceof'' operator, is restricted to PHP classes.
* If implemented, all objects of the same class should have a ''get_class_entry'' handler returning the same value. Should not return ''NULL''.
* You may use the ''Z_OBJCE'' macro family for accessing the return of the handler. It resolves to ''zend_get_class_entry'', which either returns ''NULL'' and emits a fatal error if the handler does not exist or returns the (non-null for correct implementations) result of this handler.
* The ''IS_ZEND_STD_OBJECT'' and ''HAS_CLASS_ENTRY'' macros (the last one should only be used if the zval is known to be an object reference) are a shortcut for determining whether a zval is of a type which has this handler implemented.
* Will never be ''NULL'' for Zend standard objects (and derivations thereof) and will be ''NULL'' for all other objects (by definition).
==== get_class_name ====
int (*get_class_name)(const zval *object, char **class_name, zend_uint *class_name_len, int parent TSRMLS_DC)
* Extracts a class name for display or reflection purposes. This name has no special meaning.
* If ''parent'' is ''0'', the name of the class of the passed object is being requested, otherwise it's the parent class of the passed object. The handler may return ''FAILURE'' if there is no parent class or it doesn't know.
* On success, ''*class_name'' should be set with a pointer to a null-terminated string allocated with non-persistent storage (''emalloc'') and ''*class_name_len'' should be set with the length of ''*class_name'' (excluding terminator).
* Should return ''SUCCESS'' or ''FAILURE''. If it fails, ''*class_name'' and ''*class_name_len'' should retain their original values or be set to ''NULL''/''0''.
* May be ''NULL''. Note that, as of PHP 5.3.2, some portions of the standard extension expect the handler to exist and not fail when ''parent'' is ''0''.
==== compare_objects ====
int (*compare_objects)(zval *object1, zval *object2 TSRMLS_DC)
* Compares two objects. Used for the operators ''=='', ''!='', ''<'', ''>'', ''<='' and ''>=''.
* The implementations should follow these rules -- for any objects //a//, //b// and //c// that share the same compare handler:
- compare(//a//, //a//) = 0
- sign(compare(//a//, //b//)) = -sign(compare(//b//, //a//)) where sign(//x//) is 1 if //x// is positive, -1 if it's negative and 0 if it's 0.
- if compare(//a//, //b//) = 0 and compare(//b//, //c//) = 0, then compare(//a, c//) = 0
* This means one must implement a total order.
* One may find an equivalent set of conditions on the documentation of Java's [[http://java.sun.com/javase/6/docs/api/java/lang/Comparable.html#compareTo(T)|java.lang.Comparable.compareTo(T)]].
* The handler may restrict itself to returning ''-1'', ''0'' and ''1'' (//a// < //b//, //a// = //b// and //a// > //b//, respectively). If it returns other values, one is encouraged to implement it so that, if compare(//a//, //b//) > compare(//a//, //c//), then compare(//b//, //c//) < 0.
* Should not be ''NULL''; a possible simple implementation is just returning the result of an object handle subtraction.
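The rules above can be illustrated with a small standalone model. The sketch below does not use the Zend headers: ''model_object'', ''model_compare'' and ''model_sign'' are hypothetical names, and the handle is modelled as an ''unsigned int'' (an assumption). It performs an explicit three-way comparison instead of a subtraction, which keeps the result in {-1, 0, 1} and sidesteps any wraparound a subtraction of handles could produce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object reference; not a Zend type. */
typedef struct model_object {
    unsigned int handle; /* assumed integer handle identifying the object */
} model_object;

/* Three-way comparison by handle: returns -1, 0 or 1. */
static int model_compare(const model_object *a, const model_object *b)
{
    assert(a != NULL && b != NULL);
    if (a->handle < b->handle) return -1;
    if (a->handle > b->handle) return 1;
    return 0;
}

/* sign(x) as defined in the rules above: 1, -1 or 0. */
static int model_sign(int x)
{
    return (x > 0) - (x < 0);
}
```

In a real handler the two arguments would be zvals and the handles would be read with the ''Z_OBJ_HANDLE'' macro family; the comparison logic itself would stay the same.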
==== cast_object ====
int (*cast_object)(zval *readobj, zval *retval, int type TSRMLS_DC)
* Called when an object is to be converted into another type.
* If not defined or if the call fails, the engine will use fallback strategies that include calling ''[[#get|get]]'', or using a number of default conversion strategies (the strategies used for the standard objects).
* The ''readobj'' contains the object to be converted; it should not be modified in any way.
* The handler may assume ''readobj'' and ''retval'' have different values.
* The ''retval'' is an allocated zval on which the handler should write the result. It should first be initialized (''INIT_ZVAL'') ignoring its previous value.
* In case of error, ''FAILURE'' should be returned. If the ''retval'' was already initialized and is holding further resources, it should be destroyed (as in ''zval_dtor'') by the handler; if it was not initialized, it should be left untouched. In case of success, ''SUCCESS'' should be returned.
* See also ''[[#get|get]]'' and ''[[#get_properties|get_properties]]''.
* May be ''NULL''.
==== count_elements ====
int (*count_elements)(zval *object, long *count TSRMLS_DC)
* Called to determine the count of some countable object. A count is a non-negative value.
* Objects that have array-like access will probably want to implement this, so that they can behave more like an array.
* Note that this handler is not used by the engine itself, only by ''[[http://www.php.net/count|count]]'' and other extensions.
* This handler writes a non-negative number in ''*count'' and returns ''SUCCESS'' if the passed object is countable; returns ''FAILURE'' otherwise.
* May be ''NULL'' if the type does not support the notion of "countable"; the effect would be the same as having an implementation that always returns ''FAILURE''.
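The contract can be modelled without the Zend headers. In the sketch below, ''model_list'', ''model_count_elements'', ''model_count'', ''MODEL_SUCCESS'' and ''MODEL_FAILURE'' are hypothetical stand-ins for the object, the handler, a caller such as ''count'', and the ''SUCCESS''/''FAILURE'' constants. It only shows the shape of the protocol: write the count and succeed, or fail and leave ''*count'' alone, with a missing handler treated like a handler that always fails.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SUCCESS 0
#define MODEL_FAILURE (-1)

/* Hypothetical object; not a Zend type. */
typedef struct model_list {
    int  countable;   /* whether this object supports counting */
    long n_elements;  /* non-negative element count */
} model_list;

typedef int (*model_count_fn)(const model_list *obj, long *count);

/* Handler: writes the count on success, leaves *count untouched on failure. */
static int model_count_elements(const model_list *obj, long *count)
{
    assert(obj != NULL && count != NULL);
    if (!obj->countable) {
        return MODEL_FAILURE;
    }
    *count = obj->n_elements;
    return MODEL_SUCCESS;
}

/* Caller: a NULL handler behaves like a handler that always fails. */
static int model_count(model_count_fn handler, const model_list *obj, long *count)
{
    if (handler == NULL) {
        return MODEL_FAILURE;
    }
    return handler(obj, count);
}
```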
==== get_debug_info ====
HashTable *(*get_debug_info)(zval *object, int *is_temp TSRMLS_DC)
* Returns a hash table with arbitrary key/value pairs for debugging purposes.
* The ''Z_OBJDEBUG'' macro is a shortcut to access this handler; it may be used if one knows the handler not to be ''NULL''.
* The ''is_temp'' argument cannot be ''NULL''. The value ''1'' should be written in ''*is_temp'' if the returned hash table is owned by the caller (and hence the caller must destroy and free it with ''zend_hash_destroy'' and ''efree''); otherwise ''0'' should be written.
* Should not return ''NULL''.
* Avoid having this handler set to ''NULL''; although the engine does not require its existence, the standard extension does (as of PHP 5.3.2).
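The ownership rule behind ''is_temp'' can be sketched with a standalone model. Nothing here is Zend API: ''model_table'' stands in for ''HashTable'', ''malloc''/''free'' stand in for the engine allocator and for ''zend_hash_destroy'' plus ''efree'', and all ''model_*'' names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for HashTable. */
typedef struct model_table {
    int n_entries;
} model_table;

typedef struct model_object {
    model_table props;  /* long-lived table owned by the object */
    int build_fresh;    /* whether debug info is built on demand */
} model_object;

/* Handler: either hands out the object's own table (*is_temp = 0) or a
 * freshly allocated one that the caller must release (*is_temp = 1). */
static model_table *model_get_debug_info(model_object *obj, int *is_temp)
{
    model_table *t;

    assert(obj != NULL && is_temp != NULL);
    if (!obj->build_fresh) {
        *is_temp = 0;
        return &obj->props;
    }
    t = malloc(sizeof *t);
    if (t == NULL) {
        abort();
    }
    t->n_entries = obj->props.n_entries + 1; /* e.g. one synthetic entry */
    *is_temp = 1;
    return t;
}

/* Caller: releases the table only when it was marked temporary. */
static int model_dump(model_object *obj)
{
    int is_temp = 0;
    model_table *t = model_get_debug_info(obj, &is_temp);
    int n = t->n_entries;

    if (is_temp) {
        free(t);
    }
    return n;
}
```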
==== get_closure ====
int (*get_closure)(zval *obj, zend_class_entry **ce_ptr, zend_function **fptr_ptr, zval **zobj_ptr TSRMLS_DC)
* This handler allows the object to be used as a function.
* The argument ''ce_ptr'' will not be ''NULL'' and should be filled with the scope of the function or ''NULL''. The written value will be used as the called scope. The calling scope will be taken from the scope associated with the returned ''zend_function''. Only under exceptional circumstances will it be used as a calling scope (maybe internal functions where the ''zend_function'' has no associated scope – unconfirmed).
* The argument ''fptr_ptr'' should be populated with the desired function. The caller is not responsible for freeing it, so the structure should accompany the life cycle of the class or object. It's outside the scope of this document to describe this structure; see for example ''zend_register_functions'' for how to create internal functions.
* The argument ''zobj_ptr'' may be NULL; if it isn't, ''*zobj_ptr'' is to be filled with ''NULL'' or, in case the function is an instance method of a standard object stored in ''EG(objects_store)'', the object it refers to. One should not increment the refcount of that object only because one is passing it to the caller.
* See also ''[[#get_method|get_method]]''.
* Returns ''SUCCESS'' or ''FAILURE''.
* May be ''NULL''.
===== The zend_class_entry =====
TODO
===== Default handlers =====
TODO
===== PHP internal class declaration =====
Let's assume all the internal functions and custom object handlers are written. A PHP class declaration can then be divided into these tasks (some may be omitted):
* Definition of ''zend_function_entry'' array, which groups the internal functions that were defined.
* Definition and initialization of the handlers table.
* Initialization of the class entry.
* Registration of the class.
* Declaration of static and instance properties and constants.
* Other tweaks of the class entry.
We'll cover these items and then address the question of how to properly define a class so that it can be extended in userspace.
==== zend_function_entry array ====
The ''zend_function_entry'' structure contains the name of the method, a pointer to the (native) function that implements it, arginfo (describing arginfo structures is out of the scope of this text), and some flags for the method. The array is traditionally declared as a static global variable. Its purpose is to group and qualify the functions so that they can be converted to ''zend_function'' structures.
The array is terminated with a zeroed structure. Several macros exist for declaring the ''zend_function_entry'' structures. The most important are:
PHP_ME(classname, name, arg_info, flags)
PHP_MALIAS(classname, name, alias, arg_info, flags)
PHP_NAMED_ME(zend_name, name, arg_info, flags)
PHP_ME_MAPPING(name, func_name, arg_types, flags)
The standard way to declare a method is to use ''PHP_ME''. It takes, in this order:
* The name of the class. This is an arbitrary name, not reflected in userspace, that is consistent with how the method's internal implementations were declared.
* The name of the method. This is how the method will be called in userspace AND how it was declared with ''PHP_METHOD''. If you need those two to differ, you should use ''PHP_MALIAS''.
* An arginfo structure.
* A bitmask defining the accessibility of the method.
The macro ''PHP_ME'' can be used when the method implementation was declared in a standard way, i.e., with ''PHP_METHOD(classname, name)''.
The bitmask is built with the ''ZEND_ACC'' family of macros. Let's see the relevant part of the family. Some are used only for classes or properties, not methods:
#define ZEND_ACC_STATIC 0x01 /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_ABSTRACT 0x02 /* fn_flags */
#define ZEND_ACC_FINAL 0x04 /* fn_flags */
#define ZEND_ACC_IMPLEMENTED_ABSTRACT 0x08 /* fn_flags */
#define ZEND_ACC_IMPLICIT_ABSTRACT_CLASS 0x10 /* ce_flags */
#define ZEND_ACC_EXPLICIT_ABSTRACT_CLASS 0x20 /* ce_flags */
#define ZEND_ACC_FINAL_CLASS 0x40 /* ce_flags */
#define ZEND_ACC_INTERFACE 0x80 /* ce_flags */
#define ZEND_ACC_INTERACTIVE 0x10 /* fn_flags */
#define ZEND_ACC_PUBLIC 0x100 /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PROTECTED 0x200 /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PRIVATE 0x400 /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PPP_MASK \
(ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED | ZEND_ACC_PRIVATE)
#define ZEND_ACC_CHANGED 0x800 /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_IMPLICIT_PUBLIC 0x1000 /* zend_property_info.flags; unused (1) */
#define ZEND_ACC_CTOR 0x2000 /* fn_flags */
#define ZEND_ACC_DTOR 0x4000 /* fn_flags */
#define ZEND_ACC_CLONE 0x8000 /* fn_flags */
#define ZEND_ACC_ALLOW_STATIC 0x10000 /* fn_flags */
#define ZEND_ACC_SHADOW 0x20000 /* fn_flags */
#define ZEND_ACC_DEPRECATED 0x40000 /* fn_flags */
#define ZEND_ACC_CLOSURE 0x100000 /* fn_flags */
#define ZEND_ACC_CALL_VIA_HANDLER 0x200000 /* fn_flags */
/* (1) ZEND_ACC_IMPLICIT_PUBLIC is unused since zend_do_declare_implicit_property is ifdef'd out */
These apply to methods:
* ''ZEND_ACC_PUBLIC'', ''ZEND_ACC_PROTECTED'', ''ZEND_ACC_PRIVATE'' - exactly one of these flags **must** be included.
* ''ZEND_ACC_STATIC'', ''ZEND_ACC_ABSTRACT'' and ''ZEND_ACC_FINAL'' - define a method as static, abstract or final, respectively.
* ''ZEND_ACC_ALLOW_STATIC'' - allows an instance method to be called statically; also allows an instance method to assume a ''$this'' from an incompatible context (see implementation for opcode ''INIT_STATIC_METHOD_CALL''). New code ought not to set this flag.
* ''ZEND_ACC_DEPRECATED'' - marks a method as deprecated.
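The "exactly one visibility flag" requirement from the first item can be checked mechanically. The sketch below is standalone: the flag values are copied from the listing above, while ''has_single_visibility'' and ''make_method_flags'' are hypothetical helpers that are not part of the Zend API.

```c
#include <assert.h>

/* Values copied from the ZEND_ACC listing above. */
#define ZEND_ACC_STATIC    0x01
#define ZEND_ACC_FINAL     0x04
#define ZEND_ACC_PUBLIC    0x100
#define ZEND_ACC_PROTECTED 0x200
#define ZEND_ACC_PRIVATE   0x400
#define ZEND_ACC_PPP_MASK \
    (ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED | ZEND_ACC_PRIVATE)

/* Hypothetical helper: 1 if exactly one visibility bit is set, else 0. */
static int has_single_visibility(unsigned int flags)
{
    unsigned int ppp = flags & ZEND_ACC_PPP_MASK;

    /* A non-zero value with a single bit set satisfies x & (x - 1) == 0. */
    return ppp != 0 && (ppp & (ppp - 1)) == 0;
}

/* Hypothetical helper: combines one visibility flag with other modifiers. */
static unsigned int make_method_flags(unsigned int visibility,
                                      unsigned int modifiers)
{
    assert(has_single_visibility(visibility));
    assert((modifiers & ZEND_ACC_PPP_MASK) == 0);
    return visibility | modifiers;
}
```

A function entry for a public static final method would then carry the value returned by ''make_method_flags(ZEND_ACC_PUBLIC, ZEND_ACC_STATIC | ZEND_ACC_FINAL)''.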
These also apply to methods, but you needn't include them in your function entries:
* ''ZEND_ACC_IMPLEMENTED_ABSTRACT'' - used if the method is declared abstract somewhere up the hierarchy. Despite the name, the method may have no implementation -- if an abstract subclass does not implement an abstract method from the superclass, the subclass copy of the method will have this flag set. Do not set this; it will be set automatically.
* ''ZEND_ACC_CHANGED'' - used for methods of a subclass that had their visibility increased from protected to public when overridden. Do not set this; it will be set automatically.
* ''ZEND_ACC_CALL_VIA_HANDLER'' is applied to ''zend_function'' structures that are generated on-the-fly in response to calls to %%__call%%, %%__callstatic%%, or by the ''[[#get_method|get_method]]'' handler. This determines the memory freeing procedure. It also allows overriding pass-by-value semantics of functions with ''zend_call_function'' (used by ''call_user_func_array''). See also ''[[#get_method|get_method]]''.
* ''ZEND_ACC_CLONE'' marks a method as a clone method. This is automatically set for methods named ''%%__clone%%'', but appears to have no effect at the engine level (the standard clone handler looks for a method called ''%%__clone%%'', with no regard for this flag).
* ''ZEND_ACC_CTOR'', although commonly set manually in the function entry flags, is set automatically for methods with the appropriate name (either old-style or new-style). Setting this manually on a method that would not be selected as a constructor is an error.
* ''ZEND_ACC_DTOR'' is set automatically for methods with the name ''%%__destruct%%''. Beyond that, it has no effect at the engine level.
The macro ''PHP_MALIAS(classname, name, alias, arg_info, flags)'' allows you to declare a method with a name and expose another name in user space, e.g.:
PHP_METHOD(myclass, mymethod) { ... }
...
PHP_MALIAS(myclass, myMethodInUserspace, mymethod, NULL, 0),
...
The macro ''PHP_NAMED_ME(zend_name, name, arg_info, flags)'' goes lower: you can specify the actual name of the C function you implemented, e.g.:
ZEND_NAMED_FUNCTION(my_arbitrary_name) { ... }
/* this resolves to my_arbitrary_name(INTERNAL_FUNCTION_PARAMETERS) { } */
...
PHP_NAMED_ME(my_arbitrary_name, myMethodInUserspace, NULL, 0),
...
Finally, ''PHP_ME_MAPPING(name, func_name, arg_types, flags)'' is usually used to expose methods that also have a non-OOP interface, e.g.:
PHP_FUNCTION(ext_func) {
zval *this;
/* Use zend_parse_method_parameters to parse parameters in double interface implementations */
if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis(), "O",
&this, ext_class_ce_ptr) == FAILURE) {
return;
}
/* alternative with plain zend_parse_parameters */
/* this = getThis();
* if (this == NULL) {
* if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O",
* &this, ext_class_ce_ptr) == FAILURE) {
* return;
* }
* } else if (zend_parse_parameters_none() == FAILURE) {
* return;
* }
*/
}
...
static zend_function_entry ext_functions[] = {
PHP_FE(ext_func, NULL),
...
{NULL, NULL, NULL, 0, 0}
};
...
static zend_function_entry ext_class_methods[] = {
PHP_ME_MAPPING(myMethodName, ext_func, NULL, 0),
...
{NULL, NULL, NULL, 0, 0}
};
Namespaced names can be built using ''ZEND_NS_NAME(namespace, name)''. Don't prefix the namespace with ''\''.
==== Setup the Handlers Table ====
When one is implementing internal PHP classes, it is almost always undesirable to replace all of the standard handlers. The usual procedure for overriding the handlers follows these steps:
* Implement the handlers you wish to override (read [[#The_handler_table|this section]] to understand their semantics).
* Define a global variable of type ''zend_object_handlers''. Most likely, the handler table will not need to be referred to from other compilation units, so it can have file scope.
* On module startup, the standard handlers are copied into the defined ''zend_object_handlers'' variable.
* The fields corresponding to the handlers that are to be overridden are written with pointers to the custom handlers that were implemented.
The handler table should not be initialized with initializer lists. Otherwise, when new handlers are added to the language, they will take the value ''NULL'' instead of the default value.
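The difference between the two initialization styles can be seen in a standalone model. The sketch below does not use the Zend headers; ''model_handlers'' and every function in it are hypothetical, and the field ''added_later'' plays the role of a handler introduced by a newer engine version after the extension was written.

```c
#include <assert.h>
#include <string.h>

typedef int (*model_handler)(int);

/* Hypothetical handler table; not the real zend_object_handlers. */
typedef struct model_handlers {
    model_handler compare;
    model_handler clone;
    model_handler added_later; /* handler added by a newer engine version */
} model_handlers;

static int std_compare(int x)     { return x; }
static int std_clone(int x)       { return x + 1; }
static int std_added_later(int x) { return x + 2; }
static int my_compare(int x)      { return -x; }

/* The engine's default table in this model. */
static const model_handlers std_handlers = {
    std_compare, std_clone, std_added_later
};

/* Recommended pattern: copy the defaults, then override selectively. */
static void init_by_copy(model_handlers *h)
{
    assert(h != NULL);
    memcpy(h, &std_handlers, sizeof *h);
    h->compare = my_compare;
}

/* Initializer written before added_later existed: the field that is not
 * mentioned is zero-initialized, i.e. it ends up NULL. */
static const model_handlers by_initializer_list = {
    .compare = my_compare,
    .clone   = std_clone
};
```

With ''init_by_copy'', ''added_later'' keeps the default value; in ''by_initializer_list'' it is ''NULL'', which is the situation the paragraph above warns about.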
Example:
static zend_object_handlers myclass_handlers;
...
static HashTable *myclass_object_debug_info(zval *object, int *is_temp TSRMLS_DC)
{
...
}
static int myclass_object_compare_objects(zval *object1, zval *object2 TSRMLS_DC)
{
...
}
static zend_object_value myclass_object_clone(zval *object TSRMLS_DC)
{
...
}
...
ZEND_MODULE_STARTUP_D(myext)
{
...
memcpy(&myclass_handlers, zend_get_std_object_handlers(),
sizeof myclass_handlers);
myclass_handlers.get_debug_info = myclass_object_debug_info;
myclass_handlers.compare_objects = myclass_object_compare_objects;
myclass_handlers.clone_obj = myclass_object_clone;
...
}
=== Beware of the clone_obj handler ===
Most likely, the default ''[[#clone_obj|clone_obj]]'' handler will have to be replaced because it assumes the ''create_object'' class entry handler is not replaced and directly allocates a standard ''zend_object'' data structure. Therefore, worse than the added custom fields of the object structure not being initialized (because the clone handler only knows how to do a shallow copy of the standard properties), these fields will not even exist because only a plain ''zend_object'' is allocated.
You may also choose not to support the clone operation by setting the ''clone_obj'' handler to ''NULL''.
==== Initialization of the Class Entry ====
Before registering the class, it's necessary to define and initialize a class entry structure. This structure is temporary -- upon class registration, a new class entry structure is allocated. The initialization is done with one of these macros:
INIT_CLASS_ENTRY(class_container, class_name, functions)
INIT_NS_CLASS_ENTRY(class_container, ns, class_name, functions)
INIT_CLASS_ENTRY_EX(class_container, class_name, class_name_len, functions)
INIT_OVERLOADED_CLASS_ENTRY(class_container, class_name, functions, handle_fcall,
handle_propget, handle_propset)
INIT_OVERLOADED_NS_CLASS_ENTRY(class_container, ns, class_name, functions,
handle_fcall, handle_propget, handle_propset)
INIT_OVERLOADED_CLASS_ENTRY_EX(class_container, class_name, class_name_len,
functions, handle_fcall, handle_propget, handle_propset, handle_propunset,
handle_propisset)
INIT_OVERLOADED_NS_CLASS_ENTRY_EX(class_container, ns, class_name, functions,
handle_fcall, handle_propget, handle_propset, handle_propunset,
handle_propisset)
This is the meaning of the parameters:
* ''class_container'' - The temporary ''zend_class_entry'' to initialize.
* ''class_name'' - The class name to expose in userspace (a string).
* ''functions'' - A ''zend_function_entry'' array terminated with an empty entry.
* ''class_name_len'' - The length of ''class_name'', excluding the terminator.
* ''ns'' - The namespace of the class. Don't prefix it with ''\''.
* ''handle_fcall'', ''handle_propget'', ''handle_propset'', ''handle_propunset'' and ''handle_propisset'' - these are ''zend_function'' pointers (or ''NULL'') that can be used to populate the respective fields in the ''zend_class_entry'' structure.
Example:
static zend_function_entry myclass_functions[] = {
PHP_ME(myclass, myMethod, NULL, 0),
...
{NULL, NULL, NULL, 0, 0}
};
...
ZEND_MODULE_STARTUP_D(myext)
{
zend_class_entry ce;
...
INIT_CLASS_ENTRY(ce, "MyClass", myclass_functions);
...
}
Do not further modify the class entry. Other modifications should be made in the class entry returned upon class registration. In particular, setting the class flags (e.g. final) at this point will not work.
==== Class Registration ====
The registration step serves two purposes:
- Automates the definition of certain aspects of the class definition (the ''zend_class_entry'' structure)
- Exposes the class to userspace.
The following functions/macros are available:
/* Functions */
zend_class_entry *zend_register_internal_class(zend_class_entry *class_entry TSRMLS_DC)
zend_class_entry *zend_register_internal_class_ex(zend_class_entry *class_entry,
zend_class_entry *parent_ce, char *parent_name TSRMLS_DC)
zend_class_entry *zend_register_internal_interface(
zend_class_entry *orig_class_entry TSRMLS_DC)
int zend_register_class_alias_ex(const char *name, int name_len,
zend_class_entry *ce TSRMLS_DC)
/* Macros (they expand to zend_register_class_alias_ex, so return an int) */
zend_register_class_alias(name, ce)
zend_register_ns_class_alias(ns, name, ce)
The functions/macros with "alias" in their name only expose the class to userspace; they do not change the class entry in any way. In general, these should be used only if the class entry pointed to by the argument was previously created with a ''zend_register_*'' function.
The function ''zend_register_internal_class_ex'' should be used when defining a subclass. If ''parent_ce'' is given, the corresponding class will be used as the parent. If it is ''NULL'' and ''parent_name'' is not ''NULL'', the given superclass name will be resolved. If both are ''NULL'', it will behave like ''zend_register_internal_class''.
The ''zend_register_internal_*'' functions execute these steps:
- Allocate a new class entry structure.
- Copy (in a shallow fashion) the passed class entry into the allocated one.
- Set its type to ''ZEND_INTERNAL_CLASS''.
- Initialize the new class entry structure by allocating and initializing its hash tables and resetting a few "scalar" fields (the magic methods set in the original class entry through ''INIT_OVERLOADED_CLASS_ENTRY_EX'' are not replaced).
- Set the flags of the class entry to none or to ''ZEND_ACC_INTERFACE'' (according to the function called).
- Convert the ''zend_function_entry'' structures into ''zend_function'''s of type ''ZEND_INTERNAL_FUNCTION''. These functions are added to the class entry function table. If methods are found whose names match those of magic methods, the corresponding class entry fields are set.
- If a parent is given, execute operations related to inheritance, e.g. copying inherited functions from the parent.
After class registration, the original ''zend_class_entry'' variable should not be used anymore.
After registration, it's also possible to retrieve the ''zend_class_entry'' variable through the class name. This is done with ''zend_lookup_class'':
int zend_lookup_class(const char *name, int name_length, zend_class_entry ***ce TSRMLS_DC);
Notice the triple (not double) indirection. In practice, most extensions opt to use a global variable (and even export it for other extensions through the macro ''ZEND_API'') so as to avoid the performance penalty associated with the function call/hash table lookup.
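The triple indirection can be modelled without the Zend headers. In the sketch below, which is only an illustration, the class table stores pointers to class entries and the lookup hands back the address of the stored slot rather than a copy of the pointer; that is one way to see where the third level comes from. ''model_class_entry'', ''model_class_table'' and ''model_lookup_class'' are hypothetical names, and the linear search with an exact string match stands in for the hash table lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SUCCESS 0
#define MODEL_FAILURE (-1)
#define MODEL_TABLE_SIZE 4

/* Hypothetical stand-in for zend_class_entry. */
typedef struct model_class_entry {
    const char *name;
} model_class_entry;

/* The table stores pointers to class entries, not the entries themselves. */
typedef struct model_class_table {
    model_class_entry *slots[MODEL_TABLE_SIZE];
    size_t used;
} model_class_table;

/* On success, *ce receives the address of the slot holding the pointer. */
static int model_lookup_class(model_class_table *table, const char *name,
                              model_class_entry ***ce)
{
    size_t i;

    assert(table != NULL && name != NULL && ce != NULL);
    for (i = 0; i < table->used; i++) {
        if (strcmp(table->slots[i]->name, name) == 0) {
            *ce = &table->slots[i];
            return MODEL_SUCCESS;
        }
    }
    return MODEL_FAILURE;
}
```

A caller declares a ''model_class_entry %%**%%pce'', passes ''&pce'' and, on success, dereferences once (''*pce'') to obtain the class entry pointer.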
==== Properties and Constants ====
In the module startup, after the PHP class is registered, it is time to add constants and properties.
For constants, the Zend API exposes the following functions:
int zend_declare_class_constant(zend_class_entry *ce, const char *name,
size_t name_length, zval *value TSRMLS_DC)
int zend_declare_class_constant_null(zend_class_entry *ce, const char *name,
size_t name_length TSRMLS_DC)
int zend_declare_class_constant_long(zend_class_entry *ce, const char *name,
size_t name_length, long value TSRMLS_DC)
int zend_declare_class_constant_bool(zend_class_entry *ce, const char *name,
size_t name_length, zend_bool value TSRMLS_DC)
int zend_declare_class_constant_double(zend_class_entry *ce, const char *name,
size_t name_length, double value TSRMLS_DC)
int zend_declare_class_constant_stringl(zend_class_entry *ce, const char *name,
size_t name_length, const char *value, size_t value_length TSRMLS_DC)
int zend_declare_class_constant_string(zend_class_entry *ce, const char *name,
size_t name_length, const char *value TSRMLS_DC)
These all return ''SUCCESS'' or ''FAILURE''. They are all straightforward to use, with the exception of ''zend_declare_class_constant''. The passed zval should be allocated with ''ALLOC_PERMANENT_ZVAL'' (and then initialized and the intended value set).
For properties, the following functions are available:
int zend_declare_property(zend_class_entry *ce, char *name, int name_length,
zval *property, int access_type TSRMLS_DC)
int zend_declare_property_ex(zend_class_entry *ce, const char *name,
int name_length, zval *property, int access_type, char *doc_comment,
int doc_comment_len TSRMLS_DC)
int zend_declare_property_null(zend_class_entry *ce, char *name,
int name_length, int access_type TSRMLS_DC)
int zend_declare_property_bool(zend_class_entry *ce, char *name,
int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_long(zend_class_entry *ce, char *name,
int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_double(zend_class_entry *ce, char *name,
int name_length, double value, int access_type TSRMLS_DC);
int zend_declare_property_string(zend_class_entry *ce, char *name,
int name_length, char *value, int access_type TSRMLS_DC);
int zend_declare_property_stringl(zend_class_entry *ce, char *name,
int name_length, char *value, int value_len, int access_type TSRMLS_DC)
These are analogous to their ''zend_declare_class_constant*'' counterparts, with the following differences:
* There is a ''zend_declare_property_ex'' that accepts a doc comment. The doc comment can be retrieved through reflection.
* All functions accept an access type.
The access type flags are taken from the ''ZEND_ACC_*'' family. See under the [[#zend_function_entry array|''zend_function_entry'' array section]].
These apply to properties and can be set by the extension programmer:
* ''ZEND_ACC_STATIC'' - define a property as static.
* ''ZEND_ACC_PUBLIC'', ''ZEND_ACC_PROTECTED'', ''ZEND_ACC_PRIVATE'' - only one of these flags can be included. If none is included, it will default to ''ZEND_ACC_PUBLIC''.
These are used internally and should not be passed to the functions above:
* ''ZEND_ACC_CHANGED'' - set in instance properties duplicated in the subclass properties where the correspondent superclass property a) has ''ZEND_ACC_CHANGED'', b) has ''ZEND_ACC_PRIVATE'', c) has ''ZEND_ACC_SHADOW''.
* ''ZEND_ACC_SHADOW'' - set in instance properties copied from the superclass that are not duplicated in the subclass and which have ''ZEND_ACC_PRIVATE'' or ''ZEND_ACC_SHADOW''.
* ''ZEND_ACC_IMPLICIT_PUBLIC'' - formerly (?) used for properties implicitly public (e.g. dynamic properties, i.e., undeclared instance properties).
Note that ''zend_declare_property(_ex)'' also require a zval allocated with ''ALLOC_PERMANENT_ZVAL''.
Note also that interfaces cannot have properties and access level cannot be decreased in subclasses.
==== Other class definition tweaks ====
The class entry structure can be changed in other ways after registration.
See also [[#Iterators|iterators]] and [[#Serialization callbacks|serialization callbacks]].
=== Create object handler ===
Almost all internal classes will want to replace the class entry's ''create_object'' handler in order to be able to store arbitrary data in the object's data structure. See the section [[#data allocation and initialization|Data allocation and initialization]] for more on this.
=== Class flags ===
Class flags use the ''ZEND_ACC_*'' macro family. See under the [[#zend_function_entry array section|''zend_function_entry'' array]] section. At this point, the class may already have the flag ''ZEND_ACC_INTERFACE'' if you called ''zend_register_internal_interface''.
These can be set after class registration:
* ''ZEND_ACC_FINAL_CLASS'', ''ZEND_ACC_EXPLICIT_ABSTRACT_CLASS'' - define a class as final or abstract, respectively. It's unnecessary to explicitly set ''ZEND_ACC_EXPLICIT_ABSTRACT_CLASS'' if the class has (or inherits) abstract methods.
* ''ZEND_ACC_PUBLIC'', ''ZEND_ACC_PROTECTED'', ''ZEND_ACC_PRIVATE'' - only one of these flags can be included. If none is included, it will default to ''ZEND_ACC_PUBLIC''.
These are automatically set by the engine and should not be set by the programmer:
* ''ZEND_ACC_IMPLICIT_ABSTRACT_CLASS'' - set automatically for classes that have abstract methods. Interfaces may have it too. Internal classes are also automatically given ''ZEND_ACC_EXPLICIT_ABSTRACT_CLASS'' whenever an abstract method is found.
* ''ZEND_ACC_CLOSURE'' - used internally for objects that are closures.
* ''ZEND_ACC_IMPLEMENT_INTERFACES'' - the class implements one or more interfaces. See [[#Implement interfaces|below]].
=== Implement interfaces ===
A class may declare it implements one or more interfaces by calling the function:
void zend_class_implements(zend_class_entry *class_entry TSRMLS_DC,
int num_interfaces, ...)
The ellipsis represents one or more ''zend_class_entry *'' variables that point to the class entries of the interfaces to be implemented.
==== Designing subclassable classes ====
Designing internal classes so that they can be extended in userspace is simple. The subclass will have the same class entry ''create_object'' handler, not the default one which sets the standard [[#The_handler_table|object handlers]] and allocates a plain ''zend_object''. Therefore, if the internal class has a different handler table or its storage is a different data structure, that will not be a problem.
The only thing about which one must be careful is constructors. Subclasses may define a new constructor that does not call the parent constructor. If the internal class relies on the constructor to set a consistent internal state, it can be changed in the following alternative ways:
* Moving the necessary initialization to the ''create_object'' class entry handler.
* Overriding the ''[[#get_constructor|get_constructor]]'' handler. It could, for example, be modified to always return a function that does the necessary initializations, calls the default ''get_constructor'' handler (''zend_std_get_constructor'') and then executes the returned constructor, if any.
Often, the internal constructor requires several arguments to be passed. The constructor for the subclass may be defined so that it takes fewer or different arguments. This is clearly a problem that cannot be handled by the first approach. The second one can at least fail if not enough arguments are given or bad arguments are given, but even that's not a very good idea, because the arguments, even if apparently correct, may have different semantics. In sum, if the construction requires arguments, there is no good solution except requiring the super constructor to be called. This can be accomplished this way:
static zend_object_handlers object_handlers;
static zend_class_entry *ce_ptr;
static zend_function constr_wrapper_fun;
typedef struct test_object {
zend_object std;
zend_bool constructed; /* TestClass constructor was called? */
/* more properties follow */
...
} test_object;
static zend_object_value ce_create_object(zend_class_entry *class_type TSRMLS_DC)
{
zend_object_value zov;
test_object *tobj;
tobj = emalloc(sizeof *tobj);
zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);
tobj->constructed = 0;
#if PHP_VERSION_ID < 50399
zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
(copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
object_properties_init(&tobj->std, class_type);
#endif
zov.handle = zend_objects_store_put(tobj,
(zend_objects_store_dtor_t) zend_objects_destroy_object,
(zend_objects_free_object_storage_t) zend_objects_free_object_storage,
NULL TSRMLS_CC);
zov.handlers = &object_handlers;
return zov;
}
PHP_METHOD(testclass, __construct)
{
zval *this = getThis();
test_object *tobj = zend_object_store_get_object(this TSRMLS_CC);
assert(tobj != NULL);
tobj->constructed = (zend_bool) 1;
...
/* if there's an error that leaves the object in an invalid state and
* you have to throw an exception, also destroy the $this reference.
* The reason is that the exception may be caught in the constructor
* of the child class that's calling this constructor. */
if (bad_thing_happened()) {
/* Destroying only the $this reference will cause the object to leak;
* it will be destroyed on request shutdown, but you can prevent that
* by also destroying the object with:
* zval_dtor(this);
* But beware this will call both the destroy_object and the
* free_object handlers. If you want only the second to be called,
* you can call zend_object_store_ctor_failed() before */
ZVAL_NULL(this);
}
}
static zend_function *get_constructor(zval *object TSRMLS_DC)
{
/* Could always return constr_wrapper_fun, but it's unnecessary to call the
* wrapper if instantiating the superclass */
if (Z_OBJCE_P(object) == ce_ptr)
return zend_get_std_object_handlers()->
get_constructor(object TSRMLS_CC);
else
return &constr_wrapper_fun;
}
static void construction_wrapper(INTERNAL_FUNCTION_PARAMETERS) {
zval *this = getThis();
test_object *tobj;
zend_class_entry *this_ce;
zend_function *zf;
zend_fcall_info fci = {0};
zend_fcall_info_cache fci_cache = {0};
zval *retval_ptr = NULL;
unsigned i;
tobj = zend_object_store_get_object(this TSRMLS_CC);
zf = zend_get_std_object_handlers()->get_constructor(this TSRMLS_CC);
this_ce = Z_OBJCE_P(this);
fci.size = sizeof(fci);
fci.function_table = &this_ce->function_table;
fci.object_ptr = this;
/* fci.function_name = ; not necessary to bother */
fci.retval_ptr_ptr = &retval_ptr;
fci.param_count = ZEND_NUM_ARGS();
fci.params = emalloc(fci.param_count * sizeof *fci.params);
/* Or use _zend_get_parameters_array_ex instead of loop: */
for (i = 0; i < fci.param_count; i++) {
fci.params[i] = (zval **) (zend_vm_stack_top(TSRMLS_C) - 1 -
(fci.param_count - i));
}
fci.no_separation = 0;
fci_cache.initialized = 1;
fci_cache.called_scope = EG(current_execute_data)->called_scope;
fci_cache.calling_scope = EG(current_execute_data)->current_scope;
fci_cache.function_handler = zf;
fci_cache.object_ptr = this;
zend_call_function(&fci, &fci_cache TSRMLS_CC);
if (!EG(exception) && tobj->constructed == 0)
zend_throw_exception(NULL, "parent::__construct() must be called in "
"the constructor.", 0 TSRMLS_CC);
efree(fci.params);
/* retval_ptr may still be NULL, e.g. if the constructor threw */
if (retval_ptr)
zval_ptr_dtor(&retval_ptr);
}
static zend_function_entry ext_class_methods[] = {
PHP_ME(testclass, __construct, 0, ZEND_ACC_PUBLIC)
...
{NULL, NULL, NULL, 0, 0}
};
ZEND_MODULE_STARTUP_D(testext)
{
zend_class_entry ce;
memcpy(&object_handlers, zend_get_std_object_handlers(),
sizeof object_handlers);
object_handlers.get_constructor = get_constructor;
object_handlers.clone_obj = NULL;
INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
ce_ptr->create_object = ce_create_object;
constr_wrapper_fun.type = ZEND_INTERNAL_FUNCTION;
constr_wrapper_fun.common.function_name = "internal_construction_wrapper";
constr_wrapper_fun.common.scope = ce_ptr;
constr_wrapper_fun.common.fn_flags = ZEND_ACC_PROTECTED;
constr_wrapper_fun.common.prototype = NULL;
constr_wrapper_fun.common.required_num_args = 0;
constr_wrapper_fun.common.arg_info = NULL;
#if PHP_VERSION_ID < 50399
/* moved to common.fn_flags with rev 303381 */
constr_wrapper_fun.common.pass_rest_by_reference = 0;
constr_wrapper_fun.common.return_reference = 0;
#endif
constr_wrapper_fun.internal_function.handler = construction_wrapper;
constr_wrapper_fun.internal_function.module = EG(current_module);
return SUCCESS;
}
Another option is to check on every internal method call whether the native structure has been properly initialized by the native constructor. Since most instance methods will need to fetch the object, this is a good opportunity to do the check. For instance, the cairo extension does this:
static inline cairo_surface_object* cairo_surface_object_get(zval *zobj TSRMLS_DC)
{
cairo_surface_object *pobj = zend_object_store_get_object(zobj TSRMLS_CC);
if (pobj->surface == NULL) {
php_error(E_ERROR, "Internal surface object missing in %s wrapper, you must call parent::__construct in extended classes", Z_OBJCE_P(zobj)->name);
}
return pobj;
}
This has two disadvantages relative to the previous method:
- It defers the check until an instance method is called, instead of immediately when the problem occurs (when the user-land constructor doesn't call the parent native constructor).
- The check is made on every method call, instead of only once.
However, this is by far the more popular approach, since it's simple and portable -- it uses only stable parts of the API.
A variant of this strategy is to centralize the object state validation in the ''[[#get_method|get_method]]'' handler and either throw a fatal error or return a method that throws an exception from the handler in case the object state is invalid. This makes it easier to fix current code without replacing the calls to ''zend_object_store_get_object'' in every method implementation.
Finally, another option, certainly less complex but more limiting, is to make the superclass constructor final.
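The ''get_method'' variant can be illustrated with a small model that does not depend on the Zend headers. It is only a sketch under simplifying assumptions: ''native_object'', ''lookup_method'' and the two method functions are hypothetical names, method lookup by name is omitted, and a return value of -1 stands in for throwing an exception.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the native object structure. */
typedef struct native_object {
    int constructed; /* set to 1 by the native constructor */
    int value;
} native_object;

typedef int (*method_fn)(native_object *obj);

/* A regular instance method; it may assume the object is valid. */
static int method_get_value(native_object *obj)
{
    return obj->value;
}

/* Substitute method handed out for objects in an invalid state.
 * Returning -1 models "throw an exception". */
static int method_invalid_state(native_object *obj)
{
    (void) obj;
    return -1;
}

/* Stand-in for the get_method handler: the state validation lives here,
 * so the method bodies do not have to repeat it. */
static method_fn lookup_method(const native_object *obj)
{
    assert(obj != NULL);
    return obj->constructed ? method_get_value : method_invalid_state;
}
```

With this arrangement the method implementations stay unchanged; only the lookup decides whether the real method or the failing substitute is called.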
===== Iterators =====
TODO
===== Serialization callbacks =====
TODO
===== Object creation and destruction =====
Object creation involves these steps:
- Allocate and initialize the underlying data structure
- Store the object
- Build a reference to the object
- (optional) Call the constructor
Calling the constructor is uncommon internally because there are easier ways to initialize the object (calling a ''zend_function'' is verbose). The initialization steps that are common to all the objects of a given type can be done in step 1. The initialization of a particular instance (which e.g. depends on some other data, the kind of data that would be passed to a constructor) can be done in a separate auxiliary C function. Every time an object is instantiated internally, the programmer must also call this function to do instance-specific initialization. A constructor is still necessary to properly support the ''new'' operator, but this strategy does not imply duplication of code -- the internal implementation of the constructor may rely on the same auxiliary function.
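The division of labor described above can be modeled without the Zend API. In the following sketch all names (''point_object'', ''point_create'', ''point_init'', ''point_constructor'', ''point_instantiate'') are hypothetical; the constructor is reduced to a plain C function that receives already parsed arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical native structure. */
typedef struct point_object {
    int x, y;
    int initialized;
} point_object;

/* Step 1: initialization common to all objects of the type. */
static void point_create(point_object *p)
{
    assert(p != NULL);
    p->x = 0;
    p->y = 0;
    p->initialized = 0;
}

/* Auxiliary function: instance-specific initialization.
 * Returns 0 on success, -1 if the arguments are rejected. */
static int point_init(point_object *p, int x, int y)
{
    assert(p != NULL);
    if (x < 0 || y < 0)
        return -1;
    p->x = x;
    p->y = y;
    p->initialized = 1;
    return 0;
}

/* Path taken by `new`: the object already exists, and the constructor
 * merely forwards its parsed arguments to the auxiliary function. */
static int point_constructor(point_object *this_obj, int x, int y)
{
    return point_init(this_obj, x, y);
}

/* Path taken by internal code: create the object and initialize it
 * directly, without going through the constructor. */
static int point_instantiate(point_object *out, int x, int y)
{
    point_create(out);
    return point_init(out, x, y);
}
```

Both paths end in ''point_init'', so the validation and initialization logic exists only once.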
==== Data allocation and initialization ====
In general, this part is completely domain dependent. The programmer may allocate and initialize an object however they see fit.
However, zend standard objects (those with a class entry) rely on the class entry's ''create_object'' handler. Typically, these have a data structure whose pointer can be passed to functions that expect ''zend_object*''. Hence, the typical class entry ''create_object'' handler will look like ''test_create_object'' in the example below:
typedef struct test_object {
zend_object std;
/* more properties follow */
...
} test_object;
static zend_object_handlers object_handlers;
static zend_object_value test_create_object(zend_class_entry *class_type TSRMLS_DC)
{
zend_object_value zov;
test_object *tobj;
tobj = emalloc(sizeof *tobj);
zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);
#if PHP_VERSION_ID < 50399
zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
(copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
object_properties_init((zend_object*)tobj, class_type);
#endif
/* The destroy and free callbacks should be replaced if necessary */
zov.handle = zend_objects_store_put(tobj,
(zend_objects_store_dtor_t) zend_objects_destroy_object,
(zend_objects_free_object_storage_t) zend_objects_free_object_storage,
NULL TSRMLS_CC);
/* other specific stuff */
...
zov.handlers = &object_handlers;
return zov;
}
ZEND_MODULE_STARTUP_D(testext)
{
zend_class_entry ce;
zend_class_entry *ce_ptr;
memcpy(&object_handlers, zend_get_std_object_handlers(),
sizeof object_handlers);
/* replace necessary handlers */
...
INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
ce_ptr->create_object = test_create_object;
/* Other startup stuff */
...
return SUCCESS;
}
The ''create_object'' handler can also be ''NULL'', in which case the general operations listed in ''test_create_object'' are executed, except that a vanilla ''zend_object'' structure is initialized (instead of a ''test_object'').
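A pointer to ''test_object'' can be passed where a ''zend_object*'' is expected because C guarantees that a pointer to a structure, suitably converted, points to its first member. The following standalone sketch shows the same layout trick with hypothetical stand-in types (''base_object'' for ''zend_object'', ''derived_object'' for ''test_object''); it does not use the Zend headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for zend_object: the part that generic code knows about. */
typedef struct base_object {
    int class_id;
} base_object;

/* Stand-in for test_object: the base part must be the FIRST member. */
typedef struct derived_object {
    base_object std;
    int extra; /* extension-specific data */
} derived_object;

/* Generic code: only sees the base part. */
static int base_get_class_id(const base_object *obj)
{
    assert(obj != NULL);
    return obj->class_id;
}

/* Allocates the full structure and hands it out as a base pointer.
 * Returns NULL if the allocation fails. */
static base_object *derived_create(int class_id, int extra)
{
    derived_object *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    d->std.class_id = class_id;
    d->extra = extra;
    return &d->std; /* same address as d */
}

/* Extension code: recovers the full structure from the base pointer. */
static int derived_get_extra(base_object *obj)
{
    assert(obj != NULL);
    return ((derived_object *) obj)->extra;
}
```

The cast in ''derived_get_extra'' is only valid for objects that really were allocated as ''derived_object''; in the engine, this is what checking the class entry or the handler table is for.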
==== Object storage ====
Objects are accessed through their references; the only thing linking the references to object instances is an integer (the object handle). This handle is a key that allows access to the object data structure. How this is done depends entirely on the type of the object.
Of particular relevance are, of course, zend standard objects. These are stored in the '''objects store'''. The zend objects API exposes these functions:
/* Storage */
typedef void (*zend_objects_store_dtor_t)(void *object,
zend_object_handle handle TSRMLS_DC);
typedef void (*zend_objects_free_object_storage_t)(void *object TSRMLS_DC);
typedef void (*zend_objects_store_clone_t)(void *object,
void **object_clone TSRMLS_DC);
zend_object_handle zend_objects_store_put(void *object,
zend_objects_store_dtor_t dtor, zend_objects_free_object_storage_t storage,
zend_objects_store_clone_t clone TSRMLS_DC);
/* Retrieval */
void *zend_object_store_get_object(const zval *object TSRMLS_DC);
void *zend_object_store_get_object_by_handle(zend_object_handle handle TSRMLS_DC);
/* refcount related */
void zend_objects_store_add_ref(zval *object TSRMLS_DC);
void zend_objects_store_del_ref(zval *object TSRMLS_DC);
void zend_objects_store_add_ref_by_handle(zend_object_handle handle TSRMLS_DC);
void zend_objects_store_del_ref_by_handle_ex(zend_object_handle handle,
const zend_object_handlers *handlers TSRMLS_DC);
void zend_objects_store_del_ref_by_handle(zend_object_handle handle TSRMLS_DC);
zend_uint zend_objects_store_get_refcount(zval *object TSRMLS_DC);
/* Misc */
zend_object_value zend_objects_store_clone_obj(zval *object TSRMLS_DC);
/* zend_object_store_set_object:
* It is ONLY valid to call this function from within the constructor of an
* overloaded object. Its purpose is to set the object pointer for the object
* when you can't possibly know its value until you have parsed the arguments
* from the constructor function. You MUST NOT use this function for any other
* weird games, or call it at any other time after the object is constructed.
* (This is rarely used)
*/
void zend_object_store_set_object(zval *zobject, void *object TSRMLS_DC);
/* Called when the constructor was terminated by an exception. Prevents the
* "destroy object" store callback from being called */
void zend_object_store_ctor_failed(zval *zobject TSRMLS_DC);
/* Used to destroy all the objects in the store */
void zend_objects_store_free_object_storage(zend_objects_store *objects TSRMLS_DC);
The objects store can actually store any type of data structures; the data structure doesn't have to be an extension of ''zend_object''. The header file ''zend_objects.h'' provides some functions to deal exclusively with zend standard objects:
/* To be used in the create_object class entry handler to initialize the
* zend_object structure */
void zend_object_std_init(zend_object *object, zend_class_entry *ce TSRMLS_DC);
/* Despite the name, this is actually related to object freeing. It frees all
* the memory used by the inner structures of zend_object */
void zend_object_std_dtor(zend_object *object TSRMLS_DC);
/* The default implementation of the create_object handler */
zend_object_value zend_objects_new(zend_object **object,
zend_class_entry *class_type TSRMLS_DC);
/* The default implementation of the "destroy object" store callback.
 * Calls the PHP destructor, if any. */
void zend_objects_destroy_object(zend_object *object,
zend_object_handle handle TSRMLS_DC);
/* Alias of zend_object_store_get_object, except it returns a zend_object
* pointer instead of void* */
zend_object *zend_objects_get_address(const zval *object TSRMLS_DC);
/* Copies the properties of the old_object and calls the class entry
* clone handler. Used in the implementation of zend_objects_clone_obj
* In PHP > 5.3, it also initializes the properties beforehand. */
void zend_objects_clone_members(zend_object *new_object,
zend_object_value new_obj_val, zend_object *old_object,
zend_object_handle handle TSRMLS_DC);
/* Allocates a new object with zend_objects_new and clones the members.
* It's the default implementation of the clone object handler. The fact
* it uses zend_objects_new means you almost certainly will want to replace
* the clone object handler when implementing internal classes. */
zend_object_value zend_objects_clone_obj(zval *object TSRMLS_DC);
/* default implementation of the free storage store callback.
* Calls zend_object_std_dtor and then frees the object itself */
void zend_objects_free_object_storage(zend_object *object TSRMLS_DC);
The function ''zend_objects_store_put'' adds an object to the store. This is the function that must be called during the creation of the object, as exemplified in the listing of the [[#Data allocation and initialization|previous section]]. Any of the last three arguments may be ''NULL''.
* '''Destructor''': If ''NULL'', ''zend_objects_destroy_object'' is used instead, which calls the PHP destructor, if any. This is called prior to the "free storage" callback when destroying the object. Cleanup of the memory allocated to the object data structures is left to the "free storage" callback. This callback is not called if the object construction failed. If passing a custom store destructor callback, calling the PHP destructor can be delegated to ''zend_objects_destroy_object''.
* '''Free storage''': Used to free the object data structures. For vanilla zend objects, this should be ''zend_objects_free_object_storage''; if extending zend standard objects, in the custom callback one should delegate to ''zend_objects_free_object_storage'' the cleanup of the ''zend_object'' field and of the outer object data structure (hence, the call to ''zend_objects_free_object_storage'' should be the last thing). There's no default if ''NULL'' is specified.
* '''Clone''': Most likely, this should be ''NULL''. One should only use this callback if implementing objects without class entries and using ''zend_objects_store_clone_obj'' as a ''[[#clone_obj|clone_obj]]'' handler. Then, that function will call this callback, which should allocate a new object, use the passed double indirection pointer to store a pointer to it, and clone the passed object into this new one.
After the call to ''zend_objects_store_put'', the object will have reference count = 1 in the store.
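The interplay between handles, the reference count and the two cleanup callbacks can be modeled with a toy store. This is a deliberately simplified sketch, not the engine's implementation: every name in it is hypothetical, the table has a fixed size, a ''NULL'' callback is simply skipped (whereas the real store falls back to ''zend_objects_destroy_object'' for the destructor), and the clone callback is left out.

```c
#include <assert.h>
#include <stddef.h>

#define STORE_SIZE 8

typedef void (*store_callback_t)(void *object);

typedef struct store_bucket {
    void *object;
    store_callback_t destroy;      /* "destroy object" callback, may be NULL */
    store_callback_t free_storage; /* "free storage" callback, may be NULL */
    unsigned refcount;
    int in_use;
    int skip_destroy;              /* set when construction failed */
} store_bucket;

static store_bucket store[STORE_SIZE];

/* Adds an object and returns its handle, or -1 if the table is full.
 * The new entry starts with a reference count of 1. */
static int store_put(void *object, store_callback_t destroy,
                     store_callback_t free_storage)
{
    int handle;
    for (handle = 0; handle < STORE_SIZE; handle++) {
        store_bucket *b = &store[handle];
        if (!b->in_use) {
            b->object = object;
            b->destroy = destroy;
            b->free_storage = free_storage;
            b->refcount = 1;
            b->in_use = 1;
            b->skip_destroy = 0;
            return handle;
        }
    }
    return -1;
}

static store_bucket *store_bucket_for(int handle)
{
    assert(handle >= 0 && handle < STORE_SIZE && store[handle].in_use);
    return &store[handle];
}

static void *store_get(int handle)
{
    return store_bucket_for(handle)->object;
}

static void store_add_ref(int handle)
{
    store_bucket_for(handle)->refcount++;
}

/* Models zend_object_store_ctor_failed: suppress the destroy callback. */
static void store_ctor_failed(int handle)
{
    store_bucket_for(handle)->skip_destroy = 1;
}

/* Drops one reference; at zero, runs destroy (unless suppressed), then
 * free_storage, and finally releases the slot for reuse. */
static void store_del_ref(int handle)
{
    store_bucket *b = store_bucket_for(handle);
    if (--b->refcount > 0)
        return;
    if (b->destroy != NULL && !b->skip_destroy)
        b->destroy(b->object);
    if (b->free_storage != NULL)
        b->free_storage(b->object);
    b->in_use = 0;
}

/* Example callbacks that record the order in which they ran. */
static char trace[16];
static size_t trace_len;

static void trace_append(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

static void trace_destroy(void *object) { (void) object; trace_append('D'); }
static void trace_free(void *object)    { (void) object; trace_append('F'); }
```

In this model an object is never destroyed explicitly; it goes away when the last reference is dropped, and a failed construction skips only the "destroy object" callback, never the "free storage" one.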
==== Object reference creation ====
If we're creating a zend standard object, the ''create_object'' handler already returned a ''zend_object_value''. The creation of an object reference zval is handled automatically by the ''new'' operator.
To instantiate new objects internally, the following functions and macros are available:
int object_init(zval *arg);
int object_init_ex(zval *arg, zend_class_entry *ce);
/* This function requires 'properties' to contain all props declared in the
* class and all props being public. If only a subset is given or the class
* has protected members then you need to merge the properties separately by
* calling zend_merge_properties(). */
int object_and_properties_init(zval *arg, zend_class_entry *ce,
HashTable *properties);
/* This function should be called after the constructor has been called
* because it may call __set from the uninitialized object otherwise. */
void zend_merge_properties(zval *obj, HashTable *properties,
int destroy_ht TSRMLS_DC);
These all take an allocated and initialized (''INIT_ZVAL'') or partially initialized (''INIT_PZVAL'') ''zval'' pointer. ''object_init'' is not particularly useful, since it will instantiate a ''stdClass'' object. ''object_and_properties_init'' also allows efficient initialization of the object properties, but it has the limitations indicated in the comments. If all instances are to be initialized with the same property values, the default property values, defined when the class is registered, should be used instead.
==== Flow of the construction/destruction process ====
This is an overview of the process of object construction for zend standard object derivatives.
For internal instantiations:
- Allocate and (partially) initialize a ''zval *''.
- Call ''object_init_ex''. A pointer to the class entry should be available.
- Call the class entry's ''create_object'' handler.
- Allocate the object structure. This structure's first field should be a ''zend_object''.
- Call ''zend_object_std_init'' to initialize the ''zend_object'' part of the object data.
- Copy the default properties from class entry into the properties hash table of the new object. In PHP >= 5.3.99, ''object_properties_init'' should be called instead because non-dynamic properties are stored in C arrays instead of the properties hash table (though the hash table is still used when it's requested or when there are dynamic properties).
- Call ''zend_objects_store_put'', passing a custom "destroy object" callback which does cleanup specific to properly constructed objects and a custom "free storage" callback that frees all the memory and other resources taken by the object (which is always called).
- Assign the return value of ''zend_objects_store_put'' to the ''zend_object_value'' that is to be returned.
- Set the field ''handlers'' of ''zend_object_value'' that's to be returned to the appropriate object handlers table.
- Set the zval type to ''IS_OBJECT'' and the value to that returned by the ''create_object'' handler.
- Do post-creation initialization on the new object (the construction phase), typically through an auxiliary function.
For instantiations with ''new'':
- Call the class entry's ''create_object'' handler.
- (see above)
- ...
- Call the PHP constructor, if any. Typically, the internal implementation of the constructor delegates the construction task to the same auxiliary function referred to in the last step of the list before.
Stored objects should not be destroyed explicitly; in fact, the store API doesn't even expose a function to destroy a particular object. Instead, the destruction should be managed through the refcount. When the reference count hits 0, the store calls the "destroy object" store callback (unless the object construction failed), then the "free storage" callback, and finally removes the entry from its table. See also the ''[[#add_ref|add_ref]]'' and ''[[#del_ref|del_ref]]'' handlers.