This is a bottom-up approach to Zend 2 objects.
Despite the wording, this document is not a specification; it results from analyzing the PHP implementation. Since it attempts to extract general rules from specific code snippets, it may contain wrong inferences. If you find errors, correct them (requires permission to edit the wiki) or send an e-mail to glopes ~at~ nebm.ist.utl.pt.
Definitions
We'll deal with three separate entities here: references, objects and classes. Classes represent a type and define the behavior of all the objects (the instances of that class) of that type. Each object will typically have one or more references to it. These are abstract concepts; in terms of implementation in PHP, references are mapped to object zvals, and the two terms are here used interchangeably.
Internally, the word object is overloaded to also mean “references” -- to avoid confusion, I shall not use the term object to refer to references, reserving it for the objects (either the concept or the storage of their data in memory). Additionally, the term type may mean having a certain handler table. Two classes may have different behavior (different methods, etc.) and share the same handler table. Finally, the term reference, when applied to a zval, may also mean the zval is part of a reference set (Z_ISREF) -- the context should make clear which meaning is intended.
The object zval
A zval that is an object reference will have the type IS_OBJECT. In this case, its value, which can be retrieved with Z_OBJVAL, will be of type zend_object_value, which is defined like this:
typedef struct _zend_object_value {
    zend_object_handle handle;       /* retrieve with Z_OBJ_HANDLE(zval) */
    zend_object_handlers *handlers;  /* retrieve with Z_OBJ_HT(zval) */
} zend_object_value;
The field handle (which can be retrieved with the Z_OBJ_HANDLE macro family) identifies the object to which the reference refers among those of the same type; zend_object_handle is actually just an integer -- that is, an object is uniquely identified by an integer and a zend_object_handlers structure. Consequently, two references are identical in the sense of the === operator if and only if they share both the handlers and the handle.
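The identity rule can be sketched with simplified stand-in types. This is only an illustration, not engine code: mock_handlers, mock_object_value and mock_is_identical are hypothetical names that mirror the shape of zend_object_value; a real extension would include the Zend headers and use macros such as Z_OBJ_HANDLE and Z_OBJ_HT.

```c
#include <assert.h>

/* Hypothetical stand-ins for zend_object_handlers and zend_object_value. */
typedef struct mock_handlers { int unused; } mock_handlers;

typedef struct mock_object_value {
    unsigned int handle;            /* plays the role of zend_object_handle */
    const mock_handlers *handlers;  /* plays the role of the handler table pointer */
} mock_object_value;

/* Two references denote the same object iff both the handle and the
 * handler table pointer match. */
int mock_is_identical(mock_object_value a, mock_object_value b)
{
    return a.handle == b.handle && a.handlers == b.handlers;
}
```

Note that the same integer handle paired with a different handler table denotes a different object in this model.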
The handlers field has another purpose besides identifying the referred object. The structure it points to (the handler table) defines, at a low level, the behavior of the objects of that type. A specific handler function can be accessed with Z_OBJ_HANDLER(zval, hf).
The lifecycle of the references is associated with that of the objects. We likely want the object to be destroyed once there are no more references to it. The handler table has two entries for that purpose: add_ref and del_ref. The first should be called when a new reference to that object is created; the second when one is deleted. Functions like zval_copy_ctor, zval_ptr_dtor and zval_dtor take care of calling the handlers. So there are two types of refcounts one should have in mind when dealing with objects -- those of references (i.e., the zvals) and those of the objects themselves.
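The object-level refcount can be sketched as follows. This is a simplified model under stated assumptions, not the engine's object store: mock_object, mock_add_ref and mock_del_ref are hypothetical names, and "destroying" the object is reduced to setting a flag.

```c
#include <assert.h>

/* Hypothetical object record: counts how many reference zvals point to it. */
typedef struct mock_object {
    int refcount;   /* number of live references to the object */
    int destroyed;  /* set once the last reference is gone */
} mock_object;

/* Role of add_ref: a new reference zval to the object was created. */
void mock_add_ref(mock_object *obj)
{
    obj->refcount++;
}

/* Role of del_ref: a reference zval was destroyed; the object itself is
 * destroyed when no references remain. */
void mock_del_ref(mock_object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        obj->destroyed = 1;
    }
}
```

The refcount of each individual zval is a separate counter and is not modeled here.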
The handler table
Let's now explore the members of the handler table, which define the behavior of a class at a low level. Once we introduce the zend standard object, we'll see what their default values are (TODO).
typedef struct _zend_object_handlers {
    /* general object functions */
    zend_object_add_ref_t add_ref;
    zend_object_del_ref_t del_ref;
    zend_object_clone_obj_t clone_obj;
    /* individual object functions */
    zend_object_read_property_t read_property;
    zend_object_write_property_t write_property;
    zend_object_read_dimension_t read_dimension;
    zend_object_write_dimension_t write_dimension;
    zend_object_get_property_ptr_ptr_t get_property_ptr_ptr;
    zend_object_get_t get;
    zend_object_set_t set;
    zend_object_has_property_t has_property;
    zend_object_unset_property_t unset_property;
    zend_object_has_dimension_t has_dimension;
    zend_object_unset_dimension_t unset_dimension;
    zend_object_get_properties_t get_properties;
    zend_object_get_method_t get_method;
    zend_object_call_method_t call_method;
    zend_object_get_constructor_t get_constructor;
    zend_object_get_class_entry_t get_class_entry;
    zend_object_get_class_name_t get_class_name;
    zend_object_compare_t compare_objects;
    zend_object_cast_t cast_object;
    zend_object_count_elements_t count_elements;
    zend_object_get_debug_info_t get_debug_info;
    zend_object_get_closure_t get_closure;
} zend_object_handlers;
Except where indicated, the arguments are guaranteed not to be null pointers.
add_ref
void (*add_ref)(zval *object TSRMLS_DC)
- Called when a new zval referring to the object is created. Called by zval_copy_ctor.
- It may also be called when there is a need to hold some other kind of reference to the object. For instance, some instance method of an object a of class A may create and return an object b of class B that depends on data of the object a that spawned it. In that case, a possible strategy is to add a reference to a when b is created and store in b the handle of a so that the reference can be deleted when b is destroyed. Alternatively, b may store (e.g. as a property) an object zval referring to a.
- Should not be NULL.
del_ref
void (*del_ref)(zval *object TSRMLS_DC)
- Called when a zval referring to the object is destroyed. Called by zval_dtor and therefore by functions that call zval_dtor, such as zval_ptr_dtor and the convert_to family.
- See also add_ref.
- Should not be NULL.
clone_obj
zend_object_value (*clone_obj)(zval *object TSRMLS_DC)
- Called when an object is to be cloned (associated with usage of the clone operator in user space).
- Should return a zend_object_value that refers to a newly created object that is equal to the object referred to by the passed reference. The two objects should not be identical, i.e., == applied to the references should return true but === should return false. The compare_objects handlers must be the same since that is a requisite for == returning true, but typically, as one would want the two objects to have identical behavior, they ought to share all the handlers.
- Zend standard object extensions can call zend_objects_clone_members to copy properties; this function also calls the class entry clone function (typically __clone).
- The created object should be initialized as if it had one reference.
- May be NULL to forbid cloning.
read_property
zval *(*read_property)(zval *object, zval *member, int type TSRMLS_DC)
- Retrieves a property of an object as a pointer to a zval; corresponds to $obj->prop in (mainly) a reading context in userspace.
- If the argument member is not of type IS_STRING, you should convert it (after deep copying!). You may use the convert_to_string function for the conversion.
- The argument type is of the BP family, which consists of:
/* var status for backpatching */
#define BP_VAR_R         0  /* read */
#define BP_VAR_W         1  /* write */
#define BP_VAR_RW        2  /* read/write */
#define BP_VAR_IS        3  /* check for existence */
#define BP_VAR_NA        4  /* if not applicable; unused? */
#define BP_VAR_FUNC_ARG  5  /* function argument */
#define BP_VAR_UNSET     6  /* unset */
- The types BP_VAR_R and BP_VAR_IS are the most relevant here. Typically, this decides how chatty the implementation will be. Note that it's the has_property handler that is called when checking the existence of the property; BP_VAR_IS is used when retrieving a property to check whether it has a (sub-)dimension or (sub-)property, as in empty($rarF->prop[7][8]).
- However, BP_VAR_W, BP_VAR_RW and BP_VAR_UNSET are also possible values if the get_property_ptr_ptr handler is undefined or fails. These types are used in write-like operations wherein a (sub-)dimension or (sub-)property of the property value is being targeted (e.g. $obj->prop[32] = $h or unset($obj->prop[32])); the type BP_VAR_W may also appear when assigning or passing the property (or (sub-)property/dimension thereof) by reference. If these cases are to be supported, one should return either a reference (in the Z_ISREF sense) or a proxy object (see the get handler for more information), otherwise one should warn the user -- for instance, the default handler emits a warning if returning for write a zval that is not of type IS_OBJECT and is not referenced anywhere else because the write would necessarily have no effect (a zval of type IS_OBJECT is permitted because it may be a proxy object).
- The reference count of a returned zval which is not otherwise referenced by the extension or the engine's symbol table should be 0. Likewise, the reference count of a zval being returned that exists elsewhere should not be incremented by this handler (it might be, but only on account of some side effect, for instance creating the property on the fly and storing it in a hash table for future retrieval, not owing to the call to the handler per se).
- Note this handler itself (and the other ones) has no notion of accessibility.
- Should return EG(uninitialized_zval_ptr) if the property is undefined.
- Should not be NULL, even for classes that have no properties, though it's not strictly forbidden. An empty implementation can be:
zval *read_property_empty_implementation(zval *object, zval *member, int type TSRMLS_DC)
{
    /* maybe raise an error/exception here */
    return EG(uninitialized_zval_ptr);
}
- NOTE (as of PHP 5.3.2): You may think the behavior of read_property is the same as that of read_dimension when get_property_ptr_ptr is undefined/fails. There are, however, subtle differences.
- For pre- and post-increments and -decrements on properties, there is a pair of read_property/write_property calls; the read_property call has type BP_VAR_R. For the same operations on dimensions, there is only a read_dimension call with BP_VAR_RW type; the operation may succeed only if a reference (Z_ISREF) or proxy object is returned. Note that compound assignments on dimensions (e.g. $obj['index'] += 1) are handled with a pair of operations, just like properties.
- If read_property returns a zval with refcount 1 not belonging to a reference set in the context of a write-like operation (see discussion of the type argument above), this zval will be turned into a reference. In the case of read_dimension, a notice would be emitted and a reference set with the left part of the assignment and the dimension would not be built. Note that this special “turn into ref” case would not work if the returned zval had a higher refcount. Consider the following implementations:
zval *z;

zval *read_property(zval *object, zval *offset, int type TSRMLS_DC)
{
    return z;
}

void write_property(zval *object, zval *offset, zval *value TSRMLS_DC)
{
    z = value;
    /* Z_SET_ISREF_P(z); -- if uncommented, both would work */
    zval_add_ref(&value);
}
Then, these two scripts would have different results:
$obj['prop'] = "hhh";
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "bbb"

$str = "hhh";
$obj['prop'] = $str;
$a = &$obj['prop'];
$a = "bbb";
echo $obj['prop']; //echoes "hhh"
write_property
void (*write_property)(zval *object, zval *member, zval *value TSRMLS_DC)
- Writes the value of a property of an object; corresponds to $obj->prop in a writing context in userspace.
- If the member argument is not a string, it should be (deep) copied and converted into one.
- The calling function does not assume that the value will be accepted and stored. Hence, if the value is to be stored by the handler, its refcount should be incremented. No modifications to the value are allowed. If one is to modify the value before storing it, one must deep copy it first (i.e., include a call to zval_copy_ctor). The only exception is if the value's refcount is 0 -- in that case, one may modify it at will and, if one wants to copy the value zval into another zval, one can make a shallow copy.
- Separate zvals that are references (in the sense of Z_ISREF). Note that this handler is not meant to deal with reference assignments such as $obj->prop = &$var or $var = &$obj->prop, which are the correct ways of making property values part of a reference set (the engine will call get_property_ptr_ptr or, failing that, read_property).
- Should not be NULL, even for classes that have no modifiable properties, though it's not strictly forbidden. An empty implementation will do nothing except maybe raise an error/exception.
read_dimension
zval *(*read_dimension)(zval *object, zval *offset, int type TSRMLS_DC)
- This is similar to read_property, except it's called in response to attempts to treat the object as an array, as in $obj['key'] in (mainly) a reading context.
- You may not modify (except transiently) the offset argument.
- The argument offset can be a C NULL (when $obj[] is used). Despite the name, offset may be any type of zval -- if it is an object reference with a get handler, you may want to call it and use that result as an offset instead.
- The remarks made in read_property with respect to the type argument also apply.
- Since there's no analogue of get_property_ptr_ptr, you ought to return a reference (in the sense of Z_ISREF) or a proxy object (see the get handler) in write-like contexts (types BP_VAR_W, BP_VAR_RW and BP_VAR_UNSET), though it's not mandatory. If read_dimension is being called in a write-like context such as in $val =& $obj['prop'], and you return neither a reference nor an object, the engine emits a notice. Obviously, returning a reference is not enough for those operations to work correctly; it is necessary that modifying the returned zval actually has some effect. Note that assignments such as $obj['key'] = &$a are still not possible -- for that one would need the dimensions to actually be storable as zvals (which may or may not be the case) and two levels of indirection.
- The remarks made relative to the refcount of the returned value in read_property also apply. Should return a C NULL in case the offset does not exist (the engine will then use error_zval or uninitialized_zval depending on whether it's a read or write context).
- May be NULL when the object is not to be treated as an array.
write_dimension
void (*write_dimension)(zval *object, zval *offset, zval *value TSRMLS_DC)
- This is similar to write_property, except it's called in response to attempts to treat the object as an array, as in $obj['key'] in a writing context.
- You may not modify (except transiently) the offset argument.
- The argument offset can be a C NULL (when $obj[] is used). It can be any type of zval -- if it is an object reference, you may call its get handler and use the result as the offset instead.
- The same remarks made in write_property apply -- the handler should increment the refcount of value if storing it and should not change the value zval in most circumstances (a deep copy should be made first).
- If a reference is passed, it should be separated. While you may think you may want to store a reference so that the value of the dimension may be changed indirectly (through another symbol), this is not the way (what you can do is $obj['key'] = $a; $a = &$obj['key'] -- the first assignment is handled by write_dimension, the second by read_dimension returning a reference or a proxy object).
- May be NULL when the object is not to be treated as an array.
get_property_ptr_ptr
zval **(*get_property_ptr_ptr)(zval *object, zval *member TSRMLS_DC)
- Returns a property with double indirection so that the caller may directly replace the zval. This may be for efficiency reasons (a read/write pair of calls would otherwise be needed and would be unnecessary if the underlying storage of the properties is in fact zvals) or because the nature of the operation requires double indirection -- namely, send by reference and assign by reference of object properties.
- If the member argument is not a string, it should be converted.
- The rules about the refcount of the returned value given for the read_property handler also apply here, though it doesn't make much sense to return a pointer to memory that is not held by the extension or the engine here (the case with refcount 0).
- If the property does not exist, the handler may try to create it on the fly. If one initializes these properties to the same value, one should consider using a single initialization zval, for instance EG(uninitialized_zval) -- allocate a zval*, give it the value of EG(uninitialized_zval_ptr), increment its refcount and return the address of the created pointer (see the behavior of zend_std_get_property_ptr_ptr()).
- Returning a C NULL signifies failure and causes a fallback to the read_property or write_property handlers.
- Prefer an empty implementation (always returning a C NULL) to a NULL in the handler table. A NULL in the handler table ought to have the same effect, but there are bugs.
get
zval* (*get)(zval *object TSRMLS_DC)
- This handler is called when attempting to treat the object as a scalar value in a read context. That situation arises when using the pre- and post-increment and -decrement operators and compound assignment operators (e.g. $obj++ and $obj += 6) -- it is then followed by a call to set -- and as a fallback for type conversions in case there is no cast_object handler (it is also preferred over cast_object in a few circumstances, such as when defining a constant or comparing objects to scalars).
- A common application for get/set handlers is when implementing proxy objects. These are used when the underlying storage of the properties or dimensions is not zvals (e.g. when the PHP object is an interface to an object in another language). In that case, it would be impossible to have those properties/dimensions part of a reference set and operate in this manner:
$a = &$obj->prop;
$a++;
$a = 6;
- Proxy objects make this possible. If one returns from read_property or read_dimension an IS_OBJECT zval with get and set handlers, the read in the post-increment would be handled by the get handler and the writes in the post-increment and the assignment would be handled by the set handler.
- Should return a newly allocated zval with refcount 0.
- One should not expect calls to get to be followed by calls to set in the context of compound assignments and increments/decrements. In $obj['prop']++, read_dimension would be called with offset “prop” and type BP_VAR_R. Suppose it then returns a proxy object. The get handler of this proxy object would be called in order to generate a zval; the zval would be separated if not a reference and incremented; then not set but instead write_dimension would be called so as to write the result.
- Should not return NULL; if implemented, it must return a valid zval.
- May be NULL; in that case the object cannot be treated as a scalar in the mentioned circumstances.
set
void (*set)(zval **object, zval *value TSRMLS_DC)
- This handler is called when attempting to make a (non-reference) assignment to an object zval, including when using the pre- and post-increment and -decrement operators and compound assignment operators (e.g. $obj++ and $obj += 6) -- in this case, preceded by a call to get.
- Note the double indirection on the object argument. The pointed zval may be changed or completely replaced by changing the value of *object. Remember to adjust the refcounts and consider whether the zval is part of a reference set.
- The remarks made in write_property about the value argument also apply.
- See also the description of get.
- May be NULL; in that case the object cannot be treated as a scalar in the mentioned circumstances.
has_property
int (*has_property)(zval *object, zval *member, int has_set_exists TSRMLS_DC)
- This handler is called whenever the engine needs to determine whether a property exists.
- If the member argument is not a string, it should be (deep) copied and converted.
- The parameter has_set_exists can take the following values:
  - 0 -- check whether the property exists and is not NULL; used by the isset operator
  - 1 -- check whether the property exists and is true; semantics of empty; one may want to use zend_is_true
  - 2 -- check whether the property exists, even if it is NULL; used by the property_exists function. Note that this parameter slightly differs from has_dimension's check_empty in that the latter cannot take the value 2.
- An empty implementation ought not to emit an error/exception (or have any other side effects) even if the type does not admit properties and especially if has_set_exists is 2, so that property_exists can be quiet.
- Read also the note on the usage of the BP_VAR_IS type for the read_property handler.
- Should return either 0 (doesn't have the property) or 1 (has the property).
- Should not be NULL, though it's not strictly forbidden by the engine.
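The three has_set_exists modes can be sketched against a simplified property model. This is a minimal illustration, not the standard handler: mock_prop and mock_has_property are hypothetical, and the truthy flag stands in for what a real handler might compute with zend_is_true.

```c
#include <assert.h>

/* Hypothetical property slot: 'exists' says whether the property is defined,
 * 'is_null' and 'truthy' describe its current value. */
typedef struct mock_prop {
    int exists;
    int is_null;
    int truthy;
} mock_prop;

/* Returns 1 or 0 following the three has_set_exists modes described above. */
int mock_has_property(const mock_prop *p, int has_set_exists)
{
    if (!p->exists) {
        return 0;                       /* quiet: no error, no side effects */
    }
    switch (has_set_exists) {
    case 0:  return !p->is_null;        /* exists and is not NULL */
    case 1:  return p->truthy;          /* exists and is true */
    case 2:  return 1;                  /* exists, even if NULL */
    default: return 0;
    }
}
```

A property holding NULL therefore answers 0 in mode 0 but 1 in mode 2.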
unset_property
void (*unset_property)(zval *object, zval *member TSRMLS_DC)
- Called in order to unset an object property.
- If the member argument is not a string, it should be (deep) copied and converted.
- If member refers to a property that does not exist, this function should fail silently (no notices!). However, if the object type does not support properties, an error/exception may be emitted.
- Should not be NULL, though it's not strictly forbidden by the engine.
has_dimension
int (*has_dimension)(zval *object, zval *member, int check_empty TSRMLS_DC)
- Determines whether an object has a certain dimension.
- The argument check_empty has the same meaning as has_property's has_set_exists parameter, with the exception that it cannot take the value 2.
- This handler should have no side effects.
- Read also the note on the usage of the BP_VAR_IS type for the read_property handler.
- Should return either 0 (doesn't have the dimension) or 1 (has the dimension).
- May be NULL when the object is not to be treated as an array.
unset_dimension
void (*unset_dimension)(zval *object, zval *offset TSRMLS_DC)
- Called in order to unset an object's dimension.
- The argument offset may be of any type; if it is an object with a get handler, one may want to call it and use the result as the offset instead.
- It should fail silently (without notices) in case the offset refers to a dimension that does not exist.
- May be NULL if the object is not to be treated as an array.
get_properties
HashTable *(*get_properties)(zval *object TSRMLS_DC)
- Retrieves the object as a hash table. This is usually a hash table containing the object instance properties and some code may (incorrectly) use this hash table to retrieve object properties. This function is used, even in preference to cast_object, in explicit conversions (i.e. the convert_to_array family and convert_to_explicit_type, not convert_object_to_type) to convert an object into an array.
- In practice, you may use this function to return other data, for instance dimensions. Since several array functions (such as end, prev, next, reset, current, key, array_walk, array_walk_recursive and array_key_exists) call this handler when an object is passed, it can be used to provide a more array-like experience of the object (together with read_dimension, write_dimension, has_dimension, unset_dimension and count_elements).
- The garbage collector uses this handler to reach the zvals that the object is holding. If the hash table is lazily generated (on the first call to the handler) and it hasn't been built yet (it's the first call), it may be appropriate to refuse to do so (and return NULL) on calls by the garbage collector. This is especially true if such generation involves the creation of new zvals. The global GC_G(gc_active) tells whether the garbage collector is running.
- The zend_parse_parameters function has the specifiers H and A, which accept both arrays and objects.
- The Z_OBJPROP macro family are shortcuts to access this handler. They assume the handler exists! The Z_OBJDEBUG macro family falls back on this handler if get_debug_info doesn't exist.
- If the underlying storage of the hash table values is in fact zvals, you may return a hash table that stores the same zval * values. Depending on how the hash table is then exposed in userspace (whether reference sets are separated), this may allow indirect modification of the underlying storage. If those zvals are stored in a hash table, you can go further and return the hash table itself -- this will generally still not allow replacement/addition/deletion of the hash table's values in user space (e.g. turning an object into an array requires the hash table to be copied), yet may be faster and allow internal code to replace/add/delete entries directly in the hash table.
- The handler owns the hash table. Typically, this handler always returns the same hash table, which accompanies the life cycle of the object (is created when the object is created, etc.).
- The Zend engine does not forbid it to be NULL, but several extensions (including the standard extension) assume it exists. The built-in function get_object_vars assumes a Zend standard object if it exists. Prefer an implementation that returns an empty hash table.
- See also get_debug_info.
get_method
zend_function *(*get_method)(zval **object_ptr, char *method, int method_len TSRMLS_DC)
- Called in order to fetch a method as a zend_function.
- Should return NULL if the method does not exist, otherwise should return a zend_function. The rules for who owns the return value are as follows:
  - If the type is ZEND_INTERNAL_FUNCTION, then it's owned by the caller if and only if the subfield common.fn_flags has the flag ZEND_ACC_CALL_VIA_HANDLER. However, note that if the caller then uses the return value to make a function call, it should not free it since that will already have been done.
  - If the type is ZEND_OVERLOADED_FUNCTION or ZEND_OVERLOADED_FUNCTION_TEMPORARY, then it's owned by the caller.
  - For all other cases, the caller is not responsible for freeing the return value.
  - If the caller owns the result, and the type is not ZEND_OVERLOADED_FUNCTION, it should also free (with efree) the subfield common.function_name.
- The argument object_ptr is given with double indirection. Altering *object_ptr allows one to change the this pointer passed to the method and the called scope into the class entry of the written value. The refcount of the original value should be decreased, the new value's should be increased (or set to one if created from scratch). (unconfirmed)
- One ought to convert the method name to lowercase to mimic the usual (half-assed) case insensitiveness of method names. See zend_str_tolower_copy.
- May be NULL if the object is not to support method calls.
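The lowercase lookup can be sketched in plain C. This is only an illustration under stated assumptions: mock_tolower_copy is a hypothetical stand-in for zend_str_tolower_copy, and mock_find_method searches a hypothetical table of lowercase names instead of returning a zend_function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Writes a lowercased, null-terminated copy of src[0..len) into dest.
 * dest must have room for len + 1 bytes. */
void mock_tolower_copy(char *dest, const char *src, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        dest[i] = (char)tolower((unsigned char)src[i]);
    }
    dest[len] = '\0';
}

/* Hypothetical lookup: returns the index of the method in a table of
 * lowercase names, or -1 if it is not found or the name is too long. */
int mock_find_method(const char *const *table, size_t count,
                     const char *method, size_t method_len)
{
    char lc[64];
    size_t i;

    if (method_len >= sizeof lc) {
        return -1;
    }
    mock_tolower_copy(lc, method, method_len);
    for (i = 0; i < count; i++) {
        if (strcmp(table[i], lc) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

The lookup works on a copy, so the caller's method string is left unmodified.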
call_method
int (*call_method)(char *method, INTERNAL_FUNCTION_PARAMETERS)
- This handler is called whenever the engine tries to call a function with type ZEND_OVERLOADED_FUNCTION or ZEND_OVERLOADED_FUNCTION_TEMPORARY. It's conceptually related to the __call magic method.
- The method argument is a string that should identify the function to be called.
- This handler is unique in that the first parameter is not an object zval pointer. A zval pointer is included in the INTERNAL_FUNCTION_PARAMETERS and can be retrieved with getThis().
- May be NULL, but in that case you should not return zend_functions of type ZEND_OVERLOADED_FUNCTION or ZEND_OVERLOADED_FUNCTION_TEMPORARY from get_method or get_constructor (and neither should get_closure if object_ptr is filled with object zvals that do not have this handler).
get_constructor
zend_function *(*get_constructor)(zval *object TSRMLS_DC)
- This handler has the same semantics as get_method and is called to retrieve a function that is to perform initialization operations on the object.
- May be NULL.
get_class_entry
zend_class_entry *(*get_class_entry)(const zval *object TSRMLS_DC)
- Gives a pointer to a zend_class_entry. This structure provides a scope for object operations and defines PHP classes. The default handlers defer to this structure for much of their behavior; additionally, much functionality, such as reflection and the instanceof operator, is restricted to PHP classes.
- If implemented, all objects of the same class should have a get_class_entry handler returning the same value. Should not return NULL.
- You may use the Z_OBJCE macro family for accessing the return value of the handler. It resolves to zend_get_class_entry, which either returns NULL and emits a fatal error if the handler does not exist or returns the (non-null for correct implementations) result of this handler.
- The IS_ZEND_STD_OBJECT and HAS_CLASS_ENTRY macros (the last one should only be used if the zval is known to be an object reference) are a shortcut for determining whether a zval is of a type which has this handler implemented.
- Will never be NULL for Zend standard objects (and derivations thereof) and will be NULL for all other objects (by definition).
get_class_name
int (*get_class_name)(const zval *object, char **class_name, zend_uint *class_name_len, int parent TSRMLS_DC)
- Extracts a class name for display or reflection purposes. This name has no special meaning.
- If parent is 0, the name of the class of the passed object is being requested, otherwise it's the parent class of the passed object. The handler may return FAILURE if there is no parent class or it doesn't know.
- On success, *class_name should be set to a pointer to a null-terminated string allocated with non-persistent storage (emalloc) and *class_name_len should be set to the length of *class_name (excluding terminator).
- Should return SUCCESS or FAILURE. If it fails, *class_name and *class_name_len should retain their original values or be set to NULL/0.
- May be NULL. Note that, as of PHP 5.3.2, some portions of the standard extension expect the handler to exist and not fail when parent is 0.
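The output contract can be sketched outside the engine. This is a hedged illustration: mock_get_class_name, MOCK_SUCCESS and MOCK_FAILURE are hypothetical names, the object argument is omitted, and malloc stands in for emalloc so that the sketch compiles without the Zend headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MOCK_SUCCESS 0
#define MOCK_FAILURE -1

/* Hypothetical handler for a type with a fixed name and no parent class.
 * malloc stands in for emalloc; the caller frees the returned string. */
int mock_get_class_name(char **class_name, unsigned int *class_name_len, int parent)
{
    static const char name[] = "MockClass";
    char *copy;

    if (parent) {
        return MOCK_FAILURE;            /* no parent: outputs left untouched */
    }
    copy = malloc(sizeof name);
    if (copy == NULL) {
        return MOCK_FAILURE;
    }
    memcpy(copy, name, sizeof name);    /* copy includes the terminator */
    *class_name = copy;
    *class_name_len = (unsigned int)(sizeof name - 1);
    return MOCK_SUCCESS;
}
```

On failure the sketch leaves both output arguments at their original values, which is one of the two options the text allows.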
compare_objects
int (*compare_objects)(zval *object1, zval *object2 TSRMLS_DC)
- Compares two objects. Used for the operators ==, !=, <, >, <= and >=.
- The implementations should follow these rules -- for any objects a, b and c that share the same compare handler:
- compare(a, a) = 0
- sign(compare(a, b)) = -sign(compare(b, a)) where sign(x) is 1 if x is positive, -1 if it's negative and 0 if it's 0.
- if compare(a, b) = 0 and compare(b, c) = 0, then compare(a, c) = 0
- This means one must implement a total order.
- One may find an equivalent set of conditions in the documentation of Java's java.lang.Comparable.compareTo(T).
- The handler may return only -1, 0 and 1 (a < b, a = b and a > b). If not, one is encouraged to implement the handler so that if compare(a, b) > compare(a, c), then compare(b, c) < 0.
- Should not be NULL; a possible simple implementation is just returning the result of an object handle subtraction.
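A handle-based comparison can be sketched as below. It is a variation on the subtraction idea, not the engine's default: assuming handles are unsigned integers, a plain subtraction can wrap around, so this hypothetical mock_compare_handles compares the values and returns only -1, 0 or 1.

```c
#include <assert.h>

/* Three-way comparison of two object handles (assumed unsigned).
 * Comparing instead of subtracting avoids wrap-around on unsigned values. */
int mock_compare_handles(unsigned int h1, unsigned int h2)
{
    return (h1 > h2) - (h1 < h2);
}
```

This form satisfies compare(a, a) = 0 and the sign symmetry rule by construction.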
cast_object
int (*cast_object)(zval *readobj, zval *retval, int type TSRMLS_DC)
- Called when an object is to be converted into another type.
- If not defined or if the call fails, the engine will use fallback strategies that include calling get, or using a number of default conversion strategies (the strategies used for the standard objects).
- The readobj argument contains the object to be converted; it should not be modified in any way.
- The handler may assume readobj and retval have different values.
- The retval argument is an allocated zval on which the handler should write the result. It should first be initialized (INIT_ZVAL) ignoring its previous value.
- In case of error, FAILURE should be returned. If retval was already initialized and is holding further resources, it should be destroyed (as in zval_dtor) by the handler; if it was not initialized, it should be left untouched. In case of success, SUCCESS should be returned.
- See also get and get_properties.
- May be NULL.
count_elements
int (*count_elements)(zval *object, long *count TSRMLS_DC)
- Called to determine the count of some countable object. A count is a non-negative value.
- Objects that have array-like access will probably want to implement this, so that they can behave more like an array.
- Note that this handler is not used by the engine itself, only by count and other extensions.
- This handler writes a non-negative number in *count and returns SUCCESS if the passed object is countable; it returns FAILURE otherwise.
- May be NULL if the type does not support the notion of “countable”; the effect would be the same as having an implementation that always returns FAILURE.
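A minimal stand-alone model of this protocol follows. The type model_list, the function model_count_elements and the MODEL_COUNT_* constants are hypothetical stand-ins for the zval, the handler and SUCCESS/FAILURE; the sketch only shows that *count is written on success and left alone on failure.

```c
#include <stddef.h>

/* Stand-alone model, not Zend API. */
enum { MODEL_COUNT_SUCCESS = 0, MODEL_COUNT_FAILURE = -1 };

typedef struct model_list {
    int countable;   /* does this object support the notion of a count? */
    size_t n_items;  /* number of stored items */
} model_list;

/* Writes a non-negative count and reports success if the object is
 * countable; otherwise reports failure and leaves *count untouched. */
static int model_count_elements(const model_list *object, long *count)
{
    if (!object->countable) {
        return MODEL_COUNT_FAILURE;
    }
    *count = (long) object->n_items;
    return MODEL_COUNT_SUCCESS;
}
```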
get_debug_info
HashTable *(*zend_object_get_debug_info_t)(zval *object, int *is_temp TSRMLS_DC)
- Returns a hash table with arbitrary key/value pairs for debugging purposes.
- The Z_OBJDEBUG macro is a shortcut to access this handler; it may be used if one knows the handler not to be NULL.
- The is_temp argument cannot be NULL. The value 1 should be written in *is_temp if the returned hash table is owned by the caller (and hence the caller must destroy and free it with zend_hash_destroy and efree); otherwise 0 should be written.
- Should not return NULL.
- Avoid having this handler set to NULL; although the engine does not require its existence, the standard extension does (as of PHP 5.3.2).
get_closure
int (*get_closure)(zval *obj, zend_class_entry **ce_ptr, zend_function **fptr_ptr, zval **zobj_ptr TSRMLS_DC)
- This handler allows the object to be used as a function.
- The argument ce_ptr will not be NULL and should be filled with the scope of the function or NULL. The written value will be used as the called scope. The calling scope will be taken from the scope associated with the returned zend_function. Only under exceptional circumstances will it be used as a calling scope (maybe internal functions where the zend_function has no associated scope – unconfirmed).
- The argument fptr_ptr should be populated with the desired function. The caller is not responsible for freeing it, so the structure should accompany the life cycle of the class or object. It's outside the scope of this document to describe this structure; see for example zend_register_functions for how to create internal functions.
- The argument zobj_ptr may be NULL; if it isn't, *zobj_ptr is to be filled with NULL or, in case the function is an instance method of a standard object stored in EG(objects_store), the object it refers to. One should not increment the refcount of that object only because one is passing it to the caller.
- See also get_method.
- Returns SUCCESS or FAILURE.
- May be NULL.
The zend_class_entry
TODO
Default handlers
TODO
PHP internal class declaration
Let's assume all the internal functions and custom object handlers are written. A PHP class declaration can then be divided in these tasks (some may be omitted):
- Definition of the zend_function_entry array, which groups the internal functions that were defined.
- Definition and initialization of the handlers table.
- Initialization of the class entry.
- Registration of the class.
- Declaration of static and instance properties and constants.
- Other tweaks of the class entry.
We'll cover these items and then address the question of how to properly define a class so that it can be extended in userspace.
zend_function_entry array
The zend_function_entry structure contains the name of the method, a pointer to the (native) function that implements it, arginfo (describing arginfo structures is out of the scope of this text), and some flags for the method. The array is traditionally declared as a static global variable. Its purpose is to group and qualify the functions so that they can be converted to zend_function structures.
The array is terminated with a zeroed structure. Several macros exist for declaring the zend_function_entry structures. The most important are:
PHP_ME(classname, name, arg_info, flags)
PHP_MALIAS(classname, name, alias, arg_info, flags)
PHP_NAMED_ME(zend_name, name, arg_info, flags)
PHP_ME_MAPPING(name, func_name, arg_types, flags)
The standard way to declare a method is to use PHP_ME. It takes, in this order:
- The name of the class. This is an arbitrary name, not reflected in userspace, that is consistent with how the method's internal implementations were declared.
- The name of the method. This is how the method will be called in userspace AND how it was declared with PHP_METHOD. If you need those two to differ, you should use PHP_MALIAS.
- An arginfo structure.
- A bitmask defining the accessibility of the method.
The macro PHP_ME can be used when the method implementation was declared in a standard way, i.e., with PHP_METHOD(classname, name).
The bitmask is built with the ZEND_ACC family of macros. Let's see the relevant part of the family. Some are used only for classes or properties, not methods:
#define ZEND_ACC_STATIC                  0x01     /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_ABSTRACT                0x02     /* fn_flags */
#define ZEND_ACC_FINAL                   0x04     /* fn_flags */
#define ZEND_ACC_IMPLEMENTED_ABSTRACT    0x08     /* fn_flags */

#define ZEND_ACC_IMPLICIT_ABSTRACT_CLASS 0x10     /* ce_flags */
#define ZEND_ACC_EXPLICIT_ABSTRACT_CLASS 0x20     /* ce_flags */
#define ZEND_ACC_FINAL_CLASS             0x40     /* ce_flags */
#define ZEND_ACC_INTERFACE               0x80     /* ce_flags */

#define ZEND_ACC_INTERACTIVE             0x10     /* fn_flags */

#define ZEND_ACC_PUBLIC                  0x100    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PROTECTED               0x200    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PRIVATE                 0x400    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_PPP_MASK \
    (ZEND_ACC_PUBLIC | ZEND_ACC_PROTECTED | ZEND_ACC_PRIVATE)

#define ZEND_ACC_CHANGED                 0x800    /* fn_flags, zend_property_info.flags */
#define ZEND_ACC_IMPLICIT_PUBLIC         0x1000   /* zend_property_info.flags; unused (1) */

#define ZEND_ACC_CTOR                    0x2000   /* fn_flags */
#define ZEND_ACC_DTOR                    0x4000   /* fn_flags */
#define ZEND_ACC_CLONE                   0x8000   /* fn_flags */

#define ZEND_ACC_ALLOW_STATIC            0x10000  /* fn_flags */
#define ZEND_ACC_SHADOW                  0x20000  /* fn_flags */
#define ZEND_ACC_DEPRECATED              0x40000  /* fn_flags */
#define ZEND_ACC_CLOSURE                 0x100000 /* fn_flags */
#define ZEND_ACC_CALL_VIA_HANDLER        0x200000 /* fn_flags */

/* (1) ZEND_ACC_IMPLICIT_PUBLIC is unused since
 * zend_do_declare_implicit_property is ifdef'd out */
These apply to methods:
- ZEND_ACC_PUBLIC, ZEND_ACC_PROTECTED, ZEND_ACC_PRIVATE - exactly one of these flags must be included.
- ZEND_ACC_STATIC, ZEND_ACC_ABSTRACT and ZEND_ACC_FINAL - define a method as static, abstract or final, respectively.
- ZEND_ACC_ALLOW_STATIC - allows an instance method to be called statically; also allows an instance method to assume a $this from an incompatible context (see implementation for opcode INIT_STATIC_METHOD_CALL). New code ought not to set this flag.
- ZEND_ACC_DEPRECATED - marks a method as deprecated.
These also apply to methods, but you needn't include them in your function entries:
- ZEND_ACC_IMPLEMENTED_ABSTRACT - used if the method is declared abstract somewhere up the hierarchy. Despite the name, the method may have no implementation -- if an abstract subclass does not implement an abstract method from the superclass, the subclass copy of the method will have this flag set. Do not set this; it will be set automatically.
- ZEND_ACC_CHANGED - used for methods of a subclass that had their visibility increased from protected to public when overridden. Do not set this; it will be set automatically.
- ZEND_ACC_CALL_VIA_HANDLER - applied to zend_function structures that are generated on-the-fly in response to calls to __call, __callStatic, or by the get_method handler. This determines the memory freeing procedure. It also allows overriding pass-by-value semantics of functions with zend_call_function (used by call_user_func_array). See also get_method.
- ZEND_ACC_CLONE - marks a method as a clone method. This is automatically set for methods named __clone, but appears to have no effect at the engine level (the standard clone handler looks for a method called __clone, with no regard for this flag).
- ZEND_ACC_CTOR - although commonly set manually in the function entry flags, it is set automatically for methods with the appropriate name (either old-style or new-style). Setting this manually on a method that would not be selected as a constructor is an error.
- ZEND_ACC_DTOR - set automatically for methods with the name __destruct. Beyond that, it has no effect at the engine level.
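The rule that exactly one visibility flag must be present in a method's bitmask can be checked with a small bit test. The sketch below is a stand-alone illustration: the numeric values are copied from the ZEND_ACC_* listing above, while the MODEL_ prefix and the helper has_single_visibility are hypothetical and do not exist in the engine.

```c
/* Values copied from the ZEND_ACC_* listing; the MODEL_ names and the
 * helper below are hypothetical. */
#define MODEL_ACC_STATIC    0x01
#define MODEL_ACC_PUBLIC    0x100
#define MODEL_ACC_PROTECTED 0x200
#define MODEL_ACC_PRIVATE   0x400
#define MODEL_ACC_PPP_MASK \
    (MODEL_ACC_PUBLIC | MODEL_ACC_PROTECTED | MODEL_ACC_PRIVATE)

/* Returns 1 if exactly one visibility bit is set in flags, 0 otherwise. */
static int has_single_visibility(unsigned int flags)
{
    unsigned int ppp = flags & MODEL_ACC_PPP_MASK;

    /* a value with exactly one bit set is non-zero and shares no bit
     * with itself minus one */
    return ppp != 0 && (ppp & (ppp - 1)) == 0;
}
```

For example, MODEL_ACC_PUBLIC | MODEL_ACC_STATIC passes the check, while MODEL_ACC_PUBLIC | MODEL_ACC_PRIVATE and a bare MODEL_ACC_STATIC do not.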
The macro PHP_MALIAS(classname, name, alias, arg_info, flags) allows you to declare a method with a name and expose another name in user space, e.g.:
PHP_METHOD(myclass, mymethod)
{
    ...
}
...
PHP_MALIAS(myclass, myMethodInUserspace, mymethod, NULL, 0)
...
The macro PHP_NAMED_ME(zend_name, name, arg_info, flags) goes lower: the first argument is the name exposed in userspace and the second is the actual name of the C function you implemented, e.g.:
ZEND_NAMED_FUNCTION(my_arbitrary_name)
{
    ...
}
/* this resolves to void my_arbitrary_name(INTERNAL_FUNCTION_PARAMETERS) { } */
...
PHP_NAMED_ME(myMethodInUserspace, my_arbitrary_name, NULL, 0)
...
Finally, PHP_ME_MAPPING(name, func_name, arg_types, flags) is usually used to expose methods that also have a non-OOP interface, e.g.:
PHP_FUNCTION(ext_func)
{
    zval *this;

    /* Use zend_parse_method_parameters to parse parameters in
     * double interface implementations */
    if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis(),
            "O", &this, ext_class_ce_ptr) == FAILURE) {
        return;
    }

    /* alternative with plain zend_parse_parameters */
    /* this = getThis();
     * if (this == NULL) {
     *     if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O",
     *             &this, ext_class_ce_ptr) == FAILURE) {
     *         return;
     *     }
     * } else if (zend_parse_parameters_none() == FAILURE) {
     *     return;
     * }
     */
}
...
static zend_function_entry ext_functions[] = {
    PHP_FE(ext_func, NULL)
    ...
    {NULL, NULL, NULL, 0, 0}
};
...
static zend_function_entry ext_class_methods[] = {
    PHP_ME_MAPPING(myMethodName, ext_func, NULL, 0)
    ...
    {NULL, NULL, NULL, 0, 0}
};
Namespaced names can be built using ZEND_NS_NAME(namespace, name). Don't prefix the namespace with \.
Setup the Handlers Table
When one is implementing internal PHP classes, it is almost always undesirable to replace all of the standard handlers. The usual procedure for overriding the handlers follows these steps:
- Implement the handlers you wish to override (read this section to understand their semantics).
- Define a global variable of type zend_object_handlers. Most likely, the handler table will not need to be referred to from other compilation units, so it can have file scope.
- On module startup, copy the standard handlers into the defined zend_object_handlers variable.
- Write pointers to the custom handlers that were implemented into the fields corresponding to the handlers that are to be overridden.
The handler table should not be initialized with initializer lists. Otherwise, when new handlers are added to the language, they will take the value NULL instead of the default value.
Example:
static zend_object_handlers myclass_handlers;
...
static HashTable *myclass_object_debug_info(zval *object, int *is_temp TSRMLS_DC)
{
    ...
}

static int myclass_object_compare_objects(zval *object1, zval *object2 TSRMLS_DC)
{
    ...
}

static zend_object_value myclass_object_clone(zval *object TSRMLS_DC)
{
    ...
}
...
ZEND_MODULE_STARTUP_D(myext)
{
    ...
    /* start from the standard handlers, then override selectively */
    memcpy(&myclass_handlers, zend_get_std_object_handlers(),
        sizeof myclass_handlers);
    myclass_handlers.get_debug_info = myclass_object_debug_info;
    myclass_handlers.compare_objects = myclass_object_compare_objects;
    myclass_handlers.clone_obj = myclass_object_clone;
    ...
}
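The warning about initializer lists can be illustrated with a stand-alone model. Everything in it is hypothetical (model_handlers, the std_* and my_* functions); the field added_later stands for a handler introduced by a newer engine version after the extension was written. The table built with an initializer list ends up with NULL in that field, while the table built by copying the defaults and overriding one field keeps the default.

```c
#include <string.h>

/* Stand-alone model, not Zend API. */
typedef int (*model_handler)(void);

typedef struct model_handlers {
    model_handler compare;
    model_handler clone;
    model_handler added_later; /* handler added by a newer engine version */
} model_handlers;

static int std_compare(void)     { return 1; }
static int std_clone(void)       { return 2; }
static int std_added_later(void) { return 3; }
static int my_compare(void)      { return 10; }

static const model_handlers std_handlers = {
    std_compare, std_clone, std_added_later
};

/* Discouraged: fields not named in the initializer become NULL. */
static model_handlers by_initializer = { my_compare, std_clone };

/* Recommended: copy all defaults first, then override selectively. */
static model_handlers by_copy;

static void init_by_copy(void)
{
    memcpy(&by_copy, &std_handlers, sizeof by_copy);
    by_copy.compare = my_compare;
}
```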
Beware of the clone_obj handler
Most likely, the default clone_obj handler will have to be replaced because it assumes the create_object class entry handler is not replaced and directly allocates a standard zend_object data structure. Therefore, worse than the custom fields added to the object structure not being initialized (because the clone handler only knows how to do a shallow copy of the standard properties), these fields will not even exist because only a plain zend_object is allocated.
You may also choose not to support the clone operation by setting the clone_obj handler to NULL.
Initialization of the Class Entry
Before registering the class, it's necessary to define and initialize a class entry structure. This structure is temporary -- upon class registration, a new class entry structure is allocated. The initialization is done with one of these macros:
INIT_CLASS_ENTRY(class_container, class_name, functions)
INIT_NS_CLASS_ENTRY(class_container, ns, class_name, functions)
INIT_CLASS_ENTRY_EX(class_container, class_name, class_name_len, functions)
INIT_OVERLOADED_CLASS_ENTRY(class_container, class_name, functions,
    handle_fcall, handle_propget, handle_propset)
INIT_OVERLOADED_NS_CLASS_ENTRY(class_container, ns, class_name, functions,
    handle_fcall, handle_propget, handle_propset)
INIT_OVERLOADED_CLASS_ENTRY_EX(class_container, class_name, class_name_len,
    functions, handle_fcall, handle_propget, handle_propset,
    handle_propunset, handle_propisset)
INIT_OVERLOADED_NS_CLASS_ENTRY_EX(class_container, ns, class_name, functions,
    handle_fcall, handle_propget, handle_propset,
    handle_propunset, handle_propisset)
This is the meaning of the parameters:
- class_container - The temporary zend_class_entry to initialize.
- class_name - The class name to expose in userspace (a string).
- functions - A zend_function_entry array terminated with an empty entry.
- class_name_len - The length of class_name, excluding the terminator.
- ns - The namespace of the class. Don't prefix it with \.
- handle_fcall, handle_propget, handle_propset, handle_propunset and handle_propisset - these are zend_function pointers (or NULL) that can be used to populate the respective fields in the zend_class_entry structure.
Example:
static zend_function_entry myclass_functions[] = {
    PHP_ME(myclass, myMethod, NULL, 0)
    ...
    {NULL, NULL, NULL, 0, 0}
};
...
ZEND_MODULE_STARTUP_D(myext)
{
    zend_class_entry ce;
    ...
    INIT_CLASS_ENTRY(ce, "MyClass", myclass_functions);
    ...
}
Do not further modify the class entry. Other modifications should be made in the class entry returned upon class registration. In particular, setting the class flags (e.g. final) at this point will not work.
Class Registration
The registration step serves two purposes:
- Automates the definition of certain aspects of the class definition (the zend_class_entry structure).
- Exposes the class to userspace.
The following functions/macros are available:
/* Functions */
zend_class_entry *zend_register_internal_class(zend_class_entry *class_entry TSRMLS_DC);
zend_class_entry *zend_register_internal_class_ex(zend_class_entry *class_entry,
    zend_class_entry *parent_ce, char *parent_name TSRMLS_DC);
zend_class_entry *zend_register_internal_interface(
    zend_class_entry *orig_class_entry TSRMLS_DC);
int zend_register_class_alias_ex(const char *name, int name_len,
    zend_class_entry *ce TSRMLS_DC);

/* Macros (they expand to zend_register_class_alias_ex, so return an int) */
zend_register_class_alias(name, ce)
zend_register_ns_class_alias(ns, name, ce)
The functions/macros with “alias” in their name only expose the class to userspace; they do not change the class entry in any way. In general, these should be used only if the class entry pointed to by the argument was previously created with a zend_register_* function.
The function zend_register_internal_class_ex should be used when defining a subclass. If parent_ce is given, the corresponding class will be used as the parent. If it is NULL and parent_name is not NULL, the given superclass name will be resolved. If both are NULL, it will behave like zend_register_internal_class.
The zend_register_internal_* functions execute these steps:
- Allocate a new class entry structure.
- Copy (in a shallow fashion) the passed class entry into the allocated one.
- Set its type to ZEND_INTERNAL_CLASS.
- Initialize the new class entry structure by allocating and initializing its hash tables and resetting a few “scalar” fields (the magic methods set in the original class entry through INIT_OVERLOADED_CLASS_ENTRY_EX are not replaced).
- Set the flags of the class entry to none or to ZEND_ACC_INTERFACE (according to the function called).
- Convert the zend_function_entry structures into zend_function's of type ZEND_INTERNAL_FUNCTION. These functions are added to the class entry function table. If methods are found whose names match those of magic methods, the corresponding class entry fields are set.
- If a parent is given, execute operations related to inheritance, e.g. copying inherited functions from the parent.
After class registration, the original zend_class_entry variable should not be used anymore.
After registration, it's also possible to retrieve the zend_class_entry variable through the class name. This is done with zend_lookup_class:
int zend_lookup_class(const char *name, int name_length, zend_class_entry ***ce TSRMLS_DC);
Notice the triple (not double) indirection. In practice, most extensions opt to use a global variable (and even export it for other extensions through the macro ZEND_API) so as to avoid the performance penalty associated with the function call/hash table lookup.
Properties and Constants
In the module startup, after the PHP class is registered, it is time to add constants and properties.
For constants, the Zend API exposes the following functions:
int zend_declare_class_constant(zend_class_entry *ce, const char *name,
    size_t name_length, zval *value TSRMLS_DC);
int zend_declare_class_constant_null(zend_class_entry *ce, const char *name,
    size_t name_length TSRMLS_DC);
int zend_declare_class_constant_long(zend_class_entry *ce, const char *name,
    size_t name_length, long value TSRMLS_DC);
int zend_declare_class_constant_bool(zend_class_entry *ce, const char *name,
    size_t name_length, zend_bool value TSRMLS_DC);
int zend_declare_class_constant_double(zend_class_entry *ce, const char *name,
    size_t name_length, double value TSRMLS_DC);
int zend_declare_class_constant_stringl(zend_class_entry *ce, const char *name,
    size_t name_length, const char *value, size_t value_length TSRMLS_DC);
int zend_declare_class_constant_string(zend_class_entry *ce, const char *name,
    size_t name_length, const char *value TSRMLS_DC);
These all return SUCCESS or FAILURE. They are all straightforward to use, with the exception of zend_declare_class_constant. The passed zval should be allocated with ALLOC_PERMANENT_ZVAL (and then initialized and the intended value set).
For properties, the following functions are available:
int zend_declare_property(zend_class_entry *ce, char *name, int name_length,
    zval *property, int access_type TSRMLS_DC);
int zend_declare_property_ex(zend_class_entry *ce, const char *name,
    int name_length, zval *property, int access_type, char *doc_comment,
    int doc_comment_len TSRMLS_DC);
int zend_declare_property_null(zend_class_entry *ce, char *name,
    int name_length, int access_type TSRMLS_DC);
int zend_declare_property_bool(zend_class_entry *ce, char *name,
    int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_long(zend_class_entry *ce, char *name,
    int name_length, long value, int access_type TSRMLS_DC);
int zend_declare_property_double(zend_class_entry *ce, char *name,
    int name_length, double value, int access_type TSRMLS_DC);
int zend_declare_property_string(zend_class_entry *ce, char *name,
    int name_length, char *value, int access_type TSRMLS_DC);
int zend_declare_property_stringl(zend_class_entry *ce, char *name,
    int name_length, char *value, int value_len, int access_type TSRMLS_DC);
These are analogous to their zend_declare_class_constant* counterparts, with the following differences:
- There is a zend_declare_property_ex that accepts a doc comment. The doc comment can be retrieved through reflection.
- All functions accept an access type.
The access type flags are taken from the ZEND_ACC_* family. See under the zend_function_entry array section.
These apply to properties and can be set by the extension programmer:
- ZEND_ACC_STATIC - define a property as static.
- ZEND_ACC_PUBLIC, ZEND_ACC_PROTECTED, ZEND_ACC_PRIVATE - only one of these flags can be included. If none is included, it will default to ZEND_ACC_PUBLIC.
These are used internally and should not be passed to the functions above:
- ZEND_ACC_CHANGED - set in instance properties duplicated in the subclass properties where the corresponding superclass property a) has ZEND_ACC_CHANGED, b) has ZEND_ACC_PRIVATE, c) has ZEND_ACC_SHADOW.
- ZEND_ACC_SHADOW - set in instance properties copied from the superclass that are not duplicated in the subclass and which have ZEND_ACC_PRIVATE or ZEND_ACC_SHADOW.
- ZEND_ACC_IMPLICIT_PUBLIC - formerly (?) used for properties implicitly public (e.g. dynamic properties, i.e., undeclared instance properties).
Note that zend_declare_property(_ex) also requires a zval allocated with ALLOC_PERMANENT_ZVAL.
Note also that interfaces cannot have properties and that the access level cannot be decreased in subclasses.
Other class definition tweaks
The class entry structure can be changed in other ways after registration.
See also iterators and serialization callbacks.
Create object handler
Almost all internal classes will want to replace the class entry's create_object handler in order to be able to store arbitrary data in the object's data structure. See the section Data allocation and initialization for more on this.
Class flags
Class flags use the ZEND_ACC_* macro family. See under the zend_function_entry array section. At this point, the class may already have the flag ZEND_ACC_INTERFACE if you called zend_register_internal_interface.
These can be set after class registration:
- ZEND_ACC_FINAL_CLASS, ZEND_ACC_EXPLICIT_ABSTRACT_CLASS - define a class as final or abstract, respectively. It's unnecessary to explicitly set ZEND_ACC_EXPLICIT_ABSTRACT_CLASS if the class has (or inherits) abstract methods.
- ZEND_ACC_PUBLIC, ZEND_ACC_PROTECTED, ZEND_ACC_PRIVATE - only one of these flags can be included. If none is included, it will default to ZEND_ACC_PUBLIC.
These are automatically set by the engine and should not be set by the programmer:
- ZEND_ACC_IMPLICIT_ABSTRACT_CLASS - set automatically for classes that have abstract methods. Interfaces may have it too. Internal functions are also automatically given ZEND_ACC_ABSTRACT_CLASS whenever an abstract method is found.
- ZEND_ACC_CLOSURE - used internally for objects that are closures.
- ZEND_ACC_IMPLEMENT_INTERFACES - the class implements one or more interfaces. See below.
Implement interfaces
A class may declare it implements one or more interfaces by calling the function:
void zend_class_implements(zend_class_entry *class_entry TSRMLS_DC, int num_interfaces, ...)
The ellipsis represents one or more zend_class_entry * variables that point to the class entries of the interfaces to be implemented.
Designing subclassable classes
Designing internal classes so that they can be extended in userspace is simple. The subclass will have the same class entry create_object handler, not the default one which sets the standard object handlers and allocates a plain zend_object. Therefore, if the internal class has a different handler table or its storage is a different data structure, that will not be a problem.
The only thing about which one must be careful is constructors. Subclasses may define a new constructor that does not call the parent constructor. If the internal class relies on the constructor to set a consistent internal state, it can be changed in the following alternative ways:
- Moving the necessary initialization to the create_object class entry handler.
- Overriding the get_constructor handler. It could, for example, be modified to always return a function that does the necessary initializations, calls the default get_constructor handler (zend_std_get_constructor) and then executes the returned constructor, if any.
Often, the internal constructor requires several arguments to be passed. The constructor for the subclass may be defined so that it takes fewer or different arguments. This is clearly a problem that cannot be handled by the first approach. The second one can at least fail if not enough arguments are given or bad arguments are given, but even that's not a very good idea, because the arguments, even if apparently correct, may have different semantics. In sum, if the construction requires arguments, there is no good solution except requiring the super constructor to be called. This can be accomplished this way:
static zend_object_handlers object_handlers;
static zend_class_entry *ce_ptr;
static zend_function constr_wrapper_fun;

typedef struct test_object {
    zend_object std;
    zend_bool constructed; /* TestClass constructor was called? */
    /* more properties follow */
    ...
} test_object;

static zend_object_value ce_create_object(zend_class_entry *class_type TSRMLS_DC)
{
    zend_object_value zov;
    test_object *tobj;

    tobj = emalloc(sizeof *tobj);
    zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);
    tobj->constructed = 0;

#if PHP_VERSION_ID < 50399
    zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
        (copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
    object_properties_init(&tobj->std, class_type);
#endif

    zov.handle = zend_objects_store_put(tobj,
        (zend_objects_store_dtor_t) zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) zend_objects_free_object_storage,
        NULL TSRMLS_CC);
    zov.handlers = &object_handlers;
    return zov;
}

PHP_METHOD(testclass, __construct)
{
    zval *this = getThis();
    test_object *tobj = zend_object_store_get_object(this TSRMLS_CC);
    assert(tobj != NULL);

    tobj->constructed = (zend_bool) 1;
    ...
    /* if there's an error that leaves the object in an invalid state and
     * you have to throw an exception, also destroy the $this reference.
     * The reason is that the exception may be caught in the constructor
     * of the child class that's calling this constructor. */
    if (bad_thing_happened()) {
        /* Destroying only the $this reference will cause the object to leak;
         * it will be destroyed on request shutdown, but you can prevent that
         * by also destroying the object with:
         *     zval_dtor(this);
         * But beware this will call both the destroy_object and the
         * free_object handlers. If you want only the second to be called,
         * you can call zend_object_store_ctor_failed() before */
        ZVAL_NULL(this);
    }
}

static zend_function *get_constructor(zval *object TSRMLS_DC)
{
    /* Could always return constr_wrapper_fun, but it's unnecessary to call
     * the wrapper if instantiating the superclass */
    if (Z_OBJCE_P(object) == ce_ptr)
        return zend_get_std_object_handlers()->
            get_constructor(object TSRMLS_CC);
    else
        return &constr_wrapper_fun;
}

static void construction_wrapper(INTERNAL_FUNCTION_PARAMETERS)
{
    zval *this = getThis();
    test_object *tobj;
    zend_class_entry *this_ce;
    zend_function *zf;
    zend_fcall_info fci = {0};
    zend_fcall_info_cache fci_cache = {0};
    zval *retval_ptr = NULL;
    unsigned i;

    tobj = zend_object_store_get_object(this TSRMLS_CC);
    zf = zend_get_std_object_handlers()->get_constructor(this TSRMLS_CC);
    this_ce = Z_OBJCE_P(this);

    fci.size = sizeof(fci);
    fci.function_table = &this_ce->function_table;
    fci.object_ptr = this;
    /* fci.function_name = ; not necessary to bother */
    fci.retval_ptr_ptr = &retval_ptr;
    fci.param_count = ZEND_NUM_ARGS();
    fci.params = emalloc(fci.param_count * sizeof *fci.params);
    /* Or use _zend_get_parameters_array_ex instead of loop: */
    for (i = 0; i < fci.param_count; i++) {
        fci.params[i] = (zval **) (zend_vm_stack_top(TSRMLS_C) - 1 -
            (fci.param_count - i));
    }
    fci.no_separation = 0;

    fci_cache.initialized = 1;
    fci_cache.called_scope = EG(current_execute_data)->called_scope;
    fci_cache.calling_scope = EG(current_execute_data)->current_scope;
    fci_cache.function_handler = zf;
    fci_cache.object_ptr = this;

    zend_call_function(&fci, &fci_cache TSRMLS_CC);
    if (!EG(exception) && tobj->constructed == 0)
        zend_throw_exception(NULL, "parent::__construct() must be called in "
            "the constructor.", 0 TSRMLS_CC);

    efree(fci.params);
    /* the call may have failed without producing a return value */
    if (retval_ptr != NULL)
        zval_ptr_dtor(&retval_ptr);
}

static zend_function_entry ext_class_methods[] = {
    PHP_ME(testclass, __construct, 0, ZEND_ACC_PUBLIC)
    ...
    {NULL, NULL, NULL, 0, 0}
};

ZEND_MODULE_STARTUP_D(testext)
{
    zend_class_entry ce;

    memcpy(&object_handlers, zend_get_std_object_handlers(),
        sizeof object_handlers);
    object_handlers.get_constructor = get_constructor;
    object_handlers.clone_obj = NULL;

    INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
    ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
    ce_ptr->create_object = ce_create_object;

    constr_wrapper_fun.type = ZEND_INTERNAL_FUNCTION;
    constr_wrapper_fun.common.function_name = "internal_construction_wrapper";
    constr_wrapper_fun.common.scope = ce_ptr;
    constr_wrapper_fun.common.fn_flags = ZEND_ACC_PROTECTED;
    constr_wrapper_fun.common.prototype = NULL;
    constr_wrapper_fun.common.required_num_args = 0;
    constr_wrapper_fun.common.arg_info = NULL;
#if PHP_VERSION_ID < 50399
    /* moved to common.fn_flags with rev 303381 */
    constr_wrapper_fun.common.pass_rest_by_reference = 0;
    constr_wrapper_fun.common.return_reference = 0;
#endif
    constr_wrapper_fun.internal_function.handler = construction_wrapper;
    constr_wrapper_fun.internal_function.module = EG(current_module);

    return SUCCESS;
}
Another option is to check on every internal method call whether the native structure has been properly initialized by the native constructor. Since most instance methods will need to fetch the object, this is a good opportunity to do the check. For instance, the cairo extension does this:
static inline cairo_surface_object* cairo_surface_object_get(zval *zobj TSRMLS_DC)
{
    cairo_surface_object *pobj = zend_object_store_get_object(zobj TSRMLS_CC);
    if (pobj->surface == NULL) {
        php_error(E_ERROR, "Internal surface object missing in %s wrapper, "
            "you must call parent::__construct in extended classes",
            Z_OBJCE_P(zobj)->name);
    }
    return pobj;
}
This has two disadvantages relative to the previous method:
- It defers the check until an instance method is called, instead of immediately when the problem occurs (when the user-land constructor doesn't call the parent native constructor).
- The check is made on every method call, instead of only once.
However, this is by far a more popular approach, since it's simple and portable -- it uses only stable parts of the API.
A variant of this strategy is to centralize the object state validation in the get_method handler and either throw a fatal error or return a method that throws an exception from the handler in case the object state is invalid. This makes it easier to fix current code without replacing the calls to zend_object_store_get_object in every method implementation.
Finally, another option, certainly less complex but more limiting, is to make the superclass constructor final.
Iterators
TODO
Serialization callbacks
TODO
Object creation and destruction
Object creation involves these steps:
- Allocate and initialize the underlying data structure
- Store the object
- Build a reference to the object
- (optional) Call the constructor
Calling the constructor is uncommon internally because there are easier ways to initialize the object (calling a zend_function is verbose). The initialization steps that are common to all the objects of a given type can be done in step 1. The initialization of a particular instance (which e.g. depends on some other data, the kind of data that would be passed to a constructor) can be done in a separate auxiliary C function. Every time an object is instantiated internally, the programmer must also call this function to do instance-specific initialization. A constructor is still necessary to properly support the new operator, but this strategy does not imply duplication of code -- the internal implementation of the constructor may rely on the same auxiliary function.
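The division of labor described above can be sketched with a stand-alone model. All names are hypothetical (model_point and its functions are not Zend API) and calloc stands in for the engine's allocation in step 1. The point is only that the constructor and the internal creation path both funnel into one auxiliary function, so no initialization code is duplicated.

```c
#include <stdlib.h>

/* Stand-alone model, not Zend API. */
typedef struct model_point {
    int x;
    int y;
    int constructed; /* set once instance-specific initialization ran */
} model_point;

/* Step 1: allocation plus initialization common to every instance. */
static model_point *model_point_alloc(void)
{
    return calloc(1, sizeof(model_point));
}

/* Auxiliary function holding the instance-specific initialization. */
static void model_point_init(model_point *p, int x, int y)
{
    p->x = x;
    p->y = y;
    p->constructed = 1;
}

/* What the userspace-visible constructor would do once it has parsed
 * its arguments: delegate to the auxiliary function. */
static void model_point_construct(model_point *p, int x, int y)
{
    model_point_init(p, x, y);
}

/* What internal code does instead of invoking the constructor. */
static model_point *model_point_create(int x, int y)
{
    model_point *p = model_point_alloc();
    if (p != NULL) {
        model_point_init(p, x, y);
    }
    return p;
}
```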
Data allocation and initialization
In general, this part is completely domain dependent. The programmer may allocate and initialize an object however he wants.
However, zend standard objects (those with a class entry) rely on the class entry's create_object handler. Typically, these have a data structure whose pointer can be passed to functions that expect zend_object*. Hence, the typical class entry create_object handler will look like test_create_object in the example below:
typedef struct test_object {
    zend_object std;
    /* more properties follow */
    ...
} test_object;

static zend_object_handlers object_handlers;

static zend_object_value test_create_object(zend_class_entry *class_type TSRMLS_DC)
{
    zend_object_value zov;
    test_object *tobj;

    tobj = emalloc(sizeof *tobj);
    zend_object_std_init((zend_object *) tobj, class_type TSRMLS_CC);

#if PHP_VERSION_ID < 50399
    zend_hash_copy(tobj->std.properties, &(class_type->default_properties),
        (copy_ctor_func_t) zval_add_ref, NULL, sizeof(zval*));
#else
    object_properties_init((zend_object*)tobj, class_type);
#endif

    /* The destroy and free callbacks should be replaced if necessary */
    zov.handle = zend_objects_store_put(tobj,
        (zend_objects_store_dtor_t) zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) zend_objects_free_object_storage,
        NULL TSRMLS_CC);

    /* other specific stuff */
    ...
    zov.handlers = &object_handlers;
    return zov;
}

ZEND_MODULE_STARTUP_D(testext)
{
    zend_class_entry ce;
    zend_class_entry *ce_ptr;

    memcpy(&object_handlers, zend_get_std_object_handlers(),
        sizeof object_handlers);
    /* replace necessary handlers */
    ...
    INIT_CLASS_ENTRY(ce, "TestClass", ext_class_methods);
    ce_ptr = zend_register_internal_class(&ce TSRMLS_CC);
    ce_ptr->create_object = test_create_object;

    /* Other startup stuff */
    ...
    return SUCCESS;
}
The create_object
handler can also be NULL
, in which case the general operations listed in test_create_object are executed, except that a vanilla zend_object
structure is initialized (instead of a test_object
).
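The reason a test_object pointer can be handed to functions that expect a zend_object pointer is that the zend_object is the first member of the structure. The following is a minimal sketch of that layout trick in plain C, outside the engine; base_object, ext_object and the two functions are hypothetical stand-ins, not Zend API.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for zend_object: the part that generic code understands. */
typedef struct base_object {
    int class_id;   /* hypothetical field, plays the role of the class entry */
} base_object;

/* Stand-in for test_object: the base part must be the FIRST member. */
typedef struct ext_object {
    base_object std;
    int         extra;  /* extension-specific data */
} ext_object;

/* Generic code only knows about base_object. */
static int base_get_class_id(const base_object *obj)
{
    assert(obj != NULL);
    return obj->class_id;
}

/* Hypothetical analogue of a create_object handler, minus the store. */
static ext_object *ext_object_new(int class_id, int extra)
{
    ext_object *obj = malloc(sizeof *obj);

    if (obj == NULL) {
        return NULL;
    }
    obj->std.class_id = class_id;  /* initialize the common part */
    obj->extra = extra;            /* then the extension part */
    return obj;
}
```

Because C guarantees that a pointer to a structure, suitably converted, points to its first member, the cast in either direction is well defined as long as the object really was allocated as an ext_object.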
Object storage
Objects are accessed through their references; the only thing linking a reference to an object instance is an integer (the object handle). This handle is a key that allows access to the object data structure. How this is done depends entirely on the type of the object.
Of particular relevance are, of course, zend standard objects. These are stored in the 'objects store'. The zend objects API exposes these functions:
/* Storage */
typedef void (*zend_objects_store_dtor_t)(void *object, zend_object_handle handle TSRMLS_DC);
typedef void (*zend_objects_free_object_storage_t)(void *object TSRMLS_DC);
typedef void (*zend_objects_store_clone_t)(void *object, void **object_clone TSRMLS_DC);

zend_object_handle zend_objects_store_put(void *object, zend_objects_store_dtor_t dtor, zend_objects_free_object_storage_t storage, zend_objects_store_clone_t clone TSRMLS_DC);

/* Retrieval */
void *zend_object_store_get_object(const zval *object TSRMLS_DC);
void *zend_object_store_get_object_by_handle(zend_object_handle handle TSRMLS_DC);

/* refcount related */
void zend_objects_store_add_ref(zval *object TSRMLS_DC);
void zend_objects_store_del_ref(zval *object TSRMLS_DC);
void zend_objects_store_add_ref_by_handle(zend_object_handle handle TSRMLS_DC);
void zend_objects_store_del_ref_by_handle_ex(zend_object_handle handle, const zend_object_handlers *handlers TSRMLS_DC);
void zend_objects_store_del_ref_by_handle(zend_object_handle handle TSRMLS_DC);
zend_uint zend_objects_store_get_refcount(zval *object TSRMLS_DC);

/* Misc */
zend_object_value zend_objects_store_clone_obj(zval *object TSRMLS_DC);

/* zend_object_store_set_object:
 * It is ONLY valid to call this function from within the constructor of an
 * overloaded object. Its purpose is to set the object pointer for the object
 * when you can't possibly know its value until you have parsed the arguments
 * from the constructor function. You MUST NOT use this function for any other
 * weird games, or call it at any other time after the object is constructed.
 * (This is rarely used) */
void zend_object_store_set_object(zval *zobject, void *object TSRMLS_DC);

/* Called when the constructor was terminated by an exception. Prevents the
 * "destroy object" store callback from being called */
void zend_object_store_ctor_failed(zval *zobject TSRMLS_DC);

/* Used to destroy all the objects in the store */
void zend_objects_store_free_object_storage(zend_objects_store *objects TSRMLS_DC);
The objects store can actually store any type of data structure; the data structure doesn't have to be an extension of zend_object
. The header file zend_objects.h
provides some functions to deal exclusively with zend standard objects:
/* To be used in the create_object class entry handler to initialize the
 * zend_object structure */
void zend_object_std_init(zend_object *object, zend_class_entry *ce TSRMLS_DC);

/* Despite the name, this is actually related to object freeing. It frees all
 * the memory used by the inner structures of zend_object */
void zend_object_std_dtor(zend_object *object TSRMLS_DC);

/* The default implementation of the create_object handler */
zend_object_value zend_objects_new(zend_object **object, zend_class_entry *class_type TSRMLS_DC);

/* The default implementation of the destroy object store callback. Calls
 * the PHP destructor, if any. */
void zend_objects_destroy_object(zend_object *object, zend_object_handle handle TSRMLS_DC);

/* Alias of zend_object_store_get_object, except it returns a zend_object
 * pointer instead of void* */
zend_object *zend_objects_get_address(const zval *object TSRMLS_DC);

/* Copies the properties of the old_object and calls the class entry
 * clone handler. Used in the implementation of zend_objects_clone_obj
 * In PHP > 5.3, it also initializes the properties before. */
void zend_objects_clone_members(zend_object *new_object, zend_object_value new_obj_val, zend_object *old_object, zend_object_handle handle TSRMLS_DC);

/* Allocates a new object with zend_objects_new and clones the members.
 * It's the default implementation of the clone object handler. The fact
 * it uses zend_objects_new means you almost certainly will want to replace
 * the clone object handler when implementing internal classes. */
zend_object_value zend_objects_clone_obj(zval *object TSRMLS_DC);

/* default implementation of the free storage store callback.
 * Calls zend_object_std_dtor and then frees the object itself */
void zend_objects_free_object_storage(zend_object *object TSRMLS_DC);
The function zend_objects_store_put
adds an object to the store. This function must be called during the creation of the object, as exemplified in the listing of the previous section. Any of the last three arguments may be NULL
.
- 'Destructor': If NULL, zend_objects_destroy_object is used instead, which calls the PHP destructor, if any. This is called prior to the “free storage” callback when destroying the object. Cleanup of the memory allocated to the object data structures is left to the “free storage” callback. This callback is not called if the object construction failed. If passing a custom store destructor callback, calling the PHP destructor can be delegated to zend_objects_destroy_object.
- 'Free storage': Used to free the object data structures. For vanilla zend objects, this should be zend_objects_free_object_storage; if extending zend standard objects, in the custom callback one should delegate to zend_objects_free_object_storage the cleanup of the zend_object field and of the outer object data structure (hence, the call to zend_objects_free_object_storage should be the last thing). There's no default if NULL is specified.
- 'Clone': Most likely, this should be NULL. One should only use this callback if implementing objects without class entries and using zend_objects_store_clone_obj as a clone_obj handler. Then, that function will call this callback, which should allocate a new object, use the passed double indirection pointer to store a pointer to it, and clone the passed object into this new one.
After the call to zend_objects_store_put
, the object will have reference count = 1 in the store.
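To make the role of the handle concrete, here is a toy model of a store in plain C. It is only a sketch under simplifying assumptions: the toy_* names, the fixed capacity and the choice of reserving slot 0 as a failure value are all inventions of this example and say nothing about how the real objects store is laid out. It shows the three points made above: the handle is just an integer key, the store does not care what the stored data structure is, and the reference count is 1 right after the object is put in the store.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STORE_CAPACITY 8

typedef unsigned int toy_handle;

typedef struct toy_bucket {
    void        *object;    /* opaque: the store never looks inside */
    unsigned int refcount;
    int          in_use;
} toy_bucket;

typedef struct toy_store {
    toy_bucket buckets[TOY_STORE_CAPACITY];
} toy_store;

/* Adds an object and returns its handle (>= 1), or 0 if the store is full.
 * Slot 0 is reserved so that 0 can signal failure (a choice of this sketch). */
static toy_handle toy_store_put(toy_store *store, void *object)
{
    toy_handle h;

    assert(store != NULL);
    for (h = 1; h < TOY_STORE_CAPACITY; h++) {
        toy_bucket *b = &store->buckets[h];
        if (!b->in_use) {
            b->object = object;
            b->refcount = 1;    /* one reference right after the put */
            b->in_use = 1;
            return h;
        }
    }
    return 0;
}

/* Returns the bucket for a handle, or NULL if the handle is not valid. */
static const toy_bucket *toy_store_find(const toy_store *store, toy_handle h)
{
    assert(store != NULL);
    if (h == 0 || h >= TOY_STORE_CAPACITY || !store->buckets[h].in_use) {
        return NULL;
    }
    return &store->buckets[h];
}

/* The handle is the only thing needed to get back to the object data. */
static void *toy_store_get(const toy_store *store, toy_handle h)
{
    const toy_bucket *b = toy_store_find(store, h);
    return b ? b->object : NULL;
}

static unsigned int toy_store_get_refcount(const toy_store *store, toy_handle h)
{
    const toy_bucket *b = toy_store_find(store, h);
    return b ? b->refcount : 0;
}
```

A zero-initialized toy_store is empty; putting an int and then a double into it yields two distinct handles, each of which retrieves the original pointer.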
Object reference creation
If we're creating a zend standard object, the create_object
handler already returned a zend_object_value
. The creation of an object reference zval is handled automatically by the new
operator.
To instantiate new objects internally, the following macros are available:
int object_init(zval *arg);
int object_init_ex(zval *arg, zend_class_entry *ce);

/* This function requires 'properties' to contain all props declared in the
 * class and all props being public. If only a subset is given or the class
 * has protected members then you need to merge the properties separately by
 * calling zend_merge_properties(). */
int object_and_properties_init(zval *arg, zend_class_entry *ce, HashTable *properties);

/* This function should be called after the constructor has been called
 * because it may call __set from the uninitialized object otherwise. */
void zend_merge_properties(zval *obj, HashTable *properties, int destroy_ht TSRMLS_DC);
These all take an allocated and initialized (INIT_ZVAL
) or partially initialized (INIT_PZVAL
) zval
pointer. object_init
is not particularly useful, since it will instantiate a stdClass
object. object_and_properties_init
also allows efficient initialization of the object properties, but it has the limitations indicated in the comments. If all instances are to be initialized with the same property values, the default property values, defined when the class is registered, should be used instead.
Flow of the construction/destruction process
This is an overview of the process of object construction for zend standard object derivatives.
For internal instantiations:
- Allocate and (partially) initialize a zval *.
- Call object_init_ex. A pointer to the class entry should be available.
  - Call the class entry's create_object handler.
    - Allocate the object structure. This structure's first field should be a zend_object.
    - Call zend_object_std_init to initialize the zend_object part of the object data.
    - Copy the default properties from the class entry into the properties hash table of the new object. In PHP >= 5.3.99, object_properties_init should be called instead because non-dynamic properties are stored in C arrays instead of the properties hash table (though the hash table is still used when it's requested or when there are dynamic properties).
    - Call zend_objects_store_put, passing a custom “destroy object” callback which does cleanup specific to properly constructed objects and a custom “free storage” callback that frees all the memory and other resources taken by the object (which is always called).
    - Assign the return value of zend_objects_store_put to the handle field of the zend_object_value that is to be returned.
    - Set the field handlers of the zend_object_value that's to be returned to the appropriate object handlers table.
  - Set the zval type to IS_OBJECT and the value to that returned by the create_object handler.
- Do post-creation initialization on the new object (the construction phase), typically through an auxiliary function.
For instantiations with new
:
- Call the class entry's create_object handler.
  - (see above)
- ...
- Call the PHP constructor, if any. Typically, the internal implementation of the constructor delegates the construction task to the same auxiliary function referred to in the last step of the list before.
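The two flows differ only in who creates the object; the instance-specific work can live in one place. The sketch below models that idea in plain C with a hypothetical point_object type. None of these names are Zend API, and the engine's steps (store registration, zval setup) are deliberately left out; point_create stands in for the create_object step and point_init for the auxiliary function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct point_object {
    int common_ready;  /* set during creation, same for every instance */
    int constructed;   /* set once instance-specific init has succeeded */
    int x;
    int y;
} point_object;

/* Creation: initialization common to all objects of the type. */
static point_object *point_create(void)
{
    point_object *p = calloc(1, sizeof *p);

    if (p != NULL) {
        p->common_ready = 1;
    }
    return p;
}

/* Auxiliary function: the only place holding instance-specific logic. */
static int point_init(point_object *p, int x, int y)
{
    assert(p != NULL && p->common_ready);
    if (x < 0 || y < 0) {
        return -1;  /* reject invalid data; the object stays unconstructed */
    }
    p->x = x;
    p->y = y;
    p->constructed = 1;
    return 0;
}

/* Path 1: internal instantiation, creation followed by the auxiliary call. */
static point_object *point_instantiate(int x, int y)
{
    point_object *p = point_create();

    if (p != NULL && point_init(p, x, y) != 0) {
        free(p);
        return NULL;
    }
    return p;
}

/* Path 2: constructor body. The object already exists (created on behalf
 * of `new`), so only argument checking and delegation remain. */
static int point_constructor(point_object *self, const int *args, size_t argc)
{
    if (argc != 2) {
        return -1;
    }
    return point_init(self, args[0], args[1]);
}
```

Both paths end in the same state for the same input, which is the point of routing them through a single auxiliary function.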
Stored objects should not be destroyed explicitly; in fact, the store API doesn't even expose a function to destroy a particular object. Instead, destruction should be managed through the refcount. When the reference count hits 0, the store calls the object's “destroy object” store callback (unless the object construction failed), then the “free storage” callback, and finally removes the entry from its table. See also the add_ref
and del_ref
handlers.