phpng-upgrading

Upgrading PHP extensions from PHP5 to NG

Many of the frequently used API functions have changed, such as the HashTable API; this page intends to document as many as possible of those changes that actually affect the way extension and core code is written. It's highly recommended to read the general information about PHPNG implementation at phpng-int, before reading this guide.

This is not a complete guide that covers every possible situation. It is a collection of recipes for the most useful cases, and it should be enough for most user-level extensions. However, if you did not find some information here, found a solution and think it may be useful for others, feel free to add your recipe.

General Advice

  • Try to compile your extension with PHPNG. Look at the compilation errors and warnings. They should show you 75% of the places that have to be changed.
  • Compile and test extensions in debug mode (configure PHP with --enable-debug). It enables catching some errors at run time using assert(). You'll also see information about memory leaks.

zval

  • PHPNG doesn't require any involvement of pointers to pointers to zval. Most occurrences of zval** variables and parameters have to be changed into zval*. The corresponding Z_*_PP() macros that work with such variables should be changed into Z_*_P().
  • In many places PHPNG works with zval directly (eliminating the need for allocation and deallocation). In these cases the corresponding zval* variable should be converted into a plain zval, macros that use this variable from Z_*_P() into Z_*(), and the corresponding creation macros from ZVAL_*(var, …) into ZVAL_*(&var, …). Always be careful about passing addresses of zvals and the & operator. PHPNG almost never requires passing the address of a zval*. In some places the & operator should be removed.
  • The zval allocation macros ALLOC_ZVAL, ALLOC_INIT_ZVAL and MAKE_STD_ZVAL are removed. In most cases their usage indicates that a zval* needs to be changed into a plain zval. The macro INIT_PZVAL is removed as well, and in most cases its usages should simply be deleted.
-  zval *zv;
-  ALLOC_INIT_ZVAL(zv);
-  ZVAL_LONG(zv, 0);
+  zval zv;
+  ZVAL_LONG(&zv, 0);
  • The zval struct has completely changed. Now it's defined as:
struct _zval_struct {
	zend_value        value;			/* value */
	union {
		struct {
			ZEND_ENDIAN_LOHI_4(
				zend_uchar    type,			/* active type */
				zend_uchar    type_flags,
				zend_uchar    const_flags,
				zend_uchar    reserved)	    /* various IS_VAR flags */
		} v;
		zend_uint type_info;
	} u1;
	union {
		zend_uint     var_flags;
		zend_uint     next;                 /* hash collision chain */
		zend_uint     str_offset;           /* string offset */
		zend_uint     cache_slot;           /* literal cache slot */
	} u2;
};

and zend_value as

typedef union _zend_value {
	long              lval;				/* long value */
	double            dval;				/* double value */
	zend_refcounted  *counted;
	zend_string      *str;
	zend_array       *arr;
	zend_object      *obj;
	zend_resource    *res;
	zend_reference   *ref;
	zend_ast_ref     *ast;
	zval             *zv;
	void             *ptr;
	zend_class_entry *ce;
	zend_function    *func;
} zend_value;
  • The main difference is that scalar and complex types are now handled differently. PHP doesn't allocate scalar values on the heap but stores them directly on the VM stack, inside HashTables and objects. They are not subject to reference counting and garbage collection anymore. Scalar values don't have a reference counter and don't support the Z_ADDREF*(), Z_DELREF*(), Z_REFCOUNT*() and Z_SET_REFCOUNT*() macros anymore. In most cases you should check whether a zval supports these macros before calling them. Otherwise you'll get an assert() failure or a crash.
- Z_ADDREF_P(zv)
+ if (Z_REFCOUNTED_P(zv)) {Z_ADDREF_P(zv);}
# or equivalently
+ Z_TRY_ADDREF_P(zv);
  • zval values should be copied using ZVAL_COPY_VALUE() macro
  • It's possible to copy and increment reference counter if necessary using ZVAL_COPY() macro
  • Duplication of zval (zval_copy_ctor) may be done using ZVAL_DUP() macro
  • If you converted a zval* into a zval and previously used NULL to indicate an undefined value, you can now use the IS_UNDEF type instead. It can be set using ZVAL_UNDEF(&zv) and checked using if (Z_ISUNDEF(zv)).
  • If you want to get the long/double/string value of a zval using cast-semantics without modifying the original zval you can now use the zval_get_long(zv), zval_get_double(zv) and zval_get_string(zv) APIs to simplify the code:
- zval tmp;
- ZVAL_COPY_VALUE(&tmp, zv);
- zval_copy_ctor(&tmp);
- convert_to_string(&tmp);
- // ...
- zval_dtor(&tmp);
+ zend_string *str = zval_get_string(zv);
+ // ...
+ zend_string_release(str);

Look into zend_types.h code for more details: https://github.com/php/php-src/blob/master/Zend/zend_types.h
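
The split between scalar and reference-counted values described above can be sketched in plain C. This is only a standalone model under stated assumptions: model_val, model_counted and the model_*() functions are hypothetical names invented for illustration, not the Zend definitions. It shows why an unconditional addref is unsafe on a scalar and what a "try" variant does instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the engine types; not the real Zend definitions. */
typedef enum { T_UNDEF, T_LONG, T_STRING } model_type;

typedef struct { uint32_t refcount; } model_counted;

typedef struct {
    union {
        long           lval;     /* scalar payload lives inside the value itself */
        model_counted *counted;  /* complex payload lives in a separate block */
    } value;
    model_type type;
} model_val;

/* Only types backed by a separate block carry a reference counter. */
int model_refcounted(const model_val *v) {
    return v->type == T_STRING;
}

/* Models the idea of Z_ADDREF_P(): asserts when used on a scalar. */
void model_addref(model_val *v) {
    assert(model_refcounted(v));
    v->value.counted->refcount++;
}

/* Models the idea of Z_TRY_ADDREF_P(): silently skips scalars. */
void model_try_addref(model_val *v) {
    if (model_refcounted(v)) {
        model_addref(v);
    }
}

/* Models the idea of ZVAL_COPY(): copy the value, then bump the counter if there is one. */
void model_copy(model_val *dst, const model_val *src) {
    *dst = *src;
    model_try_addref(dst);
}
```

In this model, copying a T_LONG value touches no counter at all, while copying a T_STRING value increments the shared counter exactly once.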

References

A zval in PHPNG doesn't have an is_ref flag anymore. References are implemented using a separate complex reference-counted type, IS_REFERENCE. You may still use the Z_ISREF*() macros to check whether a given zval is a reference; they simply check whether the type of the zval equals IS_REFERENCE. The macros that worked with the is_ref flag are removed: Z_SET_ISREF*(), Z_UNSET_ISREF*() and Z_SET_ISREF_TO*(). Their usage should be changed in the following way:

- Z_SET_ISREF_P(zv);
+ ZVAL_MAKE_REF(zv);

- Z_UNSET_ISREF_P(zv);
+ if (Z_ISREF_P(zv)) {ZVAL_UNREF(zv);}

Previously the type of the referenced value could be checked directly on the reference. Now it has to be checked indirectly through the Z_REFVAL*() macro

- if (Z_ISREF_P(zv) && Z_TYPE_P(zv) == IS_ARRAY) {
+ if (Z_ISREF_P(zv) && Z_TYPE_P(Z_REFVAL_P(zv)) == IS_ARRAY) {

or perform manual dereferencing using ZVAL_DEREF() macro

- if (Z_ISREF_P(zv)) {...}
- if (Z_TYPE_P(zv) == IS_ARRAY) {
+ if (Z_ISREF_P(zv)) {...}
+ ZVAL_DEREF(zv);
+ if (Z_TYPE_P(zv) == IS_ARRAY) {
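
The indirection can be pictured with a small standalone model. The names below (model_val, model_refval, model_deref) are hypothetical and only imitate the roles of Z_REFVAL_P() and ZVAL_DEREF(); the point is that a reference has its own type, so the type of the referenced value is only visible after stepping through it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a reference is a value type of its own. */
typedef enum { T_LONG, T_ARRAY, T_REFERENCE } model_type;

typedef struct model_val {
    model_type        type;
    struct model_val *ref;   /* referenced value, used only when type == T_REFERENCE */
} model_val;

/* Plays the role of Z_REFVAL_P(): only valid on a reference. */
const model_val *model_refval(const model_val *v) {
    assert(v->type == T_REFERENCE);
    return v->ref;
}

/* Plays the role of ZVAL_DEREF(): step through the reference if there is one. */
const model_val *model_deref(const model_val *v) {
    return v->type == T_REFERENCE ? model_refval(v) : v;
}

/* A type check that works for both plain values and references. */
int model_is_array(const model_val *v) {
    return model_deref(v)->type == T_ARRAY;
}
```

Checking the type of the reference itself would report T_REFERENCE here, which is the mistake the first diff above guards against.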

Booleans

IS_BOOL does not exist anymore but IS_TRUE and IS_FALSE are types on their own:

- if ((Z_TYPE_PP(item) == IS_BOOL || Z_TYPE_PP(item) == IS_LONG) && Z_LVAL_PP(item)) {
+ if (Z_TYPE_P(item) == IS_TRUE || (Z_TYPE_P(item) == IS_LONG && Z_LVAL_P(item))) {

The Z_BVAL*() macros are removed. Be careful, the return value of Z_LVAL*() on IS_FALSE/IS_TRUE is undefined.

Strings

The value and length of a string may be accessed using the same macros, Z_STRVAL*() and Z_STRLEN*(). However, the underlying data structure for string representation is now zend_string (described in a separate section). The zend_string may be retrieved from a zval with the Z_STR*() macro. It's also possible to get the hash value of the string through Z_STRHASH*().

In case code needs to check if the given string is interned or not, now it should be done using zend_string (not char*)

- if (IS_INTERNED(Z_STRVAL_P(zv))) {
+ if (IS_INTERNED(Z_STR_P(zv))) {

Creation of string zvals has changed a little. Previously macros like ZVAL_STRING() had an additional argument that told whether the given characters should be duplicated or not. Now these macros always have to create a zend_string structure, so this parameter became useless. However, if its actual value was 0, you have to free the original string to avoid a memory leak.

- ZVAL_STRING(zv, str, 1);
+ ZVAL_STRING(zv, str);

- ZVAL_STRINGL(zv, str, len, 1);
+ ZVAL_STRINGL(zv, str, len);

- ZVAL_STRING(zv, str, 0);
+ ZVAL_STRING(zv, str);
+ efree(str);

- ZVAL_STRINGL(zv, str, len, 0);
+ ZVAL_STRINGL(zv, str, len);
+ efree(str);

The same is true for similar macros like RETURN_STRING(), RETVAL_STRINGL(), etc. and some internal API functions.

- add_assoc_string(zv, key, str, 1);
+ add_assoc_string(zv, key, str);

- add_assoc_string(zv, key, str, 0);
+ add_assoc_string(zv, key, str);
+ efree(str);

The double reallocation may be avoided using zend_string API directly and creating zval directly from zend_string.

- char * str = estrdup("Hello");
- RETURN_STRING(str);
+ zend_string *str = zend_string_init("Hello", sizeof("Hello")-1, 0);
+ RETURN_STR(str);

Z_STRVAL*() should now be treated as read-only. It's not possible to assign anything to it. It's possible to modify separate characters, but before doing so you must be sure that the string is not referenced from anywhere else (it is not interned and its reference counter is 1). Also, after an in-place string modification you might need to reset the calculated hash value.

  SEPARATE_ZVAL(zv);
  Z_STRVAL_P(zv)[0] = Z_STRVAL_P(zv)[0] + ('A' - 'a');
+ zend_string_forget_hash_val(Z_STR_P(zv));

zend_string API

Zend has a new zend_string API. Besides being the underlying structure for string representation in zval, zend_string is also used throughout much of the codebase where char* and int were used before.

zend_strings (not IS_STRING zvals) may be created using the zend_string_init(char *val, int len, int persistent) function. The actual characters may be accessed as str->val and the string length as str->len. The hash value of the string should be accessed through the zend_string_hash_val() function, which recalculates the hash value if necessary.

Strings should be deallocated using the zend_string_release() function, which doesn't necessarily free memory, because the same string may be referenced from several places.

If you are going to keep a zend_string pointer somewhere, you should increase its reference counter or use the zend_string_copy() function, which does it for you. In many places where code copied characters just to keep the value (not to modify it), it's possible to use this function instead.

- ptr->str = estrndup(Z_STRVAL_P(zv), Z_STRLEN_P(zv));
+ ptr->str = zend_string_copy(Z_STR_P(zv));
  ...
- efree(str);
+ zend_string_release(str);

In case the copied string is going to be changed you may use zend_string_dup() instead

- char *str = estrndup(Z_STRVAL_P(zv), Z_STRLEN_P(zv));
+ zend_string *str = zend_string_dup(Z_STR_P(zv), 0);
  ...
- efree(str);
+ zend_string_release(str);

Code that uses the old macros is still supported, so switching to the new ones is not necessary.

In some cases it makes sense to allocate string buffer before the actual string data is known. You may use zend_string_alloc() and zend_string_realloc() functions to do it.

- char *ret = emalloc(16+1);
- md5(something, ret); 
- RETURN_STRINGL(ret, 16, 0);
+ zend_string *ret = zend_string_alloc(16, 0);
+ md5(something, ret->val);
+ RETURN_STR(ret);

Not all extension code has to be converted to use zend_string instead of char*. It's up to the extension maintainer to decide which type is more suitable in each particular case.

Look into zend_string.h code for more details: https://github.com/php/php-src/blob/master/Zend/zend_string.h
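
The difference between sharing a string (copy), duplicating it (dup) and releasing it can be sketched with a standalone counted-string model. model_str and its functions are hypothetical names made up for this sketch; the real zend_string has more fields (a hash value, flags) and handles interned strings, which this model ignores.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a counted string; the real zend_string differs in detail. */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[1];   /* characters follow the header in the same allocation */
} model_str;

/* Like zend_string_init(): one allocation holds both header and characters. */
model_str *model_str_init(const char *s, size_t len) {
    model_str *str = malloc(sizeof(model_str) + len);
    if (!str) {
        return NULL;
    }
    str->refcount = 1;
    str->len = len;
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

/* Like zend_string_copy(): share the same buffer and bump the counter. */
model_str *model_str_copy(model_str *str) {
    str->refcount++;
    return str;
}

/* Like zend_string_dup(): a private buffer that is safe to modify. */
model_str *model_str_dup(const model_str *str) {
    return model_str_init(str->val, str->len);
}

/* Like zend_string_release(): frees only when the last holder lets go.
 * Returns 1 if the memory was actually freed, 0 otherwise. */
int model_str_release(model_str *str) {
    assert(str->refcount > 0);
    if (--str->refcount == 0) {
        free(str);
        return 1;
    }
    return 0;
}
```

Under this model, "copy" is the cheap choice when the value is only kept, and "dup" is needed only when the characters are going to be changed.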

smart_str and smart_string

For a consistent naming convention the old smart_str API was renamed to smart_string. It may be used as before, except for the new names.

- smart_str str = {0};
- smart_str_appendl(&str, " ", sizeof(" ") - 1);
- smart_str_0(&str);
- RETURN_STRINGL(str.c, str.len, 0);
+ smart_string str = {0};
+ smart_string_appendl(&str, " ", sizeof(" ") - 1);
+ smart_string_0(&str);
+ RETVAL_STRINGL(str.c, str.len);
+ smart_string_free(&str);

In addition we introduced a new smart_str API that works with zend_string directly

- smart_str str = {0};
- smart_str_appendl(&str, " ", sizeof(" ") - 1);
- smart_str_0(&str);
- RETURN_STRINGL(str.c, str.len, 0);
+ smart_str str = {0};
+ smart_str_appendl(&str, " ", sizeof(" ") - 1);
+ smart_str_0(&str);
+ if (str.s) {
+   RETURN_STR(str.s);
+ } else {
+   RETURN_EMPTY_STRING();
+ }

smart_str is defined as

typedef struct {
    zend_string *s;
    size_t a;
} smart_str;

The APIs of smart_str and smart_string are very similar, and they actually repeat the API used in PHP5, so adapting the code should not be a big problem. The biggest question is which API to select in each particular case, and that depends on the way the final result is used.

Note that a previous check for an empty smart_str might need to be changed

- if (smart_str->c) {
+ if (smart_str->s) {

strpprintf

In addition to the spprintf() and vspprintf() functions we introduced similar functions that produce a zend_string instead of a char*. It's up to you to decide when to switch to the new variants.

PHPAPI zend_string *vstrpprintf(size_t max_len, const char *format, va_list ap);
PHPAPI zend_string *strpprintf(size_t max_len, const char *format, ...);

Arrays

Arrays are implemented more or less the same way as before. However, where the underlying structure was previously a pointer to a HashTable, it is now a pointer to a zend_array that keeps the HashTable inside. The HashTable may be read as before using the Z_ARRVAL*() macros, but it's no longer possible to change the pointer to the HashTable. It's only possible to get or set the pointer to the whole zend_array through the Z_ARR*() macro.

The best way to create arrays is to use the old array_init() function, but it's also possible to create new uninitialized arrays using ZVAL_NEW_ARR() or to initialize a zval from a zend_array structure through ZVAL_ARR().

Some arrays might be immutable (this may be checked using the Z_IMMUTABLE() macro). If code needs to modify them, they have to be duplicated first. Iteration through immutable arrays using the internal position pointer is not possible either. Such arrays can be walked using the old iteration API with an external position pointer or using the new HashTable iteration API described in a separate section.

HashTable API

The HashTable API has changed significantly, and this may cause some trouble when porting extensions.

  • First of all, HashTables now always keep zvals. Even if we store an arbitrary pointer, it's packed into a zval with the special type IS_PTR. In any case, this simplifies work with zvals
- if (zend_hash_update(ht, Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)&zv, sizeof(zval**), NULL) == SUCCESS) {
+ if (zend_hash_update(ht, Z_STR_P(key), zv) != NULL) {
  • Most API functions return the requested values directly (instead of using an additional by-reference argument and returning SUCCESS/FAILURE).
- if (zend_hash_find(ht, Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void**)&zv_ptr) == SUCCESS) {
+ if ((zv = zend_hash_find(ht, Z_STR_P(key))) != NULL) {
  • Keys are represented as zend_string. Most functions have two forms. One receives a zend_string as key and the other a char*, length pair.
  • Important Note: Length of the key string does not include trailing zero. In some places +1/-1 has to be removed/added:
- if (zend_hash_find(ht, "value", sizeof("value"), (void**)&zv_ptr) == SUCCESS) {
+ if ((zv = zend_hash_str_find(ht, "value", sizeof("value")-1)) != NULL) {

This also applies to other hashtable-related APIs outside of zend_hash. For example:

- add_assoc_bool_ex(&zv, "valid", sizeof("valid"), 0);
+ add_assoc_bool_ex(&zv, "valid", sizeof("valid") - 1, 0);
  • API provides a separate group of functions to work with arbitrary pointers. Such functions have the same names with _ptr suffix.
- if (zend_hash_find(EG(class_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void**)&ce_ptr) == SUCCESS) {
+ if ((ce_ptr = zend_hash_find_ptr(EG(class_table), Z_STR_P(key))) != NULL) {

- if (zend_hash_update(EG(class_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)&ce, sizeof(zend_class_entry*), NULL) == SUCCESS) {
+ if (zend_hash_update_ptr(EG(class_table), Z_STR_P(key), ce) != NULL) {
  • API provides a separate group of functions to store memory blocks of arbitrary size. Such functions have the same names with _mem suffix and they are implemented as inline wrappers of the corresponding _ptr functions. It doesn't matter whether something was stored using the _mem or _ptr variant; it may always be retrieved using zend_hash_find_ptr().
- if (zend_hash_update(EG(function_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)func, sizeof(zend_function), NULL) == SUCCESS) {
+ if (zend_hash_update_mem(EG(function_table), Z_STR_P(key), func, sizeof(zend_function)) != NULL) {
  • A few new optimized functions for new element insertion were added. They are intended to be used in situations where code adds only new elements that can't overlap with already existing keys, for example when you copy some elements of one HashTable into a new one. All such functions have the _new suffix.
zval* zend_hash_add_new(HashTable *ht, zend_string *key, zval *zv);
zval* zend_hash_str_add_new(HashTable *ht, char *key, int len, zval *zv);
zval* zend_hash_index_add_new(HashTable *ht, ulong h, zval *zv);
zval* zend_hash_next_index_insert_new(HashTable *ht, zval *zv);
void* zend_hash_add_new_ptr(HashTable *ht, zend_string *key, void *pData);
...
  • HashTable destructors now always receive zval* (even if we use zend_hash_add_ptr or zend_hash_add_mem to add elements). Z_PTR_P() macro may be used to reach the actual pointer value in destructors. Also, if elements are added using zend_hash_add_mem, destructor is also responsible for deallocation of the pointers themselves.
- void my_ht_destructor(void *ptr)
+ void my_ht_destructor(zval *zv)
  {
-    my_ht_el_t *p = (my_ht_el_t*) ptr;
+    my_ht_el_t *p = (my_ht_el_t*) Z_PTR_P(zv);
     ...
+    efree(p); // this efree() is not always necessary
  }
  • Callbacks for all zend_hash_apply_*() functions, as well as callbacks for zend_hash_copy() and zend_hash_merge(), should be changed to receive zval* instead of void*, in the same way as destructors. Some of these functions also receive a pointer to a zend_hash_key structure. Its definition was changed in the following way. For string keys, h contains the value of the hash function and key the actual string. For integer keys, h contains the numeric key value and key is NULL.
typedef struct _zend_hash_key {
	ulong        h;
	zend_string *key;
} zend_hash_key;

In some cases, it makes sense to change usage of zend_hash_apply*() functions into usage of new HashTable iteration API. This may lead to smaller and more efficient code.

Reviewing zend_hash.h is a very good idea: https://github.com/php/php-src/blob/master/Zend/zend_hash.h
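
Two of the points above are easy to get wrong together: lookups now return a pointer (or NULL) instead of SUCCESS/FAILURE, and key lengths no longer include the trailing zero. The following standalone sketch models both with a tiny table that uses a linear scan instead of real hashing. model_table, model_str_update and model_str_find are hypothetical names for illustration and are not part of zend_hash.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature table; a linear scan stands in for real hashing. */
typedef struct {
    const char *key;
    size_t      len;    /* key length WITHOUT the trailing NUL */
    long        value;
} model_bucket;

typedef struct {
    model_bucket buckets[8];
    size_t       used;
} model_table;

/* New-style find: returns the value pointer directly, or NULL when missing. */
long *model_str_find(model_table *t, const char *key, size_t len) {
    assert(key != NULL);
    for (size_t i = 0; i < t->used; i++) {
        model_bucket *b = &t->buckets[i];
        if (b->len == len && memcmp(b->key, key, len) == 0) {
            return &b->value;
        }
    }
    return NULL;
}

/* New-style update: returns a pointer to the stored value, or NULL when full.
 * The key pointer is stored as-is, so it must outlive the table. */
long *model_str_update(model_table *t, const char *key, size_t len, long value) {
    long *slot = model_str_find(t, key, len);
    if (slot) {
        *slot = value;
        return slot;
    }
    if (t->used == sizeof(t->buckets) / sizeof(t->buckets[0])) {
        return NULL;
    }
    model_bucket *b = &t->buckets[t->used++];
    b->key = key;
    b->len = len;
    b->value = value;
    return &b->value;
}
```

In this model a lookup with sizeof("value") as the length misses an entry stored with sizeof("value") - 1, which mirrors the kind of silent mismatch that a forgotten +1/-1 adjustment causes during porting.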

HashTable Iteration API

We provide a few specialized macros to iterate through the elements (and keys) of HashTables. The first argument of each macro is the hashtable; the others are variables to be assigned on each iteration step.

  • ZEND_HASH_FOREACH_VAL(ht, val)
  • ZEND_HASH_FOREACH_KEY(ht, h, key)
  • ZEND_HASH_FOREACH_PTR(ht, ptr)
  • ZEND_HASH_FOREACH_NUM_KEY(ht, h)
  • ZEND_HASH_FOREACH_STR_KEY(ht, key)
  • ZEND_HASH_FOREACH_STR_KEY_VAL(ht, key, val)
  • ZEND_HASH_FOREACH_KEY_VAL(ht, h, key, val)

The most suitable macro should be used instead of the old reset, current, and move functions.

- HashPosition pos;
  ulong num_key;
- char *key;
- uint key_len;
+ zend_string *key;
- zval **pzv;
+ zval *zv;
-
- zend_hash_internal_pointer_reset_ex(ht, &pos);
- while (zend_hash_get_current_data_ex(ht, (void**)&pzv, &pos) == SUCCESS) {
-   if (zend_hash_get_current_key_ex(ht, &key, &key_len, &num_key, 0, &pos) == HASH_KEY_IS_STRING){
-   }
+ ZEND_HASH_FOREACH_KEY_VAL(ht, num_key, key, zv) {
+   if (key) { //HASH_KEY_IS_STRING
+   }
    ........
-   zend_hash_move_forward_ex(ht, &pos);
- }
+ } ZEND_HASH_FOREACH_END();
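
To show how such a pair of macros can wrap a caller-supplied loop body, here is a minimal standalone sketch. MODEL_FOREACH_KEY_VAL, MODEL_FOREACH_END, iter_table and iter_bucket are hypothetical names; the real macros in zend_hash.h are implemented differently, and this only imitates their shape (an opening macro that assigns the loop variables and a closing macro that ends the loop).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot layout; key is NULL for integer keys. */
typedef struct {
    const char   *key;
    unsigned long h;
    long          val;
    int           used;   /* 0 marks a deleted slot */
} iter_bucket;

typedef struct {
    iter_bucket *data;
    size_t       count;
} iter_table;

/* Opens a loop over live slots and assigns the caller's variables. */
#define MODEL_FOREACH_KEY_VAL(t, _h, _key, _val) do { \
        iter_table *_t = (t); \
        for (size_t _i = 0; _i < _t->count; _i++) { \
            iter_bucket *_b = &_t->data[_i]; \
            if (!_b->used) continue; \
            _h = _b->h; \
            _key = _b->key; \
            _val = &_b->val;

/* Closes the loop opened by MODEL_FOREACH_KEY_VAL(). */
#define MODEL_FOREACH_END() \
        } \
    } while (0)

/* Sum the values stored under string keys only. */
long sum_string_keyed(iter_table *t) {
    unsigned long h;
    const char *key;
    long *val;
    long sum = 0;

    assert(t != NULL);
    MODEL_FOREACH_KEY_VAL(t, h, key, val) {
        (void)h;              /* the numeric key is not needed here */
        if (key) {            /* string key, like HASH_KEY_IS_STRING */
            sum += *val;
        }
    } MODEL_FOREACH_END();
    return sum;
}
```

Because the position lives in a local variable of the macro, nothing in the table is modified while iterating, which is the same property that makes this style usable on immutable arrays.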

Objects

TODO: …

Custom Objects

TODO: …

zend_object struct is defined as:

struct _zend_object {
    zend_refcounted   gc;
    zend_uint         handle; // TODO: may be removed ???
    zend_class_entry *ce;
    const zend_object_handlers *handlers;
    HashTable        *properties;
    HashTable        *guards; /* protects from __get/__set ... recursion */
    zval              properties_table[1];
};

We inlined the properties_table for better access performance, but that also brings a problem. We used to define a custom object like this:

struct custom_object {
   zend_object std;
   void  *custom_data;
}
 
 
zend_object_value custom_object_new(zend_class_entry *ce TSRMLS_DC) {
 
   zend_object_value retval;
   struct custom_object *intern;
 
   intern = emalloc(sizeof(struct custom_object));
   zend_object_std_init(&intern->std, ce TSRMLS_CC);
   object_properties_init(&intern->std, ce);
   retval.handle = zend_objects_store_put(intern,
        (zend_objects_store_dtor_t)zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) custom_free_storage, 
        NULL TSRMLS_CC);
   intern->handle = retval.handle;
   retval.handlers = &custom_object_handlers;
   return retval;
}
 
struct custom_object* obj = (struct custom_object *)zend_objects_get_address(getThis());

But zend_object is now variable-length (because of the inlined properties_table), so the code above should be changed to:

struct custom_object {
   void  *custom_data;
   zend_object std;
};
 
zend_object * custom_object_new(zend_class_entry *ce TSRMLS_DC) {
     /* Allocate sizeof(custom) + sizeof(properties table requirements) */
     struct custom_object *intern = ecalloc(1, 
         sizeof(struct custom_object) + 
         zend_object_properties_size(ce));
     /* Allocating:
      * struct custom_object {
      *    void *custom_data;
      *    zend_object std;
      * }
      * zval[ce->default_properties_count-1]
      */
     zend_object_std_init(&intern->std, ce TSRMLS_CC);
     ...
     custom_object_handlers.offset = XtOffsetOf(struct custom_object, std);
     custom_object_handlers.free_obj = custom_free_storage;
 
     return &intern->std;
}
 
/* Fetching the custom object: */
 
static inline struct custom_object * php_custom_object_fetch_object(zend_object *obj) {
      return (struct custom_object *)((char *)obj - XtOffsetOf(struct custom_object, std));
}
 
#define Z_CUSTOM_OBJ_P(zv) php_custom_object_fetch_object(Z_OBJ_P(zv))
 
struct custom_object* obj = Z_CUSTOM_OBJ_P(getThis());
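
The pointer arithmetic behind this layout can be tried outside of PHP. The sketch below is a standalone model that uses the standard offsetof() in place of XtOffsetOf(); model_object, model_custom and the model_custom_*() functions are hypothetical names, and the inline slots array only imitates the role of properties_table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for zend_object: a header followed by inline slots. */
typedef struct {
    int  handle;
    int  slot_count;
    long slots[1];        /* grows past the struct, like properties_table */
} model_object;

/* Custom data first, the variable-length standard object last. */
typedef struct {
    void        *custom_data;
    model_object std;
} model_custom;

/* Allocate the wrapper plus room for the extra inline slots. */
model_object *model_custom_new(int slot_count) {
    size_t extra = slot_count > 1 ? (size_t)(slot_count - 1) * sizeof(long) : 0;
    model_custom *intern = calloc(1, sizeof(model_custom) + extra);
    if (!intern) {
        return NULL;
    }
    intern->std.slot_count = slot_count;
    return &intern->std;   /* callers only ever see the embedded object */
}

/* Step back from the embedded object to the start of the allocation. */
model_custom *model_custom_from_obj(model_object *obj) {
    assert(obj != NULL);
    return (model_custom *)((char *)obj - offsetof(model_custom, std));
}

/* Free the whole allocation, starting from its real beginning. */
void model_custom_free(model_object *obj) {
    free(model_custom_from_obj(obj));
}
```

If the embedded object were placed first, as in the old layout, the inline slots would grow into custom_data; placing it last is what keeps the variable-length tail from overwriting the custom fields.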

zend_object_handlers

A new item, offset, is added to zend_object_handlers. You should always define it as the offset of the zend_object in your custom object struct.

It is used by zend_objects_store_* to find the right start address of the allocated memory.

// An example in spl_array
memcpy(&spl_handler_ArrayObject, zend_get_std_object_handlers(), sizeof(zend_object_handlers));
spl_handler_ArrayObject.offset = XtOffsetOf(spl_array_object, std);

The memory of the object is now released by zend_objects_store_*, so you should not free it in your custom object's free_obj handler.

Resources

  • zvals of type IS_RESOURCE don't keep a resource handle anymore. The resource handle can't be retrieved using Z_LVAL*(). Instead you should use the Z_RES*() macro to retrieve the resource record directly. The resource record is represented by the zend_resource structure. It contains type - the resource type, ptr - a pointer to the actual data, handle - the numeric resource index (for compatibility) and service fields for the reference counter. This zend_resource structure is a replacement for the indirectly referenced zend_rsrc_list_entry. All occurrences of zend_rsrc_list_entry should be replaced by zend_resource.
  • The zend_list_find() function is removed, because resources are accessed directly.
- long handle = Z_LVAL_P(zv);
- int  type;
- void *ptr = zend_list_find(handle, &type);
+ long handle = Z_RES_P(zv)->handle;
+ int  type = Z_RES_P(zv)->type;
+ void *ptr = Z_RES_P(zv)->ptr;
  • The Z_RESVAL_*() macro is removed. Z_RES*() may be used instead
- long handle = Z_RESVAL_P(zv);
+ long handle = Z_RES_P(zv)->handle;
  • ZEND_REGISTER_RESOURCE/ZEND_FETCH_RESOURCE() are dropped
- ZEND_FETCH_RESOURCE2(ib_link, ibase_db_link *, &link_arg, link_id, LE_LINK, le_link, le_plink);

//if you are sure that link_arg is a IS_RESOURCE type, then use :
+if ((ib_link = (ibase_db_link *)zend_fetch_resource2(Z_RES_P(link_arg), LE_LINK, le_link, le_plink)) == NULL) {
+    RETURN_FALSE;
+}

//otherwise, if you know nothing about link_arg's type, use
+if ((ib_link = (ibase_db_link *)zend_fetch_resource2_ex(link_arg, LE_LINK, le_link, le_plink)) == NULL) {
+    RETURN_FALSE;
+}

- REGISTER_RESOURCE(return_value, result, le_result);
+ RETURN_RES(zend_register_resource(result, le_result));
  • The zend_list_addref() and zend_list_delref() functions are removed. Resources use the same mechanism for reference counting as all zvals.
- zend_list_addref(Z_LVAL_P(zv));
+ Z_ADDREF_P(zv);

This is equivalent to:

- zend_list_addref(Z_LVAL_P(zv));
+ Z_RES_P(zv)->gc.refcount++;
  • zend_list_delete() takes pointer to zend_resource structure instead of resource handle
- zend_list_delete(Z_LVAL_P(zv));
+ zend_list_delete(Z_RES_P(zv));
  • In most user extension functions like mysql_close(), you should use zend_list_close() instead of zend_list_delete(). This closes the actual connection and frees extension-specific data structures, but doesn't free the zend_resource structure, which might still be referenced from zval(s). It also doesn't decrement the resource reference counter.
- zend_list_delete(Z_LVAL_P(zv));
+ zend_list_close(Z_RES_P(zv));

Parameters Parsing API changes

  • PHPNG doesn't work with zval** anymore, so it doesn't need 'Z' specifier anymore. It must be replaced by 'z'.
- zval **pzv;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z", &pzv) == FAILURE) {
+ zval *zv;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zv) == FAILURE) {
  • In addition to the 's' specifier that expects a string, PHPNG introduced the 'S' specifier that also expects a string but places the argument into a zend_string variable. In some cases direct usage of zend_string is preferred (for example, when the received string is used as a key in the HashTable API).
- char *str;
- int len;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &len) == FAILURE) {
+ zend_string *str;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "S", &str) == FAILURE) {
  • '+' and '*' specifiers now return just array of zvals (instead of array of zval**s before)
- zval ***argv = NULL;
+ zval *argv = NULL;
  int argn;
  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "+", &argv, &argn) == FAILURE) {
  • Arguments passed by reference should be assigned into the referenced value. It's possible to separate such arguments to get the referenced value in the first place.
- zval **ret;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z", &ret) == FAILURE) {
+ zval *ret;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z/", &ret) == FAILURE) {
    return;
  }
- ZVAL_LONG(*ret, 0);
+ ZVAL_LONG(ret, 0);

Call Frame Changes (zend_execute_data)

Information about each function call is recorded in a chain of zend_execute_data structures. EG(current_execute_data) points to the call frame of the currently executed function (previously zend_execute_data structures were created only for user-level PHP functions). I'll try to explain the difference between the old and new call frame structures field by field.

  • zend_execute_data.opline - instruction pointer of the currently executed user function. For internal functions its value is undefined (previously for internal functions its value was NULL).
  • zend_execute_data.function_state - this field was removed. zend_execute_data.call should be used instead.
  • zend_execute_data.call - previously it was a pointer to the current call_slot. Now it's a pointer to the zend_execute_data of the function currently being called. This field is initially NULL, then it's changed by ZEND_INIT_FCALL (or similar) opcodes and then restored by ZEND_DO_FCALL. Syntactically nested function calls, like foo($a, bar($c)), construct a chain of such structures linked through zend_execute_data.prev_nested_call
  • zend_execute_data.op_array - this field was replaced by zend_execute_data.func, because now it may represent not only user functions but also internal ones.
  • zend_execute_data.func - currently executed function
  • zend_execute_data.object - $this of the currently executed function (previously it was a zval*, now it's a zend_object*)
  • zend_execute_data.symbol_table - current symbol table or NULL
  • zend_execute_data.prev_execute_data - link of backtrace call chain
  • original_return_value, current_scope, current_called_scope, current_this - these fields kept old values to restore them after call. Now they are removed.
  • zend_execute_data.scope - scope of the currently executed function (this is a new field).
  • zend_execute_data.called_scope - called_scope of the currently executed function (this is a new field).
  • zend_execute_data.run_time_cache - run-time-cache of the currently executed function. this is a new field and actually it's a copy of op_array.run_time_cache.
  • zend_execute_data.num_args - number of arguments passed to the function (this is a new field)
  • zend_execute_data.return_value - pointer to zval* where the currently executed op_array should store the result. It may be NULL if call doesn't care about return value. (this is a new field).

Arguments to functions are stored in zval slots directly after the zend_execute_data structure. They may be accessed using the ZEND_CALL_ARG(execute_data, arg_num) macro. For user PHP functions the first argument overlaps with the first compiled variable - CV0, etc. In case the caller passes more arguments than the callee receives, all extra arguments are copied to be after all the CVs and TMP variables used by the callee.

Executor Globals - EG() Changes

  • EG(symbol_table) - was turned into a zend_array (previously it was a HashTable). It's not a big problem to reach the underlying HashTable
- symbols = zend_hash_num_elements(&EG(symbol_table));
+ symbols = zend_hash_num_elements(&EG(symbol_table).ht);
  • EG(uninitialized_zval_ptr) and EG(error_zval_ptr) were removed. Use &EG(uninitialized_zval) and &EG(error_zval) instead.
  • EG(current_execute_data) - the meaning of this field was changed a bit. Previously it was a pointer to call frame of last executed PHP function. Now it's a pointer to last executed call frame (never mind if it's user or internal function). It's possible to get the zend_execute_data structure for the last op_array traversing call chain list.
  zend_execute_data *ex = EG(current_execute_data);
+ while (ex && (!ex->func || !ZEND_USER_CODE(ex->func->type))) {
+    ex = ex->prev_execute_data;
+ }
  if (ex) {
  • EG(opline_ptr) - was removed. Use execute_data->opline instead.
  • EG(return_value_ptr_ptr) - was removed. Use execute_data->return_value instead.
  • EG(active_symbol_table) - was removed. Use execute_data->symbol_table instead.
  • EG(active_op_array) - was removed. Use execute_data->func instead.
  • EG(called_scope) - was removed. Use execute_data->called_scope instead.
  • EG(This) - was turned into zval, previously it was a pointer to zval. User code shouldn't modify it.
  • EG(in_execution) - was removed. If EG(current_execute_data) is not NULL, we are executing something.
  • EG(exception) and EG(prev_exception) - were turned to be pointers to zend_object, previously they were pointers to zval.
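
The call-chain walk shown above for EG(current_execute_data) can be modelled in isolation. The types below (model_func, model_frame) are hypothetical stand-ins with only the fields the loop needs; the real zend_execute_data and ZEND_USER_CODE() carry more detail than this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real zend_function and zend_execute_data are richer. */
typedef enum { FUNC_INTERNAL, FUNC_USER } model_func_type;

typedef struct {
    model_func_type type;
    const char     *name;
} model_func;

typedef struct model_frame {
    const model_func   *func;               /* may be NULL for a dummy frame */
    struct model_frame *prev_execute_data;  /* backtrace link */
} model_frame;

/* Return the innermost frame that runs user code, or NULL if there is none. */
const model_frame *nearest_user_frame(const model_frame *ex) {
    while (ex && (!ex->func || ex->func->type != FUNC_USER)) {
        ex = ex->prev_execute_data;
    }
    /* A non-NULL result always has a function attached. */
    assert(ex == NULL || ex->func != NULL);
    return ex;
}
```

The NULL check on func comes first in the loop condition so that frames without a function are skipped rather than dereferenced.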

Opcodes changes

  • ZEND_DO_FCALL_BY_NAME - was removed, ZEND_INIT_FCALL_BY_NAME was added.
  • ZEND_BIND_GLOBAL - was added to handle “global $var”
  • ZEND_STRLEN - was added to replace strlen function
  • ZEND_TYPE_CHECK - was added to replace is_array/is_int/is_* if possible
  • ZEND_DEFINED - was added to replace zif_defined if possible (if only one parameter and it's constant string and it's not in namespace style)
  • ZEND_SEND_VAR_EX - was added to perform more checks than ZEND_SEND_VAR when the condition cannot be settled at compile time
  • ZEND_SEND_VAL_EX - was added to perform more checks than ZEND_SEND_VAL when the condition cannot be settled at compile time
  • ZEND_INIT_USER_CALL - was added to replace call_user_func(_array) where possible when the function cannot be found at compile time; otherwise the call can be converted to ZEND_INIT_FCALL
  • ZEND_SEND_ARRAY - was added to send the second parameter (the array) of call_user_func_array after it is converted to an opcode
  • ZEND_SEND_USER - was added to send the parameters of call_user_func after it is converted to an opcode

temp_variable

PCRE

Some PCRE APIs use or return zend_string now. For example, php_pcre_replace returns a zend_string and takes a zend_string as its first argument. Double-check their declarations as well as compiler warnings, which are very likely about wrong argument types.

phpng-upgrading.txt · Last modified: 2015/02/11 04:48 by laruence