====== Upgrading PHP extensions from PHP5 to NG ======

Many of the frequently used API functions have changed, such as the ''HashTable'' API; this page intends to document as many as possible of those changes that actually affect the way extension and core code is written. It's highly recommended to read the general information about the PHPNG implementation at [[phpng-int]] before reading this guide.

This is not a complete guide that covers every possible situation. It is a collection of recipes for the most common cases, and it should be enough for most user-level extensions. However, if you did not find some information here, found a solution, and think it may be useful for others, feel free to add your recipe.

===== General Advice =====

  * Try to compile your extension with PHPNG and look at the compilation errors and warnings. They should show you about 75% of the places that have to be changed.
  * Compile and test extensions in debug mode (configure PHP with --enable-debug). This enables catching some errors at run time using assert(). You'll also see information about memory leaks.

===== zval =====

  * PHPNG doesn't require any involvement of pointers to pointers to zval. Most occurrences of **zval%%**%%** variables and parameters have to be changed into **zval%%*%%**. The corresponding **Z_*_PP()** macros that work with such variables should be changed into **Z_*_P()**.
  * In many places PHPNG works with zval directly (eliminating the need for allocation and deallocation). In these cases the corresponding **zval%%*%%** variable should be converted into a plain **zval**, macros that use this variable from **Z_*_P()** into **Z_*()**, and the corresponding creation macros from **ZVAL_*(var, ...)** into **ZVAL_*(&var, ...)**. Always be careful about passing addresses of **zval** and about the **&** operator. PHPNG almost never requires passing the address of a **zval%%*%%**. In some places the **&** operator should be removed.
  * The **zval** allocation macros **ALLOC_ZVAL**, **ALLOC_INIT_ZVAL** and **MAKE_STD_ZVAL** are removed. In most cases their usage indicates that a **zval%%*%%** needs to be changed into a plain **zval**. The macro **INIT_PZVAL** is removed as well, and in most cases its usages should simply be deleted.

<code diff>
- zval *zv;
- ALLOC_INIT_ZVAL(zv);
- ZVAL_LONG(zv, 0);
+ zval zv;
+ ZVAL_LONG(&zv, 0);
</code>

  * The **zval** struct has completely changed. Now it's defined as:

<code c>
struct _zval_struct {
    zend_value        value;            /* value */
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar    type,         /* active type */
                zend_uchar    type_flags,
                zend_uchar    const_flags,
                zend_uchar    reserved)     /* various IS_VAR flags */
        } v;
        zend_uint type_info;
    } u1;
    union {
        zend_uint     var_flags;
        zend_uint     next;                 /* hash collision chain */
        zend_uint     str_offset;           /* string offset */
        zend_uint     cache_slot;           /* literal cache slot */
    } u2;
};
</code>

and ''zend_value'' as

<code c>
typedef union _zend_value {
    long              lval;             /* long value */
    double            dval;             /* double value */
    zend_refcounted  *counted;
    zend_string      *str;
    zend_array       *arr;
    zend_object      *obj;
    zend_resource    *res;
    zend_reference   *ref;
    zend_ast_ref     *ast;
    zval             *zv;
    void             *ptr;
    zend_class_entry *ce;
    zend_function    *func;
} zend_value;
</code>

  * The main difference is that scalar and complex types are now handled differently. PHP doesn't allocate scalar values on the heap but stores them directly on the VM stack, inside **HashTable**s and objects. They are no longer subject to reference counting and garbage collection. Scalar values don't have a reference counter and don't support the **Z_ADDREF*()**, **Z_DELREF*()**, **Z_REFCOUNT*()** and **Z_SET_REFCOUNT*()** macros anymore. In most cases you should check whether a zval supports these macros before calling them. Otherwise you'll get an assert() failure or a crash.
<code diff>
- Z_ADDREF_P(zv)
+ if (Z_REFCOUNTED_P(zv)) {Z_ADDREF_P(zv);}
# or equivalently
+ Z_TRY_ADDREF_P(zv);
</code>

  * **zval** values should be copied using the **ZVAL_COPY_VALUE()** macro
  * It's possible to copy and increment the reference counter if necessary using the **ZVAL_COPY()** macro
  * Duplication of a **zval** (**zval_copy_ctor**) may be done using the **ZVAL_DUP()** macro
  * If you converted a ''zval*'' into a ''zval'' and previously used ''NULL'' to indicate an undefined value, you can now use the ''IS_UNDEF'' type instead. It can be set using ''ZVAL_UNDEF(&zv)'' and checked using ''if (Z_ISUNDEF(zv))''.
  * If you want to get the long/double/string value of a zval using cast semantics without modifying the original zval, you can now use the ''zval_get_long(zv)'', ''zval_get_double(zv)'' and ''zval_get_string(zv)'' APIs to simplify the code:

<code diff>
- zval tmp;
- ZVAL_COPY_VALUE(&tmp, zv);
- zval_copy_ctor(&tmp);
- convert_to_string(&tmp);
- // ...
- zval_dtor(&tmp);
+ zend_string *str = zval_get_string(zv);
+ // ...
+ zend_string_release(str);
</code>

Look into zend_types.h code for more details: https://github.com/php/php-src/blob/master/Zend/zend_types.h

===== References =====

A **zval** in PHPNG doesn't have an **is_ref** flag anymore. References are implemented using a separate complex reference-counted type, **IS_REFERENCE**. You may still use the **Z_ISREF*()** macros to check whether the given zval is a reference. Actually, they just check whether the type of the given zval is equal to **IS_REFERENCE**. Macros that worked with the **is_ref** flag are removed: **Z_SET_ISREF*()**, **Z_UNSET_ISREF*()** and **Z_SET_ISREF_TO*()**. Their usage should be changed in the following way:

<code diff>
- Z_SET_ISREF_P(zv);
+ ZVAL_MAKE_REF(zv);

- Z_UNSET_ISREF_P(zv);
+ if (Z_ISREF_P(zv)) {ZVAL_UNREF(zv);}
</code>

Previously the referenced type could be checked directly on the reference.
Now we have to check it indirectly through the **Z_REFVAL*()** macro

<code diff>
- if (Z_ISREF_P(zv) && Z_TYPE_P(zv) == IS_ARRAY) {
+ if (Z_ISREF_P(zv) && Z_TYPE_P(Z_REFVAL_P(zv)) == IS_ARRAY) {
</code>

or perform manual dereferencing using the **ZVAL_DEREF()** macro

<code diff>
- if (Z_ISREF_P(zv)) {...}
- if (Z_TYPE_P(zv) == IS_ARRAY) {
+ if (Z_ISREF_P(zv)) {...}
+ ZVAL_DEREF(zv);
+ if (Z_TYPE_P(zv) == IS_ARRAY) {
</code>

===== Booleans =====

IS_BOOL does not exist anymore; IS_TRUE and IS_FALSE are now types of their own:

<code diff>
- if ((Z_TYPE_PP(item) == IS_BOOL || Z_TYPE_PP(item) == IS_LONG) && Z_LVAL_PP(item)) {
+ if (Z_TYPE_P(item) == IS_TRUE || (Z_TYPE_P(item) == IS_LONG && Z_LVAL_P(item))) {
</code>

The **Z_BVAL*()** macros are removed. Be careful: the return value of **Z_LVAL*()** on IS_FALSE/IS_TRUE is undefined.

===== Strings =====

The value and length of a string may be accessed using the same macros, **Z_STRVAL*()** and **Z_STRLEN*()**. However, the underlying data structure for string representation is now **zend_string** (it's described in a separate section). The **zend_string** may be retrieved from a zval with the **Z_STR*()** macro. It's also possible to get the hash value of the string through **Z_STRHASH*()**.

In case code needs to check whether the given string is interned, this should now be done using the **zend_string** (not the **char%%*%%**)

<code diff>
- if (IS_INTERNED(Z_STRVAL_P(zv))) {
+ if (IS_INTERNED(Z_STR_P(zv))) {
</code>

Creation of string zvals has changed a little. Previously, macros like ZVAL_STRING() had an additional argument that told whether the given characters should be duplicated. Now these macros always have to create a **zend_string** structure, so this parameter became useless. However, if its actual value was 0, you have to free the original string to avoid a memory leak.
<code diff>
- ZVAL_STRING(zv, str, 1);
+ ZVAL_STRING(zv, str);

- ZVAL_STRINGL(zv, str, len, 1);
+ ZVAL_STRINGL(zv, str, len);

- ZVAL_STRING(zv, str, 0);
+ ZVAL_STRING(zv, str);
+ efree(str);

- ZVAL_STRINGL(zv, str, len, 0);
+ ZVAL_STRINGL(zv, str, len);
+ efree(str);
</code>

The same is true for similar macros like **RETURN_STRING()**, **RETVAL_STRINGL()**, etc. and for some internal API functions.

<code diff>
- add_assoc_string(zv, key, str, 1);
+ add_assoc_string(zv, key, str);

- add_assoc_string(zv, key, str, 0);
+ add_assoc_string(zv, key, str);
+ efree(str);
</code>

The double reallocation may be avoided by using the **zend_string** API directly and creating the zval directly from a **zend_string**.

<code diff>
- char * str = estrdup("Hello");
- RETURN_STRING(str);
+ zend_string *str = zend_string_init("Hello", sizeof("Hello")-1, 0);
+ RETURN_STR(str);
</code>

**Z_STRVAL*()** should now be treated as a read-only object. It's not possible to assign anything to it. It's possible to modify separate characters, but before doing so you must be sure that this string is not referenced from anywhere else (it is not interned and its reference counter is 1). Also, after an in-place string modification you might need to reset the calculated hash value.

<code diff>
  SEPARATE_ZVAL(zv);
  Z_STRVAL_P(zv)[0] = Z_STRVAL_P(zv)[0] + ('A' - 'a');
+ zend_string_forget_hash_val(Z_STR_P(zv));
</code>

===== zend_string API =====

Zend has a new **zend_string** API. Besides being the underlying structure for string representation in **zval**, these structures are also used throughout much of the codebase where **char%%*%%** and **int** were used before.

**zend_string**s (not **IS_STRING** zvals) may be created using the **zend_string_init(char *val, size_t len, int persistent)** function. The actual characters may be accessed as **str->val** and the string length as **str->len**. The hash value of the string should be accessed through the **zend_string_hash_val** function. It'll re-calculate the hash value if necessary.
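The layout described above (length, inline characters, and a hash that is computed on first use and then cached) can be pictured with a small standalone model. This is only a sketch: ''demo_string'' and the ''demo_string_*'' functions are hypothetical names invented for illustration, they are not part of the Zend API, and the times-33 hash and the "0 means not computed yet" convention are assumptions of the model rather than a statement about the engine's exact implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for zend_string (NOT the real definition). */
typedef struct demo_string {
    unsigned int  refcount; /* number of owners of this string */
    unsigned long h;        /* cached hash value; 0 means "not computed yet" */
    size_t        len;      /* length, not counting the trailing NUL */
    char          val[];    /* characters stored inline after the header */
} demo_string;

/* Models zend_string_init(): copy len bytes into one allocation. */
static demo_string *demo_string_init(const char *val, size_t len)
{
    assert(val != NULL);
    demo_string *s = malloc(sizeof(demo_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->h = 0;
    s->len = len;
    memcpy(s->val, val, len);
    s->val[len] = '\0';
    return s;
}

/* Models zend_string_hash_val(): compute once, then reuse the cached value. */
static unsigned long demo_string_hash_val(demo_string *s)
{
    if (s->h == 0) {
        unsigned long hash = 5381UL;
        for (size_t i = 0; i < s->len; i++) {
            hash = hash * 33 + (unsigned char)s->val[i];
        }
        /* Set the top bit so a computed hash can never be 0 (model assumption). */
        s->h = hash | ~(~0UL >> 1);
    }
    return s->h;
}

/* Drops one owner; the memory is freed only when the last owner lets go. */
static void demo_string_release(demo_string *s)
{
    if (--s->refcount == 0) {
        free(s);
    }
}
```

In this model a caller would create a string with ''demo_string_init("Hello", sizeof("Hello") - 1)'', read ''s->val'' and ''s->len'' directly, call ''demo_string_hash_val(s)'' whenever the hash is needed, and finish with ''demo_string_release(s)''. Resetting ''s->h'' to 0 after an in-place modification plays the role that **zend_string_forget_hash_val()** has in the real API.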
Strings should be deallocated using the **zend_string_release()** function, which doesn't necessarily free memory, because the same string may be referenced from several places.

If you are going to keep a **zend_string** pointer somewhere, you should increase its reference counter or use the **zend_string_copy()** function, which will do it for you. In many places where code copied characters just to keep the value (not to modify it), it's possible to use this function instead.

<code diff>
- ptr->str = estrndup(Z_STRVAL_P(zv), Z_STRLEN_P(zv));
+ ptr->str = zend_string_copy(Z_STR_P(zv));
  ...
- efree(str);
+ zend_string_release(str);
</code>

In case the copied string is going to be changed, you may use **zend_string_dup()** instead

<code diff>
- char *str = estrndup(Z_STRVAL_P(zv), Z_STRLEN_P(zv));
+ zend_string *str = zend_string_dup(Z_STR_P(zv));
  ...
- efree(str);
+ zend_string_release(str);
</code>

Code that uses the old macros should still work as well, so switching to the new ones is not necessary.

In some cases it makes sense to allocate the string buffer before the actual string data is known. You may use the **zend_string_alloc()** and **zend_string_realloc()** functions to do it.

<code diff>
- char *ret = emalloc(16+1);
- md5(something, ret);
- RETURN_STRINGL(ret, 16, 0);
+ zend_string *ret = zend_string_alloc(16, 0);
+ md5(something, ret->val);
+ RETURN_STR(ret);
</code>

Not all extension code has to be converted to use **zend_string** instead of **char%%*%%**. It's up to the extension maintainer to decide which type is more suitable in each particular case.

Look into zend_string.h code for more details: https://github.com/php/php-src/blob/master/Zend/zend_string.h

===== smart_str and smart_string =====

For a consistent naming convention the old **smart_str** API was renamed to **smart_string**. It may be used as before except for the new names.
<code diff>
- smart_str str = {0};
- smart_str_appendl(&str, " ", sizeof(" ") - 1);
- smart_str_0(&str);
- RETURN_STRINGL(str.c, str.len, 0);
+ smart_string str = {0};
+ smart_string_appendl(&str, " ", sizeof(" ") - 1);
+ smart_string_0(&str);
+ RETVAL_STRINGL(str.c, str.len);
+ smart_string_free(&str);
</code>

In addition we introduced a new **smart_str** API that works with **zend_string** directly

<code diff>
- smart_str str = {0};
- smart_str_appendl(&str, " ", sizeof(" ") - 1);
- smart_str_0(&str);
- RETURN_STRINGL(str.c, str.len, 0);
+ smart_str str = {0};
+ smart_str_appendl(&str, " ", sizeof(" ") - 1);
+ smart_str_0(&str);
+ if (str.s) {
+   RETURN_STR(str.s);
+ } else {
+   RETURN_EMPTY_STRING();
+ }
</code>

''smart_str'' is defined as

<code c>
typedef struct {
    zend_string *s;
    size_t a;
} smart_str;
</code>

The APIs of **smart_str** and **smart_string** are very similar, and they actually repeat the API used in PHP5, so adapting the code should not be a big problem. The biggest question is which API to select for each particular case, and that depends on the way the final result is used.

Note that a previous check for an empty smart_str might need to be changed

<code diff>
- if (smart_str->c) {
+ if (smart_str->s) {
</code>

===== strpprintf =====

In addition to the **spprintf()** and **vspprintf()** functions we introduced similar functions that produce a **zend_string** instead of a **char%%*%%**. It's up to you to decide when you should change to the new variants.

<code c>
PHPAPI zend_string *vstrpprintf(size_t max_len, const char *format, va_list ap);
PHPAPI zend_string *strpprintf(size_t max_len, const char *format, ...);
</code>

===== Arrays =====

Arrays are implemented more or less the same as before. However, where the underlying structure was previously implemented as a pointer to a **HashTable**, we now have a pointer to a **zend_array** that keeps the **HashTable** inside. The HashTable may be read as before using the **Z_ARRVAL*()** macros, but it's no longer possible to change the pointer to the **HashTable**.
It's only possible to get/set the pointer to the whole **zend_array** through the **Z_ARR*()** macro.

The best way to create arrays is to use the old **array_init()** function, but it's also possible to create new uninitialized arrays using **ZVAL_NEW_ARR()** or to initialize a zval from an existing **zend_array** structure through **ZVAL_ARR()**.

Some arrays might be immutable (this may be checked using the **Z_IMMUTABLE()** macro). If code needs to modify them, they have to be duplicated first. Iteration through immutable arrays using the internal ''position pointer'' is not possible either. It's possible to walk through such arrays using the old iteration API with an external ''position pointer'' or using the new HashTable iteration API described in a separate section.

===== HashTable API =====

The HashTable API has changed significantly, and this may cause some trouble when porting extensions.

  * First of all, HashTables now always keep **zval**s. Even if we store an arbitrary pointer, it's packed into a **zval** with the special type **IS_PTR**. In any case, this simplifies work with **zval**s

<code diff>
- if (zend_hash_update(ht, Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)&zv, sizeof(zval**), NULL) == SUCCESS) {
+ if (zend_hash_update(ht, Z_STR_P(key), zv) != NULL) {
</code>

  * Most API functions return the requested values directly (instead of using an additional by-reference argument and returning SUCCESS/FAILURE).

<code diff>
- if (zend_hash_find(ht, Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void**)&zv_ptr) == SUCCESS) {
+ if ((zv = zend_hash_find(ht, Z_STR_P(key))) != NULL) {
</code>

  * Keys are represented as **zend_string**. Most functions have two forms. One receives a **zend_string** as key and the other a **char%%*%%, length** pair.
  * **Important Note:** The length of the key string does not include the trailing zero.
In some places +1/-1 has to be removed from or added to the key length:

<code diff>
- if (zend_hash_find(ht, "value", sizeof("value"), (void**)&zv_ptr) == SUCCESS) {
+ if ((zv = zend_hash_str_find(ht, "value", sizeof("value")-1)) != NULL) {
</code>

This also applies to other hashtable-related APIs outside of zend_hash. For example:

<code diff>
- add_assoc_bool_ex(&zv, "valid", sizeof("valid"), 0);
+ add_assoc_bool_ex(&zv, "valid", sizeof("valid") - 1, 0);
</code>

  * The API provides a separate group of functions to work with arbitrary pointers. Such functions have the same names with a **_ptr** suffix.

<code diff>
- if (zend_hash_find(EG(class_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void**)&ce_ptr) == SUCCESS) {
+ if ((ce_ptr = zend_hash_find_ptr(EG(class_table), Z_STR_P(key))) != NULL) {

- if (zend_hash_update(EG(class_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)&ce, sizeof(zend_class_entry*), NULL) == SUCCESS) {
+ if (zend_hash_update_ptr(EG(class_table), Z_STR_P(key), ce) != NULL) {
</code>

  * The API provides a separate group of functions to store memory blocks of arbitrary size. Such functions have the same names with a **_mem** suffix, and they are implemented as inline wrappers of the corresponding **_ptr** functions. It doesn't matter whether something was stored using the **_mem** or the **_ptr** variant: it may always be retrieved using **zend_hash_find_ptr()**.

<code diff>
- if (zend_hash_update(EG(function_table), Z_STRVAL_P(key), Z_STRLEN_P(key)+1, (void*)func, sizeof(zend_function), NULL) == SUCCESS) {
+ if (zend_hash_update_mem(EG(function_table), Z_STR_P(key), func, sizeof(zend_function)) != NULL) {
</code>

  * A few new optimized functions for inserting new elements were added. They are intended to be used in situations where code adds only new elements that can't overlap with already existing keys, for example when you copy some elements of one HashTable into a new one. All such functions have a **_new** suffix.
<code c>
zval* zend_hash_add_new(HashTable *ht, zend_string *key, zval *zv);
zval* zend_hash_str_add_new(HashTable *ht, char *key, int len, zval *zv);
zval* zend_hash_index_add_new(HashTable *ht, ulong h, zval *zv);
zval* zend_hash_next_index_insert_new(HashTable *ht, zval *zv);
void* zend_hash_add_new_ptr(HashTable *ht, zend_string *key, void *pData);
...
</code>

  * HashTable destructors now always receive a **zval%%*%%** (even if we use **zend_hash_add_ptr** or **zend_hash_add_mem** to add elements). The **Z_PTR_P()** macro may be used to reach the actual pointer value in destructors. Also, if elements are added using **zend_hash_add_mem**, the destructor is also responsible for deallocating the pointers themselves.

<code diff>
- void my_ht_destructor(void *ptr)
+ void my_ht_destructor(zval *zv)
  {
-    my_ht_el_t *p = (my_ht_el_t*) ptr;
+    my_ht_el_t *p = (my_ht_el_t*) Z_PTR_P(zv);
     ...
+    efree(p); // this efree() is not always necessary
  }
</code>

  * Callbacks for all **zend_hash_apply_%%*%%()** functions, as well as callbacks for **zend_hash_copy()** and **zend_hash_merge()**, should be changed to receive a **zval%%*%%** instead of a **void%%*%%**, in the same way as destructors. Some of these functions also receive a pointer to a **zend_hash_key** structure. Its definition was changed in the following way. For ''string'' keys, ''h'' contains the value of the hash function and ''key'' the actual string. For ''integer'' keys, ''h'' contains the numeric key value and ''key'' is NULL.

<code c>
typedef struct _zend_hash_key {
    ulong        h;
    zend_string *key;
} zend_hash_key;
</code>

In some cases it makes sense to replace usage of the **zend_hash_apply*()** functions with the new HashTable iteration API. This may lead to smaller and more efficient code.

Reviewing zend_hash.h is a very good idea: https://github.com/php/php-src/blob/master/Zend/zend_hash.h

===== HashTable Iteration API =====

We provide a few specialized macros to iterate through the elements (and keys) of HashTables.
The first argument of the macros is the hashtable; the others are variables to be assigned on each iteration step.

  * ''ZEND_HASH_FOREACH_VAL(ht, val)''
  * ''ZEND_HASH_FOREACH_KEY(ht, h, key)''
  * ''ZEND_HASH_FOREACH_PTR(ht, ptr)''
  * ''ZEND_HASH_FOREACH_NUM_KEY(ht, h)''
  * ''ZEND_HASH_FOREACH_STR_KEY(ht, key)''
  * ''ZEND_HASH_FOREACH_STR_KEY_VAL(ht, key, val)''
  * ''ZEND_HASH_FOREACH_KEY_VAL(ht, h, key, val)''

The most suitable macro should be used instead of the old reset, current, and move functions.

<code diff>
- HashPosition pos;
  ulong num_key;
- char *key;
- uint key_len;
+ zend_string *key;
- zval **pzv;
+ zval *zv;
-
- zend_hash_internal_pointer_reset_ex(&ht, &pos);
- while (zend_hash_get_current_data_ex(&ht, (void**)&pzv, &pos) == SUCCESS) {
-   if (zend_hash_get_current_key_ex(&ht, &key, &key_len, &num_key, 0, &pos) == HASH_KEY_IS_STRING){
-   }
+ ZEND_HASH_FOREACH_KEY_VAL(ht, num_key, key, zv) {
+   if (key) { //HASH_KEY_IS_STRING
+   }
    ........
-   zend_hash_move_forward_ex(&ht, &pos);
- }
+ } ZEND_HASH_FOREACH_END();
</code>

===== Objects =====

TODO: ...

===== Custom Objects =====

TODO: ...

The ''zend_object'' struct is defined as:

<code c>
struct _zend_object {
    zend_refcounted   gc;
    zend_uint         handle; // TODO: may be removed ???
    zend_class_entry *ce;
    const zend_object_handlers *handlers;
    HashTable        *properties;
    HashTable        *guards; /* protects from __get/__set ... recursion */
    zval              properties_table[1];
};
</code>

We inlined the properties_table for better access performance, but that also brings a problem. We used to define a custom object like this:

<code c>
struct custom_object {
    zend_object std;
    void  *custom_data;
};

zend_object_value custom_object_new(zend_class_entry *ce TSRMLS_DC) {
    zend_object_value retval;
    struct custom_object *intern;

    intern = emalloc(sizeof(struct custom_object));
    zend_object_std_init(&intern->std, ce TSRMLS_CC);
    object_properties_init(&intern->std, ce);
    retval.handle = zend_objects_store_put(intern,
        (zend_objects_store_dtor_t)zend_objects_destroy_object,
        (zend_objects_free_object_storage_t) custom_free_storage,
        NULL TSRMLS_CC);
    intern->handle = retval.handle;
    retval.handlers = &custom_object_handlers;
    return retval;
}

struct custom_object* obj = (struct custom_object *)zend_objects_get_address(getThis());
</code>

But zend_object is now variable length (because of the inlined properties_table), so the code above should be changed to:

<code c>
struct custom_object {
    void  *custom_data;
    zend_object std;
};

zend_object * custom_object_new(zend_class_entry *ce TSRMLS_DC) {
    /* Allocate sizeof(custom) + sizeof(properties table requirements) */
    struct custom_object *intern = ecalloc(1,
        sizeof(struct custom_object) +
        zend_object_properties_size(ce));
    /* Allocating:
     *   struct custom_object {
     *       void *custom_data;
     *       zend_object std;
     *   }
     *   zval[ce->default_properties_count-1]
     */
    zend_object_std_init(&intern->std, ce TSRMLS_CC);
    ...
    custom_object_handlers.offset = XtOffsetOf(struct custom_object, std);
    custom_object_handlers.free_obj = custom_free_storage;

    intern->std.handlers = &custom_object_handlers;

    return &intern->std;
}

/* Fetching the custom object: */

static inline struct custom_object * php_custom_object_fetch_object(zend_object *obj) {
    return (struct custom_object *)((char *)obj - XtOffsetOf(struct custom_object, std));
}

#define Z_CUSTOM_OBJ_P(zv) php_custom_object_fetch_object(Z_OBJ_P(zv))

struct custom_object* obj = Z_CUSTOM_OBJ_P(getThis());
</code>

===== zend_object_handlers =====

A new item, ''offset'', is added to zend_object_handlers. You should always define it as the offset of the zend_object in your custom object struct. It is used by zend_objects_store_* to find the right start address of the allocated memory.

<code c>
// An example in spl_array
memcpy(&spl_handler_ArrayObject, zend_get_std_object_handlers(), sizeof(zend_object_handlers));
spl_handler_ArrayObject.offset = XtOffsetOf(spl_array_object, std);
</code>

The memory of the object will now be released by zend_objects_store_*, so you should not free that memory in your custom object's free_obj handler.

===== Resources =====

  * **zval**s of type **IS_RESOURCE** don't keep the resource handle anymore. The resource handle can't be retrieved using **Z_LVAL*()**. Instead you should use the **Z_RES*()** macro to retrieve the resource record directly. The resource record is represented by the **zend_resource** structure. It contains ''type'' (the resource type), ''ptr'' (a pointer to the actual data), ''handle'' (the numeric resource index, for compatibility) and service fields for the reference counter. Actually this **zend_resource** structure is a replacement for the indirectly referenced **zend_rsrc_list_entry**. All occurrences of **zend_rsrc_list_entry** should be replaced by **zend_resource**.
  * The **zend_list_find()** function is removed, because resources are accessed directly.
<code diff>
- long handle = Z_LVAL_P(zv);
- int  type;
- void *ptr = zend_list_find(handle, &type);
+ long handle = Z_RES_P(zv)->handle;
+ int  type = Z_RES_P(zv)->type;
+ void *ptr = Z_RES_P(zv)->ptr;
</code>

  * The **Z_RESVAL_*()** macro is removed; **Z_RES*()** may be used instead

<code diff>
- long handle = Z_RESVAL_P(zv);
+ long handle = Z_RES_P(zv)->handle;
</code>

  * **ZEND_REGISTER_RESOURCE/ZEND_FETCH_RESOURCE()** are dropped

<code diff>
- ZEND_FETCH_RESOURCE2(ib_link, ibase_db_link *, &link_arg, link_id, LE_LINK, le_link, le_plink);

//if you are sure that link_arg is a IS_RESOURCE type, then use :
+if ((ib_link = (ibase_db_link *)zend_fetch_resource2(Z_RES_P(link_arg), LE_LINK, le_link, le_plink)) == NULL) {
+    RETURN_FALSE;
+}

//otherwise, if you know nothing about link_arg's type, use
+if ((ib_link = (ibase_db_link *)zend_fetch_resource2_ex(link_arg, LE_LINK, le_link, le_plink)) == NULL) {
+    RETURN_FALSE;
+}

- ZEND_REGISTER_RESOURCE(return_value, result, le_result);
+ RETURN_RES(zend_register_resource(result, le_result));
</code>

  * The **zend_list_addref()** and **zend_list_delref()** functions are removed. Resources use the same mechanism for reference counting as all **zval**s.

<code diff>
- zend_list_addref(Z_LVAL_P(zv));
+ Z_ADDREF_P(zv);
</code>

This is the same as:

<code diff>
- zend_list_addref(Z_LVAL_P(zv));
+ Z_RES_P(zv)->gc.refcount++;
</code>

  * **zend_list_delete()** takes a pointer to the **zend_resource** structure instead of a resource handle

<code diff>
- zend_list_delete(Z_LVAL_P(zv));
+ zend_list_delete(Z_RES_P(zv));
</code>

  * In most user extension functions like mysql_close(), you should use **zend_list_close()** instead of **zend_list_delete()**. This closes the actual connection and frees extension-specific data structures, but doesn't free the **zend_resource** structure, which might still be referenced from zval(s). It also doesn't decrement the resource reference counter.

<code diff>
- zend_list_delete(Z_LVAL_P(zv));
+ zend_list_close(Z_RES_P(zv));
</code>

===== Parameters Parsing API changes =====

  * The **'l'** specifier now expects a ''zend_long'' argument instead of a ''long'' argument.
<code diff>
- long lval;
+ zend_long lval;
  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &lval) == FAILURE) {
</code>

  * The length argument of the **'s'** specifier now expects a ''size_t'' variable instead of an ''int'' variable.

<code diff>
  char *str;
- int len;
+ size_t len;
  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &len) == FAILURE) {
</code>

  * In addition to the **'s'** specifier that expects a string, PHPNG introduced the **'S'** specifier, which also expects a string but places the argument into a **zend_string** variable. In some cases direct usage of **zend_string** is preferred (for example, when the received string is used as a key in the HashTable API).

<code diff>
- char *str;
- int len;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &len) == FAILURE) {
+ zend_string *str;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "S", &str) == FAILURE) {
</code>

  * PHPNG doesn't work with **zval%%**%%** anymore, so it doesn't need the **'Z'** specifier anymore. It must be replaced by **'z'**.

<code diff>
- zval **pzv;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z", &pzv) == FAILURE) {
+ zval *zv;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zv) == FAILURE) {
</code>

  * The **'+'** and **'*'** specifiers now return just an array of **zval**s (instead of an array of **zval%%**%%**s as before)

<code diff>
- zval ***argv = NULL;
+ zval *argv = NULL;
  int argn;
  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "+", &argv, &argn) == FAILURE) {
</code>

  * Arguments passed by reference should be assigned into the referenced value. It's possible to separate such arguments to get the referenced value in the first place.

<code diff>
- zval **ret;
- if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "Z", &ret) == FAILURE) {
+ zval *ret;
+ if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z/", &ret) == FAILURE) {
    return;
  }
- ZVAL_LONG(*ret, 0);
+ ZVAL_LONG(ret, 0);
</code>

===== Call Frame Changes (zend_execute_data) =====

Information about each function call is recorded in a chain of zend_execute_data structures.
EG(current_execute_data) points to the ''call frame'' of the currently executed function (previously zend_execute_data structures were created only for user-level PHP functions). The differences between the old and new call frame structures are explained below, field by field.

  * **zend_execute_data.opline** - instruction pointer of the currently executed user function. For internal functions its value is undefined (previously its value was NULL for internal functions).
  * **zend_execute_data.function_state** - this field was removed. **zend_execute_data.call** should be used instead.
  * **zend_execute_data.call** - previously it was a pointer to the current **call_slot**. Now it's a pointer to the **zend_execute_data** of the function currently being called. This field is initially NULL, then it's changed by ZEND_INIT_FCALL (or similar) opcodes and then restored by ZEND_DO_FCALL. Syntactically nested function calls, like foo($a, bar($c)), construct a chain of such structures linked through **zend_execute_data.prev_nested_call**
  * **zend_execute_data.op_array** - this field was replaced by **zend_execute_data.func**, because now it may represent not only user functions but also internal ones.
  * **zend_execute_data.func** - currently executed function
  * **zend_execute_data.object** - $this of the currently executed function (previously it was a **zval%%*%%**, now it's a **zend_object%%*%%**)
  * **zend_execute_data.symbol_table** - current symbol table or NULL
  * **zend_execute_data.prev_execute_data** - link of the backtrace call chain
  * **original_return_value, current_scope, current_called_scope, current_this** - these fields kept old values in order to restore them after the call. Now they are removed.
  * **zend_execute_data.scope** - scope of the currently executed function (this is a new field).
  * **zend_execute_data.called_scope** - called_scope of the currently executed function (this is a new field).
  * **zend_execute_data.run_time_cache** - run-time cache of the currently executed function. This is a new field, and it's actually a copy of op_array.run_time_cache.
  * **zend_execute_data.num_args** - number of arguments passed to the function (this is a new field)
  * **zend_execute_data.return_value** - pointer to the **zval%%*%%** where the currently executed op_array should store the result. It may be NULL if the caller doesn't care about the return value (this is a new field).

Arguments to functions are stored in **zval** slots directly after the **zend_execute_data** structure. They may be accessed using the **ZEND_CALL_ARG(execute_data, arg_num)** macro. For user PHP functions the first argument overlaps with the first ''compiled variable'' (CV0), and so on. If the caller passes more arguments than the callee receives, all extra arguments are copied to be after all CVs and TMP variables used by the callee.

===== Executor Globals - EG() Changes =====

  * **EG(symbol_table)** - was turned into a **zend_array** (previously it was a **HashTable**). It's not a big problem to reach the underlying HashTable

<code diff>
- symbols = zend_hash_num_elements(&EG(symbol_table));
+ symbols = zend_hash_num_elements(&EG(symbol_table).ht);
</code>

  * **EG(uninitialized_zval_ptr)** and **EG(error_zval_ptr)** were removed. Use **&EG(uninitialized_zval)** and **&EG(error_zval)** instead.
  * **EG(current_execute_data)** - the meaning of this field was changed a bit. Previously it was a pointer to the call frame of the last executed PHP function. Now it's a pointer to the last executed call frame (no matter whether it's a user or an internal function). It's possible to get the **zend_execute_data** structure for the last op_array by traversing the call chain list.

<code diff>
  zend_execute_data *ex = EG(current_execute_data);
+ while (ex && (!ex->func || !ZEND_USER_CODE(ex->func->type))) {
+    ex = ex->prev_execute_data;
+ }
  if (ex) {
</code>

  * **EG(opline_ptr)** - was removed. Use **execute_data->opline** instead.
  * **EG(return_value_ptr_ptr)** - was removed. Use **execute_data->return_value** instead.
  * **EG(active_symbol_table)** - was removed. Use **execute_data->symbol_table** instead.
  * **EG(active_op_array)** - was removed. Use **execute_data->func** instead.
  * **EG(called_scope)** - was removed. Use **execute_data->called_scope** instead.
  * **EG(This)** - was turned into a **zval**; previously it was a pointer to a **zval**. User code shouldn't modify it.
  * **EG(in_execution)** - was removed. If EG(current_execute_data) is not NULL, we are executing something.
  * **EG(exception)** and **EG(prev_exception)** - were turned into pointers to **zend_object**; previously they were pointers to **zval**.

===== Opcodes changes =====

  * ZEND_DO_FCALL_BY_NAME - was removed, ZEND_INIT_FCALL_BY_NAME was added.
  * ZEND_BIND_GLOBAL - was added to handle "global $var"
  * ZEND_STRLEN - was added to replace the strlen function
  * ZEND_TYPE_CHECK - was added to replace is_array/is_int/is_* where possible
  * ZEND_DEFINED - was added to replace zif_defined where possible (only if there is a single parameter, it is a constant string, and it is not in namespace style)
  * ZEND_SEND_VAR_EX - was added to do more checks than ZEND_SEND_VAR when the condition cannot be settled at compile time
  * ZEND_SEND_VAL_EX - was added to do more checks than ZEND_SEND_VAL when the condition cannot be settled at compile time
  * ZEND_INIT_USER_CALL - was added to replace call_user_func(_array) where possible, when the function cannot be found at compile time; otherwise the call can be converted to ZEND_INIT_FCALL
  * ZEND_SEND_ARRAY - was added to send the second parameter (the array) of call_user_func_array after it is converted to an opcode
  * ZEND_SEND_USER - was added to send the parameters of call_user_func after it is converted to an opcode

===== temp_variable =====

===== PCRE =====

Some PCRE APIs use or return zend_string now. For example, php_pcre_replace returns a zend_string and takes a zend_string as its first argument. Double-check their declarations as well as compiler warnings, which are very likely to be about wrong argument types.